utils.third_party package
Submodules
utils.third_party.ddp_functional_utils module
- utils.third_party.ddp_functional_utils.broadcast(tensor, src, group=None)[source]
Broadcasts the tensor to the whole group.
`tensor` must have the same number of elements in all processes participating in the collective.
- Parameters:
tensor (Tensor) – Data to be sent if `src` is the rank of the current process.
src (int) – Source rank.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Received tensor from the broadcast op.
- Return type:
Tensor
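A minimal usage sketch (hypothetical, not part of the module's docs): it assumes every rank runs this code after `dist.init_process_group` has been called (e.g. when launched with torchrun), and imports this module under the short alias `ddp_f`.
```python
# Hypothetical sketch: broadcast rank 0's data to every rank.
# Assumes dist.init_process_group(...) has already been called on each rank.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

rank = dist.get_rank()
# Same number of elements on every rank; only rank 0's values are sent.
x = torch.arange(4.0) if rank == 0 else torch.empty(4)
x = ddp_f.broadcast(x, src=0)  # every rank now holds rank 0's tensor
```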
- utils.third_party.ddp_functional_utils.gather(tensor, dst=0, group=None)[source]
Gathers a list of tensors in a single process.
- Parameters:
tensor (Tensor) – Input tensor.
dst (int, optional) – Destination rank (default is 0).
group (ProcessGroup, optional) – The process group to work on.
- Returns:
List of appropriately-sized tensors with the gathered data.
- Return type:
tuple[Tensor]
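A hedged sketch under the same assumptions as above (initialized process group, one process per rank, `ddp_f` alias):
```python
# Hypothetical sketch: gather one tensor per rank onto rank 0.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

x = torch.full((2,), float(dist.get_rank()))
gathered = ddp_f.gather(x, dst=0)  # on rank 0: tuple of per-rank tensors
```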
- utils.third_party.ddp_functional_utils.scatter(tensors, src=0, group=None)[source]
Scatters a list of tensors to all processes in a group.
Each process will receive exactly one tensor and store its data in the `tensor` argument.
- Parameters:
tensors (list[Tensor]) – List of tensors to scatter on the source rank. Receivers must pass `None`.
src (int, optional) – Source rank (default is 0).
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output tensor from the scatter operation.
- Return type:
Tensor
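A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: rank 0 scatters one chunk per rank; receivers pass None.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

rank, world = dist.get_rank(), dist.get_world_size()
chunks = [torch.full((2,), float(i)) for i in range(world)] if rank == 0 else None
mine = ddp_f.scatter(chunks, src=0)  # each rank receives exactly one tensor
```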
- utils.third_party.ddp_functional_utils.reduce(tensor, dst, op=<torch.distributed.distributed_c10d.ReduceOp object>, group=None)[source]
Reduces the tensor data across all machines.
Only the process with rank `dst` is going to receive the final result.
- Parameters:
tensor (Tensor) – Input of the collective.
dst (int) – Destination rank.
op (optional) – One of the values from the `torch.distributed.ReduceOp` enum. Specifies an operation used for element-wise reductions.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output of the collective.
- Return type:
Tensor
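A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: element-wise reduce every rank's tensor onto rank 0.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

x = torch.ones(3) * (dist.get_rank() + 1)
out = ddp_f.reduce(x, dst=0)  # only rank 0 receives the final reduced result
```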
- utils.third_party.ddp_functional_utils.reduce_scatter(output, input_list, op=<torch.distributed.distributed_c10d.ReduceOp object>, group=None)[source]
Reduces, then scatters a list of tensors to all processes in a group.
- Parameters:
output (Tensor) – Output tensor.
input_list (list[Tensor]) – List of tensors to reduce and scatter.
op (optional) – One of the values from the `torch.distributed.ReduceOp` enum. Specifies an operation used for element-wise reductions.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output of the collective.
- Return type:
Tensor
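A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: reduce a world_size-long list element-wise across ranks,
# then scatter one reduced chunk back to each rank.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

world = dist.get_world_size()
input_list = [torch.ones(2) * dist.get_rank() for _ in range(world)]
output = torch.empty(2)
output = ddp_f.reduce_scatter(output, input_list)  # this rank's reduced chunk
```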
- utils.third_party.ddp_functional_utils.all_gather(tensor, group=None)[source]
Gathers tensors from the whole group in a list.
- Parameters:
tensor (Tensor) – Tensor to be broadcast from the current process.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output of the collective.
- Return type:
tuple([Tensor])
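A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: every rank contributes one tensor and receives all of them.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

x = torch.tensor([float(dist.get_rank())])
all_xs = ddp_f.all_gather(x)  # tuple of world_size tensors, same on every rank
```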
- utils.third_party.ddp_functional_utils.all_to_all(output_tensor_list, input_tensor_list, group=None)[source]
Each process scatters a list of input tensors to all processes in a group and returns a gathered list of tensors in the output list.
- Parameters:
output_tensor_list (list[Tensor]) – List of tensors to gather, one per rank.
input_tensor_list (list[Tensor]) – List of tensors to scatter, one per rank.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output of the collective.
- Return type:
tuple([Tensor])
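A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: tensor i of each rank's input list is sent to rank i;
# the output list collects one tensor from every rank.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

world = dist.get_world_size()
input_list = [torch.full((2,), float(dist.get_rank())) for _ in range(world)]
output_list = [torch.empty(2) for _ in range(world)]
output_list = ddp_f.all_to_all(output_list, input_list)
```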
- utils.third_party.ddp_functional_utils.all_to_all_single(output, input, output_split_sizes=None, input_split_sizes=None, group=None)[source]
Each process splits its input tensor and scatters the splits to all processes in a group, then concatenates the tensors received from all processes in the group into a single output tensor.
- Parameters:
output (Tensor) – Gathered concatenated output tensor.
input (Tensor) – Input tensor to scatter.
output_split_sizes (list[Int], optional) – Output split sizes for dim 0. If None or empty, dim 0 of the `output` tensor must divide equally by `world_size`.
input_split_sizes (list[Int], optional) – Input split sizes for dim 0. If None or empty, dim 0 of the `input` tensor must divide equally by `world_size`.
- Returns:
Output of the collective.
- Return type:
Tensor
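A hedged sketch under the same assumptions (initialized process group, one process per rank), using equal splits so no split sizes are passed:
```python
# Hypothetical sketch: dim 0 of both tensors divides evenly by world_size.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

world = dist.get_world_size()
inp = torch.arange(2 * world, dtype=torch.float32)  # 2 elements destined per rank
out = torch.empty(2 * world)
out = ddp_f.all_to_all_single(out, inp)  # concatenation of chunks received from all ranks
```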
- utils.third_party.ddp_functional_utils.all_reduce(tensor, op=<torch.distributed.distributed_c10d.ReduceOp object>, group=None)[source]
Reduces the tensor data across all machines in such a way that all get the final result.
After the call the returned tensor is going to be bitwise identical in all processes.
- Parameters:
tensor (Tensor) – Input of the collective.
op (optional) – One of the values from the `torch.distributed.ReduceOp` enum. Specifies an operation used for element-wise reductions.
group (ProcessGroup, optional) – The process group to work on.
- Returns:
Output of the collective.
- Return type:
Tensor
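A hedged sketch under the same assumptions (initialized process group, one process per rank):
```python
# Hypothetical sketch: every rank ends up with the element-wise reduction
# of all ranks' tensors.
import torch
import torch.distributed as dist
from utils.third_party import ddp_functional_utils as ddp_f

x = torch.ones(4) * dist.get_rank()
x = ddp_f.all_reduce(x)  # bitwise-identical result on every rank
```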