
Suppose we have a PyTorch distributed group object that was initialized by torch.distributed.new_group([a, b, c, d]). Is there any way to get the global ranks a, b, c, d back from this group?
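For concreteness, the group would be created along these lines (a minimal sketch, not part of the original question: the gloo backend, env:// initialization, and the specific ranks are assumptions):

import torch.distributed as dist

# Assumed setup sketch: each process reads RANK, WORLD_SIZE,
# MASTER_ADDR, and MASTER_PORT from the environment and joins
# the default world group first.
dist.init_process_group(backend="gloo", init_method="env://")

# The subgroup whose global ranks we want to recover later.
group = dist.new_group([0, 1, 2, 3])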


1 Answer


PyTorch offers a torch.distributed.distributed_c10d._get_global_rank function that can be used in this case:

import torch.distributed as dist

def get_all_ranks_from_parallel_group(group):
    """Return the global ranks of all processes in `group`."""
    results = []
    rank = 0
    try:
        # Probe group-local ranks 0, 1, 2, ... and record the global
        # rank each one maps to.
        while True:
            results.append(dist.distributed_c10d._get_global_rank(group, rank))
            rank += 1
    except RuntimeError:
        # _get_global_rank raises RuntimeError once `rank` runs past
        # the end of the group, which terminates the probe loop.
        pass
    return results
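Note that _get_global_rank is a private helper and may change between releases. On recent PyTorch versions the same mapping is exposed through public APIs, so the probe loop is unnecessary (a sketch, assuming PyTorch 1.13+ for get_global_rank and 2.0+ for get_process_group_ranks, with `group` created as above):

import torch.distributed as dist

# PyTorch >= 2.0: returns all global ranks of the group at once,
# e.g. [0, 1, 2, 3] for the group created earlier.
ranks = dist.get_process_group_ranks(group)

# PyTorch >= 1.13: public successor of the private _get_global_rank,
# mapping a single group-local rank to its global rank.
first_global_rank = dist.get_global_rank(group, 0)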
— Qin Heyang