Suppose we have a PyTorch distributed group object that was initialized by torch.distributed.new_group([a, b, c, d]). Is there any way to get the global ranks a, b, c, d back from this group?


1 Answer
PyTorch offers a torch.distributed.distributed_c10d._get_global_rank function (note the leading underscore: it is an internal helper) that can be used in this case:
import torch.distributed as dist

def get_all_ranks_from_parallel_group(group):
    # Probe group-local ranks 0, 1, 2, ... and translate each one to its
    # global rank. _get_global_rank raises RuntimeError once the group-local
    # rank goes past the end of the group, which terminates the loop.
    rank = 0
    results = []
    try:
        while True:
            results.append(dist.distributed_c10d._get_global_rank(group, rank))
            rank += 1
    except RuntimeError:
        pass
    return results
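
As a minimal usage sketch, assuming a 4-process job launched with torchrun (the subgroup ranks [1, 3] and the gloo backend below are illustrative choices, not part of the original question):

import torch.distributed as dist

# Assumes 4 processes, e.g. launched via:
#   torchrun --nproc_per_node=4 example.py
dist.init_process_group(backend="gloo")

# new_group must be called by every process, even those not in the subgroup.
group = dist.new_group([1, 3])

print(get_all_ranks_from_parallel_group(group))  # prints [1, 3] on every process

dist.destroy_process_group()

Note that recent PyTorch releases also expose a public torch.distributed.get_process_group_ranks(group) helper that returns the same list, so if your version provides it you can avoid the private function entirely.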
