0

I have a dataframe with a column containing a list of strings 'A:B'. I'd like to modify this so there is a new column which contains a set split by ':' containing the first element.

data = [
    {'Name': 'A', 'Servers':['A:s1', 'B:s2', 'C:s3', 'C:s2']},
    {'Name': 'B', 'Servers':['B:s1', 'C:s2', 'B:s3', 'A:s2']},
    {'Name': 'C', 'Servers':['G:s1', 'X:s2', 'Y:s3']} 
]

df = pd.DataFrame(data)
df

df['Clusters'] = [
    {'A', 'B', 'C'},
    {'B', 'C', 'A'},
    {'G', 'X', 'Y'}
]
Evan Brittain
  • 547
  • 5
  • 15

1 Answers1

1

Learn how to use apply

  In [5]: df['Clusters'] = df['Servers'].apply(lambda x: {p.split(':')[0] for p in x})                                                                                  

  In [6]: df                                                                                                                                                         
  Out[6]: 
    Name                   Servers   Clusters
  0    A  [A:s1, B:s2, C:s3, C:s2]  {A, B, C}
  1    B  [B:s1, C:s2, B:s3, A:s2]  {C, B, A}
  2    C        [G:s1, X:s2, Y:s3]  {X, Y, G}
Karthik V
  • 1,867
  • 1
  • 16
  • 23