How to split every string in a list in a dataframe column

Question

I have a dataframe with a column in which each cell is a list of strings of the form 'A:B'. I'd like to add a new column that holds, for each row, the set of first elements obtained by splitting each string on ':'.

import pandas as pd

data = [
    {'Name': 'A', 'Servers': ['A:s1', 'B:s2', 'C:s3', 'C:s2']},
    {'Name': 'B', 'Servers': ['B:s1', 'C:s2', 'B:s3', 'A:s2']},
    {'Name': 'C', 'Servers': ['G:s1', 'X:s2', 'Y:s3']}
]

df = pd.DataFrame(data)
df

# Desired result:
df['Clusters'] = [
    {'A', 'B', 'C'},
    {'B', 'C', 'A'},
    {'G', 'X', 'Y'}
]

What do you want the results to look like? What have you tried? — piRSquared, Jul 04 '19 at 20:56
[this](https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe) is a good start. — Quang Hoang, Jul 04 '19 at 20:58
it should be the same dataframe with the column 'Clusters' added. 'Clusters' contains a set of the first element from 'Servers' split at ':'. — Evan Brittain, Jul 04 '19 at 21:02

score 1 · Accepted Answer · answered Jul 04 '19 at 21:02

Use apply with a set comprehension that keeps the part before ':' of each string:

  In [5]: df['Clusters'] = df['Servers'].apply(lambda x: {p.split(':')[0] for p in x})

  In [6]: df
  Out[6]: 
    Name                   Servers   Clusters
  0    A  [A:s1, B:s2, C:s3, C:s2]  {A, B, C}
  1    B  [B:s1, C:s2, B:s3, A:s2]  {C, B, A}
  2    C        [G:s1, X:s2, Y:s3]  {X, Y, G}
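The unnest/explode route mentioned in the comments can give the same column without a Python-level lambda. This is only a sketch: it assumes pandas 0.25 or later (where `Series.explode` is available) and that every list is non-empty, since `explode` turns an empty list into NaN and such rows would need separate handling. The variable name `exploded` is mine.

```python
import pandas as pd

data = [
    {'Name': 'A', 'Servers': ['A:s1', 'B:s2', 'C:s3', 'C:s2']},
    {'Name': 'B', 'Servers': ['B:s1', 'C:s2', 'B:s3', 'A:s2']},
    {'Name': 'C', 'Servers': ['G:s1', 'X:s2', 'Y:s3']},
]
df = pd.DataFrame(data)

# One row per server string; the original row label is repeated in the index.
exploded = df['Servers'].explode()

# Keep the text before ':' and collect it back into one set per original row.
# Assignment aligns on the index, so each set lands on the row it came from.
df['Clusters'] = exploded.str.split(':').str[0].groupby(level=0).agg(set)

print(df)
```

For small frames the apply version above is simpler and likely just as fast; the explode version mainly helps if you want the intermediate long-form Series for other work.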
