I have a pandas dataframe containing strings:

import pandas as pd

df = pd.DataFrame({'column1': ['One_Two_Three', 'First_Second_Third', 'nrOne_nrTwo_nrThree'], 'column2': ['nrOne_nrTwo_nrThree', 'First_Second_Third', 'One_Two_Three'], 'column3': ['First_Second_Third', 'One_Two_Three', 'nrOne_nrTwo_nrThree'],})

df
Out[0]: 
               column1              column2              column3
0        One_Two_Three  nrOne_nrTwo_nrThree   First_Second_Third
1   First_Second_Third   First_Second_Third        One_Two_Three
2  nrOne_nrTwo_nrThree        One_Two_Three  nrOne_nrTwo_nrThree

I would like to end up with three dataframes: the first containing the characters before the first underscore, the second containing the characters between the first and second underscores, and the third containing the last part. For example, the first one should look like:

df_one
Out[1]: 
  column1 column2 column3
0     One   nrOne   First
1   First   First     One
2   nrOne     One   nrOne

I've tried

df_temp = df.apply(lambda x: x.str.split('_'))

df_temp
Out[2]: 
                   column1                  column2                  column3
0        [One, Two, Three]  [nrOne, nrTwo, nrThree]   [First, Second, Third]
1   [First, Second, Third]   [First, Second, Third]        [One, Two, Three]
2  [nrOne, nrTwo, nrThree]        [One, Two, Three]  [nrOne, nrTwo, nrThree]

This splits each string into a list. Then I tried

df_temp.apply(lambda x: x[0])
Out[3]: 
  column1  column2 column3
0     One    nrOne   First
1     Two    nrTwo  Second
2   Three  nrThree   Third

But this only uses the first row: `x[0]` selects row 0 of each column and spreads its list down the rows. Does anyone have a solution?

Michael Szczesny
DHJ
  • This [Apply pandas function to column to create multiple new columns?](https://stackoverflow.com/questions/16236684/apply-pandas-function-to-column-to-create-multiple-new-columns) might help. Moreover, I believe that `df["column1"].apply(lambda s: pd.Series(s.split("_")))` should return many columns. – Felipe Whitaker Dec 21 '21 at 15:00
  • Thanks for the answer. I found that using pandas applymap instead of apply works well in my case – DHJ Dec 21 '21 at 15:08

1 Answer


One solution is to use applymap:

df_temp.applymap(lambda x: x[0])
Out[0]: 
  column1 column2 column3
0     One   nrOne   First
1   First   First     One
2   nrOne     One   nrOne
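
Note that `DataFrame.applymap` was deprecated in pandas 2.1 in favour of `DataFrame.map`. Below is a minimal sketch, assuming the same `df` as in the question, that builds all three dataframes with whichever method the installed version provides. The helper `elementwise` and the names `df_two` and `df_three` are illustrative and not part of the original post:

```python
import pandas as pd

df = pd.DataFrame({
    'column1': ['One_Two_Three', 'First_Second_Third', 'nrOne_nrTwo_nrThree'],
    'column2': ['nrOne_nrTwo_nrThree', 'First_Second_Third', 'One_Two_Three'],
    'column3': ['First_Second_Third', 'One_Two_Three', 'nrOne_nrTwo_nrThree'],
})

# Split every cell into a list of parts, as in the question
df_temp = df.apply(lambda x: x.str.split('_'))


def elementwise(frame, func):
    # Hypothetical helper: use DataFrame.map where it exists (pandas >= 2.1),
    # otherwise fall back to the older DataFrame.applymap
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)
    return frame.applymap(func)


# One dataframe per position in the split lists; i=i binds the index per lambda
df_one, df_two, df_three = (
    elementwise(df_temp, lambda parts, i=i: parts[i]) for i in range(3)
)
```

Here `df_one` matches the output shown above, while `df_two` and `df_three` hold the middle and last parts of each string.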

Another is to use apply on a Series, by stacking and unstacking:

df_temp.stack().apply(lambda x: x[0]).unstack()
Out[0]: 
  column1 column2 column3
0     One   nrOne   First
1   First   First     One
2   nrOne     One   nrOne
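
A third option, not part of the original answer, is to skip the intermediate `df_temp` and index into the split result with the `.str` accessor, column by column. This is a sketch assuming every string contains exactly two underscores. The names `df_two` and `df_three` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'column1': ['One_Two_Three', 'First_Second_Third', 'nrOne_nrTwo_nrThree'],
    'column2': ['nrOne_nrTwo_nrThree', 'First_Second_Third', 'One_Two_Three'],
    'column3': ['First_Second_Third', 'One_Two_Three', 'nrOne_nrTwo_nrThree'],
})

# For each position i, split every column on '_' and pick the i-th part
# with .str[i], which indexes into each list element-wise
df_one, df_two, df_three = [
    df.apply(lambda col, i=i: col.str.split('_').str[i]) for i in range(3)
]
```

Because `.str[i]` works on each element of the Series, it avoids the problem in the question, where `x[0]` selected row 0 of the column instead of the first item of each list.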
DHJ