I have a DataFrame with three columns. In two of these columns, some rows contain a comma, like here:

In [40]: df_given=pd.DataFrame([['bla', 'A,B', '1,2'],['bla','C,D','45,34'],['bla','A','3']])

In [41]: df_given
Out[41]:
     0    1      2
0  bla  A,B    1,2
1  bla  C,D  45,34
2  bla    A      3

For each row with commas, I want two rows instead: one with the values before the comma and one with the values after it:

In [42]: df_wanted=pd.DataFrame([['bla', 'A', '1'],['bla', 'B', '2'],['bla','C','45'],['bla','D','34'],['bla','A','3']])

In [43]: df_wanted
Out[43]:
     0  1   2
0  bla  A   1
1  bla  B   2
2  bla  C  45
3  bla  D  34
4  bla  A   3

I thought about copying the comma rows and applying either lstrip or rstrip to each copy, but I have no idea how to differentiate between the two copies. Does anyone have an idea?

I'm not sure my approach is the best one. Since my files are big, I would appreciate a solution that uses less memory.
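One possible direction that avoids copying rows and stripping them is to split both comma columns into lists and expand the lists into rows. This is only a minimal sketch: it assumes pandas 1.3 or newer (where `DataFrame.explode` accepts a list of columns) and that columns 1 and 2 always hold the same number of comma-separated items in a given row. The name `df_result` is just an illustrative choice.

```python
import pandas as pd

df_given = pd.DataFrame([['bla', 'A,B', '1,2'],
                         ['bla', 'C,D', '45,34'],
                         ['bla', 'A', '3']])

# Turn the comma-separated strings in columns 1 and 2 into lists,
# e.g. 'A,B' -> ['A', 'B'] and 'A' -> ['A'].
parts = df_given.copy()
for col in (1, 2):
    parts[col] = parts[col].str.split(',')

# Expand both list columns in lockstep: one output row per list element.
# Assumption: both lists in a row have the same length, otherwise explode raises.
df_result = parts.explode([1, 2], ignore_index=True)

print(df_result)
```

This still builds a second DataFrame, so it is not free in terms of memory. For very large files, it might help to read the input in pieces (for example with the `chunksize` parameter of `pd.read_csv`) and apply the same steps to each piece.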

SGeuer