Based on this SO question, I would like to split my dataframe based on column var1. However, I have no delimiter between the letters.
import pandas as pd
a = pd.DataFrame([{'var1': 'abc', 'var2': 1},
                  {'var1': 'def', 'var2': 2}])
b = pd.DataFrame([{'var1': 'a', 'var2': 1},
                  {'var1': 'b', 'var2': 1},
                  {'var1': 'c', 'var2': 1},
                  {'var1': 'd', 'var2': 2},
                  {'var1': 'e', 'var2': 2},
                  {'var1': 'f', 'var2': 2}])
This is what I want to achieve.
>>> a
  var1  var2
0  abc     1
1  def     2
>>> b
  var1  var2
0    a     1
1    b     1
2    c     1
3    d     2
4    e     2
5    f     2
.split() does not work with an empty separator ("").
pd.concat([pd.Series(row['var2'], row['var1'].split(','))
           for _, row in a.iterrows()]).reset_index()
Therefore, the code above does not work. Any idea how I can achieve this?
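For reference, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes pandas 0.25 or newer (for DataFrame.explode) and relies on the fact that list('abc') yields the individual characters, so no delimiter is needed.

```python
import pandas as pd

a = pd.DataFrame([{'var1': 'abc', 'var2': 1},
                  {'var1': 'def', 'var2': 2}])

# list('abc') gives ['a', 'b', 'c'], so each cell of var1 becomes a list
# of single characters; explode() then emits one row per list element
# and repeats the matching var2 value.
b = (a.assign(var1=a['var1'].apply(list))
       .explode('var1')              # assumption: pandas >= 0.25
       .reset_index(drop=True))      # replace the repeated index 0,0,0,1,1,1

print(b)
```

The intermediate list column is only used to feed explode(); the input frame a is left unchanged because assign() returns a copy.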