I have a pandas DataFrame full of tuples (the same would apply to arrays), and I would like to split every column into several columns, one per element (each array or tuple has the same length). Let's take this as an example:
import pandas as pd
df = pd.DataFrame([[(1, 2), (3, 4)], [(5, 6), (7, 8)]], columns=['column0', 'column1'])
which outputs:
column0 column1
0 (1, 2) (3, 4)
1 (5, 6) (7, 8)
I tried to build on the solution here (https://stackoverflow.com/a/16245109/4218755), using variations of the expression:
df.textcol.apply(lambda s: pd.Series({'feature1': s+1, 'feature2': s-1}))
like
df.column0.apply(lambda s: pd.Series({'feature1': s[0], 'feature2': s[1]}))
which outputs:
feature1 feature2
0 1 2
1 5 6
This is the desired behavior, so it works well for a single column. But if I try to use
df2=df[df.columns].apply(lambda s: pd.Series({'feature1':s[0], 'feature2':s[1]}))
then df2 is:
column0 column1
feature1 (1, 2) (3, 4)
feature2 (5, 6) (7, 8)
which is obviously wrong. Applying the function directly to df does not help either: it outputs the same result as df2.
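My understanding of the difference, as a small sketch: Series.apply hands the function one cell at a time (here a tuple), whereas DataFrame.apply with the default axis=0 hands it a whole column, so s[0] and s[1] pick rows rather than tuple elements. The names col and elem below are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[(1, 2), (3, 4)], [(5, 6), (7, 8)]],
                  columns=['column0', 'column1'])

# What DataFrame.apply passes to the function when axis=0: a whole column.
col = df['column0']
print(col[0], col[1])    # rows of the column: (1, 2) (5, 6)

# What Series.apply passes to the function: a single cell (one tuple).
elem = col[0]
print(elem[0], elem[1])  # elements of the tuple: 1 2
```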
How can I apply this splitting technique to a whole DataFrame, and are there alternatives? Thanks
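For reference, one alternative that seems to give the desired result on the example above is to expand each column separately and concatenate the pieces. This is only a sketch: the names parts and df2 and the '_0'/'_1' suffixes are hypothetical choices, and it assumes every cell holds a tuple of the same length.

```python
import pandas as pd

df = pd.DataFrame([[(1, 2), (3, 4)], [(5, 6), (7, 8)]],
                  columns=['column0', 'column1'])

# Expand each tuple column into its own small DataFrame (one column per element),
# keeping the original row index so the pieces line up.
parts = {
    col: pd.DataFrame(df[col].tolist(), index=df.index)
    for col in df.columns
}

# Join the pieces side by side; the dict keys become the outer column level.
df2 = pd.concat(parts, axis=1)

# Optional: flatten the two-level columns into names like 'column0_0'.
df2.columns = [f'{col}_{i}' for col, i in df2.columns]
print(df2)
```

Skipping the last assignment keeps the two-level columns, which may be more convenient if the split parts need to be selected per original column.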