
My code is organised similarly to the following. It creates columns filled with tuples:

import pandas as pd

d = []
d.append({'wilderness':('bear','salmon'), 'domestic':('cat','mouse'), 'farm':('wolf','sheep')})
d.append({'wilderness':('polar bear','seal'), 'domestic':('spider','fly'), 'farm':('cow','grass')})

df = pd.DataFrame(d)
df


As in this example, the elements of each tuple are related, here predator and prey. I really do not want to split these tuples into unrelated separate columns; I want the close relationship between the pairs to stay within the structure somehow.

The problem is, each string in my example is a fair bit longer than the animal names here, and when I view the dataframe in Jupyter notebook, I cannot see the second element of the tuple at all, and I need to be able to see it, even select it etc.

I initially thought there might be some setting in Jupyter that would make each tuple element go onto a second line. I now think the best solution is probably pd.MultiIndex.from_tuples(), but I am having a lot of trouble working out how to use it, even after looking at a few examples.

Does anyone know how to do this? There should be two levels of column heading, e.g. domestic-predator/prey, and the tuple elements should go into each new sub-column.

I try not to use for loops in Pandas and NumPy, but this was an occasion where it was hard not to, and performance wasn't an issue, so I would prefer a solution that keeps this for-loop-friendly method of creating the dataframe.

edit - here is the desired output

       domestic              farm                  wilderness
       predator  prey        predator  prey        predator    prey

0      cat       mouse       wolf      sheep       bear        salmon
1      spider    fly         cow       grass       polar bear  seal
cardamom

1 Answer


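You can use concat with a list comprehension. Each column's list of tuples becomes a small two-column DataFrame, and keys adds the original column names as the top header level: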

df = pd.concat([pd.DataFrame(x, columns=['predator','prey']) for x in df.values.T.tolist()], 
                axis=1, 
                keys=df.columns)
print(df)

  domestic            farm         wilderness        
  predator   prey predator   prey    predator    prey
0      cat  mouse     wolf  sheep        bear  salmon
1   spider    fly      cow  grass  polar bear    seal
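
To see why this works, the one-liner above can be unpacked into separate steps. This is only a sketch built on the question's sample data; the intermediate names (columns_as_lists, parts, wide) are made up here for illustration. The last lines show one way to select a single sub-column, which the question also asks about.

```python
import pandas as pd

d = []
d.append({'wilderness': ('bear', 'salmon'), 'domestic': ('cat', 'mouse'), 'farm': ('wolf', 'sheep')})
d.append({'wilderness': ('polar bear', 'seal'), 'domestic': ('spider', 'fly'), 'farm': ('cow', 'grass')})
df = pd.DataFrame(d)

# One list per original column; each list holds that column's tuples, row by row.
columns_as_lists = df.values.T.tolist()

# Each list of 2-tuples becomes a small frame with 'predator' and 'prey' columns.
parts = [pd.DataFrame(col, columns=['predator', 'prey']) for col in columns_as_lists]

# keys= labels each part with its original column name, giving two header levels.
wide = pd.concat(parts, axis=1, keys=df.columns)

# Select a single sub-column with a (top level, sub level) tuple.
domestic_prey = wide[('domestic', 'prey')]
print(domestic_prey.tolist())
```

Selecting only the top level, for example wide['domestic'], returns both sub-columns for that group.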
jezrael
  • Thanks, it works! I am studying exactly how. I had a look at `df.values.T.tolist()`; that's the first time I've ever seen a list comprehension used with the `pd.DataFrame` command. I think the `keys` argument you used is what did it; it doesn't look like it needed the MultiIndex thing after all. – cardamom Jun 21 '17 at 14:32
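
On the MultiIndex point: concat with keys already produces MultiIndex columns, so the two approaches end up in the same place. For comparison, here is a rough sketch of building the header explicitly with pd.MultiIndex.from_product instead. It assumes every cell holds exactly two elements, and the names flat_rows, header and explicit are made up for illustration.

```python
import pandas as pd

d = []
d.append({'wilderness': ('bear', 'salmon'), 'domestic': ('cat', 'mouse'), 'farm': ('wolf', 'sheep')})
d.append({'wilderness': ('polar bear', 'seal'), 'domestic': ('spider', 'fly'), 'farm': ('cow', 'grass')})
df = pd.DataFrame(d)

# Flatten each row's tuples into one flat list of values, keeping column order.
flat_rows = [[item for pair in row for item in pair] for row in df.values.tolist()]

# Cross every original column name with the two roles to get the two header levels.
header = pd.MultiIndex.from_product([df.columns, ['predator', 'prey']])

explicit = pd.DataFrame(flat_rows, columns=header)
print(explicit)
```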