2

I have 3 lists, say A, B, & C that look like this:

A = ["Jan", "Feb", "Mar"]
B = ["location1", "location2", "location3"]
C = [pd.DataFrame(np.random.randn(2, 3), columns=list('ABC'))]*3

How do I create a multiindex dataframe with 3 levels of indexing, i.e. of A, B and the two indices (0 to 1) of C?

What I've tried so far is creating a dictionary with all the three indices by looping through all the 3 lists and the dataframe columns, and then, when I have to slice the data, again looping over the keys with a condition. This works but seems pretty inefficient as I have 10 million items in my dictionary. I was wondering if there was a one liner Pandas syntax for this using multiindex and stack. I saw the documentation for these, but the syntax is not clear to me yet. Am I missing any other Pandas function that can help with this? And should I use a database instead?

Solomon Vimal
  • 920
  • 12
  • 27

1 Answers1

2

IIUC:
See my post on pd.concat for more information
Or the documentation

pd.concat(C, keys=list(zip(A, B)), axis=1)

        Jan                           Feb                           Mar                    
  location1                     location2                     location3                    
          A         B         C         A         B         C         A         B         C
0 -0.519821 -1.820451 -0.645732 -0.519821 -1.820451 -0.645732 -0.519821 -1.820451 -0.645732
1  1.043477 -0.666565 -0.298504  1.043477 -0.666565 -0.298504  1.043477 -0.666565 -0.298504

Alternatively

pd.concat(dict(zip(zip(A, B), C)), axis=1)
piRSquared
  • 285,575
  • 57
  • 475
  • 624