I've got a really wide database in Excel that I'm trying to unstack into a DF.
The data looks like this:
Date_1 Date_2 Date_3... ...Date_n
A1 B1 C1 N1
A2 B2 C2 N2
A3 B3 C3 N3
A4 B4 C4 N4
The output I'm shooting for adds some column names and is a dataframe that looks like this:
Dates Data
0 Date_1 A1
1 A2
2 A3
3 A4
4 Date_2 B1
... .... ...
N Date_N N1 etc
My idea here is that I can then concatenate these two columns and use them as unique identifiers to merge this dataframe with others
I've tried two different variants of unstack here but neither have been successful. The first pass looks like this
df = pd.read_Excel('DataFile.xls', sheet_name = 'Data').unstack()
The output here is
Date_1 0 A1
1 A2
... ...
Date_N N N1 etc
So it's close to what I'm looking for except (a) my data doesn't have any columns headers; and (b) I have what looks like a series of (0,1,...N) in the middle of the outputs. I thought this might be the index but I'm not sure why it would be placed here.
More in hope than expectation, I then tried naming the columns via
df = pd.read_Excel('DataFile.xls', sheet_name = 'Data', names = ['Dates', 'Data').unstack()
I wasn't surprised when it didn't work...I got the error message:
ValueError: Number of passed names did not match number of header fields in the file
So now I am stuck and would enormously appreciate some wisdom!
Thank you in advance