I am going cross eyed on this and hopefully someone can answer the question for me. I have a list of DataFrames with only a timestamp as an index and a column for sales (depending on state). What I am trying to do is combine all the DataFrames in the list and output a new one. Example Dataframes would appear:
| time | USA |
|-----------|-----|
| 12-1-2012 | 1 |
| 12-2-2012 | 1 |
| 12-3-2012 | 1 |
| 12-4-2012 | 3 |
| 12-5-2012 | 5 |
| 12-6-2012 | 6 |
| 12-7-2012 | 4 |
| 12-8-2012 | 2 |
| 12-9-2012 | 5 |
| time | UK |
|-----------|-----|
| 12-1-2012 | 1 |
| 12-2-2012 | 1 |
| 12-5-2012 | 5 |
| 12-6-2012 | 6 |
| 12-7-2012 | 4 |
| 12-8-2012 | 2 |
| 12-9-2012 | 5 |
| time | CAN |
|-----------|-----|
| 12-1-2012 | 1 |
| 12-2-2012 | 1 |
| 12-4-2012 | 3 |
| 12-6-2012 | 4 |
| 12-8-2012 | 2 |
| 12-9-2012 | 5 |
| time | USA | UK | CAN |
|-----------|-----|-----|-----|
| 12-1-2012 | 1 | 1 | 1 |
| 12-2-2012 | 1 | 1 | 1 |
| 12-3-2012 | 1 | NaN | NaN |
| 12-4-2012 | 3 | NaN | NaN |
| 12-5-2012 | 5 | 5 | NaN |
| 12-6-2012 | 6 | 6 | 6 |
| 12-7-2012 | 4 | 4 | 4 |
| 12-8-2012 | 2 | 2 | 2 |
| 12-9-2012 | 5 | 5 | 5 |
The last table is technically what I want to happen. I want all of them to be combined on the time and if it doesn't have that key then just add a NaN
value for now. I am running into some oddities. The frames are stored in a list:
dataframes = [d1,d2,d3]
What I have tried is to use a reduce method while joining and it seems to mangle the data. Here is a full example.
d1 = ts.copy()
d1.columns = ['time']
d1.index = d1['time']
# del d1['time']
d1['USA'] = range(9)
d2 = ts.copy()
d2.columns = ['time']
d2.index = d2['time']
# del d2['time']
d2['UK'] = range(9)
d3 = ts.copy()
d3.columns = ['time']
d3.index = d3['time']
# del d3['time']
d3['UK'] = range(9)
d2.drop(d2.index[[2,3]])
d3.drop(d3.index[[2,4,6]])
dataframes = [d1, d2, d3]
df_final = reduce(lambda left,right: pd.merge(left,right,on='time'), dataframes)
df_final
# Output
time USA UK_x UK_y
0 2012-12-01 0 0 0
1 2012-12-02 1 1 1
2 2012-12-03 2 2 2
3 2012-12-04 3 3 3
4 2012-12-05 4 4 4
5 2012-12-06 5 5 5
6 2012-12-07 6 6 6
7 2012-12-08 7 7 7
8 2012-12-09 8 8 8
Can someone please steer me in the right direction without telling me to read the documentation.
Thanks in advance, Stephen