I have several data files (from excel) converted to pandas format and contained in a dictionary. Here is how i obtain the dataframes:
dataList = files
nameList = []
for raster in dataList:
path_list = raster.split(os.sep)
name = path_list[5][:-4]
nameList.append(name)
dataDct = {}
for k, v in zip(nameList,dataList):
dataDct[k] = read_excel(v).rename(columns={'Main Value': 'Main Value '+k, 'Auxiliary Value': 'Auxiliary Value '+k})
I combine them using:
(the dictionary being dataDct
)
concat(dataDct.values(), join='outer', ignore_index=False).to_excel(writer, sheet_name='values').to_excel(writer, sheet_name='values')
writer.save()
This outputs a file wich only puts the files one after the other in a long file, disregarding the common fields they share... So it's difficult to use it for further analysis.
Here is a sample of the dataframes contained in the dataDct
:
IN [2]:
value(HIB)
1 CODE VALUE_HIB AUX_VAL_HIB
2 F.F 00000
3 0.0.1
4 0.0.2 06-02-2016
5 C.6 XYZ-21555FFF
6 3.8.0*1 45000GHZ 01.01.2016
Value (HIC)
1 CODE VALUE_HIC AUX_VAL_HIC
2 F.F 00000 111111
3 0.0.1
4 0.0.3 06-02-2016
5 C.6 XYZ-216666FFF
6 3.9.0*1 65000GHZ 01.02.2016
Value (HID)
1 CODE VALUE_HID AUX_VAL_HID
2 F.F 00000 0101010
3 A.1.1 85 GHZ
4 V.1.1 06-02-2016
5 C.6 XYZ-21776FFF
6 3.9.0*1 3000GHZ 01.02.2016
Expected output would be:
OUT[2]:
1 CODE VALUE_HIB AUX_VAL_HIB VALUE_HIC AUX_VAL_HIC VALUE_HID AUX_VAL_HID
2 F.F 00000 00000 111111 00000 01010101
3 0.0.1
4 0.0.2 06-02-2016
5 0.0.3 06-02-2016
6 A.1.1 85ghz
7 C.6 XYZ-21555FFF XYZ-216666FFF XYZ-21776FFF
8 V.1.1 06-02-2016
9 3.8.0*1 45000GHZ 01.01.2016
10 3.9.0*1 65000GHZ 01.02.2016 3000GHZ 01.02.2016
The idea being that the data is aligned and joined... I have tried joining on the axis:
concat(dataDct.values(), join='outer', ignore_index=False, axis=1).to_excel(writer, sheet_name='values')
The data is better displayed but is not joined on similar values :( Only joined on DAATAFRAMES displaying one after another... here is the example output of the error:
1 CODE VALUE_HIB AUX_VAL_HIB VALUE_HIC AUX_VAL_HIC
2 F.F 00000
3 0.0.1
4 0.0.2 06-02-2016
5 C.6 XYZ-21555FFF
6 3.8.0*1 45000GHZ 01.01.2016
F.F 00000 111111
0.0.1
0.0.3 06-02-2016
C.6 XYZ-216666FFF
3.9.0*1 65000GHZ 01.02.2016
Any ideas on what would be the correct syntax to combine the dictionary and obtain the desired output?