I have written a function that reads data from a database and collects the results in a list, built as follows:

    df_master = []
    # x = arbitrary data from DB
    for i in db_list:
        df_tmp = df_tmp.append(ReadDBValues(i, interval, start_date, end_date))
        df_master.append(df_tmp)

However, this makes flattening the data somewhat troublesome. I have used the following approach: `flat = [item for sublist in df_master for item in sublist]`

This yields `[1, 0, 0, 1]`; that is, it returns the four column labels but not the values associated with each column.
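
A minimal sketch of why this happens, assuming each element of the list is a pandas DataFrame: iterating over a DataFrame yields its column labels, so the nested comprehension collects labels rather than values. The `frames` list and its contents are hypothetical stand-ins for whatever `ReadDBValues` returns.

```python
import pandas as pd

# Hypothetical stand-ins for the DataFrames returned by ReadDBValues
frames = [
    pd.DataFrame({1: [10, 20], 0: [30, 40]}),
    pd.DataFrame({0: [50, 60], 1: [70, 80]}),
]

# Iterating over a DataFrame yields its column labels, not its rows or values,
# so this collects the labels of every frame in order.
flat = [item for sublist in frames for item in sublist]
print(flat)  # [1, 0, 0, 1]
```

So the list-of-lists flattening idiom probably does not apply directly when the inner items are DataFrames.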

I was hoping to be able to convert this into a dataframe as such:

W | X | Y | Z ....
1 | 2 | 3 | 4 ...
  |   |   |   ....

I have been using this as my reference: Making a flat list out of list of lists in Python

However, I can't seem to flatten more than the first two columns. Could I please get some further guidance?

Thank you very much.

EDIT: I have now managed to create a 'unique' index for the data, so I retain the column names. However, there is a new problem. Say there are two columns, each with 1400 rows.

The code will do the following:

Date | Val X | Val Y
....   1398     NaN
....   1399     NaN
       1400     NaN
       NaN       1
       NaN       2

When instead it should be:

Date | Val X | Val Y
....   1398     523
....   1399     242
       1400     112

Any ideas?

EDIT: Grouping by the index has not worked either; it just produces NaN values:

(df_master.groupby(df_master.index).sum())
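
For comparison, here is a sketch of a group-by that does collapse the staggered rows, under the assumption that the data is already a single DataFrame (a plain Python list has no `groupby` method) and that both blocks share identical index values. The data is hypothetical; `first()` takes the first non-null value per column within each group.

```python
import pandas as pd

# Hypothetical staggered frame: same dates repeated, one column filled per block
dates = pd.date_range("2018-01-01", periods=3, freq="D")
stacked = pd.concat([
    pd.DataFrame({"Val X": [1398, 1399, 1400]}, index=dates),
    pd.DataFrame({"Val Y": [523, 242, 112]}, index=dates),
])

# Group rows that share an index value and keep the first non-null entry
# in each column, which merges the two blocks into one row per date.
collapsed = stacked.groupby(level=0).first()
print(collapsed)
```

If the index values differ between blocks (for example, different timestamps), each group would contain only one row and the NaN gaps would remain.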

Can anyone please point me in the right direction?

IronKirby
  • I think you are looking for `pd.concat(df)` – BENY Jul 10 '18 at 02:04
  • Hi @Wen, I'm not quite looking for concat as I'm not quite stringing a number of dataframes together. This is because df_master is a list and not a dataframe. When I try to use a dataframe, the result unfortunately stores only the last entry while the list seems to capture everything. – IronKirby Jul 10 '18 at 02:10
  • @Wen - thanks for the tip. I have tried using concat, but it didn't quite work out how I had hoped: `pd.concat([df_tmp, df_master], axis=1, ignore_index=True)`. Because the dates are the same, it doesn't see these as unique indexes. – IronKirby Jul 10 '18 at 04:16

0 Answers