I am trying to combine 3 lists, each of size 10 rows x 25 columns, into one list of size 10 x 75, and then write the result to a csv file. But all my attempts at using straightforward approaches fail, and I cannot figure out why - and how to fix it.
The lists are produced by one and the same function. The column heads of the lists are identical, except for a number that has been added to the column name to distinguish them, as follows.
def do_stuff(x):
producing 10 x 25 list
# Producing list 1 from document 1
df = pd.read_csv("file1.csv", encoding = "utf-8") # open file 1
df1 = do_stuff(df)
# Producing list 2 from document 2
df = pd.read_csv("file2.csv", encoding = "utf-8") # open file 2
df2 = do_stuff(df)
# Producing list 3 from document 3
df = pd.read_csv("file3.csv", encoding = "utf-8") # open file 3
df3 = do_stuff(df)
The data in each of the lists df1, df2, df3 looks something like this:
# List df1 # List df2
[col_A1 Col_B1 Col_C1 etc [col_A2 Col_B2 Col_C2 ...
0 0.96 0.13 0.63. 0 0.56 0.22 0.31
1 0.21 0.33 0.44. 1 0.37 0.92 0.77
.. ..
9 0.11 0.12 0.99] 9 0.34 0.54 0.42]
Attempt 1 To combine these lists into one DataFrame, I have tried:
df1 + df2 + df3
However, this leads to NaN's across the dataset:
# List df1 # List df2
[col_A1 Col_B1 Col_C1 etc [col_A2 Col_B2 Col_C2 ...
0 NaN NaN NaN 0 NaN NaN NaN
1 NaN NaN NaN 1 NaN NaN NaN
.. ..
9 NaN NaN NaN] 9 NaN NaN NaN]
Attempt 2 I have also tried the following, but get an error:
df_total.append(df1) # add result to list
df_total.append(df2) # add result to list
df_total.append(df3) # add result to list
df_total.to_csv("result.csv", index=False)
AttributeError: 'list' object has no attribute 'to_csv'
Attempt 3 I have tried to make sure to get a DataFrame which can be written to a csv:
df_total.append(df1) # add result to list
df_total.append(df2) # add result to list
df_total.append(df3) # add result to list
df = pd.DataFrame([df_total])
df = pd.concat(df)
But this gives NaN values across the dataset:
col_A1 col_B1 col_C1 ... col_A2 col_B2 col_C2 ...
0 0.96 NaN 0.63 ... 0.56 NaN NaN
1 NaN 0.33 NaN ... NaN 0.92 NaN
.. ...
9 0.11 NaN NaN ... 0.34 0.54 0.42]
Question: Can anyone explain what is going on? And how to get the three lists df1, df2, df3 into one DataFrame that looks like below and can be written to a csv?
col_A1 col_B1 col_C1 ... col_A2 col_B2 col_C2 ...
0 0.96 0.13 0.63 ... 0.56 0.22 0.31
1 0.21 0.33 0.44 ... 0.37 0.92 0.77
.. ...
9 0.11 0.12 0.99 ... 0.34 0.54 0.42]