How can I append a DataFrame to another DataFrame that is already saved in a file, without first loading it from the file? (Python 3.6 & Pandas 1.0.1)
Example:

import pandas as pd

data = [[['A01','A02'],'B0','C0'],[['A11','A12'],'B1','C1'],[['A21','A22'],'B2','C2']]
df = pd.DataFrame(data,columns=['A','B','C'])

data2 = [[['A31','A32'],'B3','C3'],[['A41','A42'],'B4','C4'],[['A51','A52'],'B5','C5']]
df2 = pd.DataFrame(data2,columns=['A','B','C'])

print(df.append(df2,ignore_index=True))

#version 1:
store = pd.HDFStore('test.h5','a')
store.append(key='foo',value=df)#, format='t', data_columns=True)
store.append(key='foo',value=df2)#, format='t', data_columns=True, append=True)

#version 2
df.to_hdf(path_or_buf='test.h5',key='foo',mode='w',format='t')
df2.to_hdf(path_or_buf='test.h5',key='foo',mode='a',append=True,format='t',data_columns=True)

#version 3
df.to_hdf(path_or_buf='test.h5',key='foo',mode='w',format='f')
df2.to_hdf(path_or_buf='test.h5',key='foo',mode='a',append=True,format='f',data_columns=True)

df3 = pd.read_hdf('test.h5',key='foo',mode='r')
print(df3)

version 1: TypeError: object of type 'int' has no len()
version 2: TypeError: object of type 'int' has no len()
version 3: ValueError: Can only append to Tables

This question was asked similarly here, but quite a while ago. I tried it with an older pandas version, but this causes even more problems.

EDIT:
It seems that the issue is the arrays (lists) stored as cell content in column A. If I use only the B and C columns, like so, it works:

data = [['B0','C0'],['B1','C1'],['B2','C2']]
df = pd.DataFrame(data,columns=['B','C'])
data2 = [['B3','C3'],['B4','C4'],['B5','C5']]
df2 = pd.DataFrame(data2,columns=['B','C'])
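
To confirm which columns are affected, a small check like the following can help. This is only a sketch; `list_valued_columns` is a hypothetical helper name, not part of pandas. It flags every column in which at least one cell holds a list or tuple.

```python
import pandas as pd

data = [[['A01', 'A02'], 'B0', 'C0'],
        [['A11', 'A12'], 'B1', 'C1'],
        [['A21', 'A22'], 'B2', 'C2']]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])


def list_valued_columns(frame):
    """Hypothetical helper: names of columns with at least one list/tuple cell."""
    return [
        col for col in frame.columns
        if frame[col].map(lambda v: isinstance(v, (list, tuple))).any()
    ]


# The dtype alone does not reveal nested lists, so inspect the cell values.
print(df.dtypes)
print(list_valued_columns(df))
```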

Does anybody know how I can get this to work while still using arrays as content?
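
One possible workaround, sketched under the assumption that the list cells contain only JSON-serializable values: encode the list column as JSON strings before writing, so the table only ever sees plain strings, and decode after reading. The helpers `encode_lists`, `decode_lists` and `append_encoded` are hypothetical names introduced for illustration, and `append_encoded` needs PyTables installed. I have only reasoned through this, not verified it against pandas 1.0.1.

```python
import json

import pandas as pd


def encode_lists(frame, columns):
    """Return a copy with the given list-valued columns stored as JSON strings."""
    out = frame.copy()
    for col in columns:
        out[col] = out[col].map(json.dumps)
    return out


def decode_lists(frame, columns):
    """Inverse of encode_lists: parse the JSON strings back into Python lists."""
    out = frame.copy()
    for col in columns:
        out[col] = out[col].map(json.loads)
    return out


def append_encoded(path, key, frame, list_columns):
    """Append the encoded frame to a table-format HDF5 store (requires PyTables)."""
    encoded = encode_lists(frame, list_columns)
    with pd.HDFStore(path, mode='a') as store:
        store.append(key, encoded, format='table')


data = [[['A01', 'A02'], 'B0', 'C0'],
        [['A11', 'A12'], 'B1', 'C1'],
        [['A21', 'A22'], 'B2', 'C2']]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Round trip in memory, without touching any file.
encoded = encode_lists(df, ['A'])
restored = decode_lists(encoded, ['A'])
print(encoded)
print(restored)
```

With this in place, the appends would look like `append_encoded('test.h5', 'foo', df, ['A'])` followed by `append_encoded('test.h5', 'foo', df2, ['A'])`, and reading back like `decode_lists(pd.read_hdf('test.h5', key='foo'), ['A'])`. If later rows can contain longer strings than the first batch, passing `min_itemsize` to `store.append` on the first write is probably needed, since the string width of a table column is fixed when the table is created.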

DN98
  • _This question was asked similarly here but quite a while ago. I tried it with an older pandas version but this causes even more problems._ Did you try the solution, or something similar, with a recent Pandas version? – AMC Mar 14 '20 at 00:17
  • @AMC yes I did but there are more issues with incompatible python versions and so on. I use pandas 1.0.1 and in the other question they used pandas 0.11. – DN98 Mar 15 '20 at 20:32
  • What do you mean? I was just asking if you tried the solution with a recent Pandas version, and if it doesn't work straight away, see if the general method can be kept and tweak the code to modernize it. – AMC Mar 15 '20 at 20:33
  • Yes, I did try the solution with a recent pandas version and it does not work. I tried it with the outdated version of pandas, and this causes more problems with the Python version I use and with other libraries. And yes, I tried to tweak the old solution to find a better one, but none of them worked. – DN98 Mar 26 '20 at 11:47

0 Answers