
I have a large table that my code downloads into several CSV files. I then read each file and concatenate them into one DataFrame to perform calculations on the data. I do it like so:

    def merge_multiple_dfs(self):
        path = tmp_path, f"*.csv"
        all_files = glob.glob(os.path.join(tmp_path, f"*.csv"))
        df_list = []
        logger.info(f'reading the files: {all_files} from {path} to DataFrame')

        for f in all_files:
            df_list.append(pd.read_csv(f))

        df = pd.concat((pd.read_csv(f) for f in df_list))
        shutil.rmtree(self.tmp_path)
        df.to_csv('combined.csv')

I get the following error:

Traceback (most recent call last):
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3291, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-12-659a373a29b2>", line 1, in <module>
    pd.concat((pd.read_csv(f) for f in df_list))
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 228, in concat
    copy=copy, sort=sort)
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 259, in __init__
    objs = list(objs)
  File "<ipython-input-12-659a373a29b2>", line 1, in <genexpr>
    pd.concat((pd.read_csv(f) for f in df_list))
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/pandas/io/parsers.py", line 413, in _read
    filepath_or_buffer, encoding, compression)
  File "/Users/u/.local/share/virtualenvs/romee-dkO78zKu/lib/python3.6/site-packages/pandas/io/common.py", line 232, in get_filepath_or_buffer
    raise ValueError(msg.format(_type=type(filepath_or_buffer)))
ValueError: Invalid file path or buffer object type: <class 'pandas.core.frame.DataFrame'>

The csv files are in the same format and have the same columns.

Any ideas?

liamhawkins
NotSoShabby
  • Just do pd.concat(df_list), you have already stored DataFrame objects in there. – Ohad Chaet Feb 28 '19 at 16:09
  • Check the syntax of pd.concat and specify an axis. If you're doing row-wise concatenation, consider append() instead. – Matthew Arthur Feb 28 '19 at 16:10
  • @OhadChaet I feel stupid. Thanks! – NotSoShabby Feb 28 '19 at 16:11
  • Possible duplicate of [Import multiple csv files into pandas and concatenate into one DataFrame](https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe) – Erfan Feb 28 '19 at 16:16
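
To spell out the first comment: `df_list` already holds DataFrames, so passing its items to `pd.read_csv` again is what raises the `ValueError`. Below is a minimal sketch of the corrected flow, assuming all files share the same columns as stated in the question. The function name `merge_csv_files`, its `directory` parameter, and the two sample files are hypothetical and only there to make the example runnable.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csv_files(directory):
    """Read every *.csv in `directory` once and stack the rows into one DataFrame."""
    # sorted() gives a deterministic file order; glob alone makes no ordering promise
    all_files = sorted(glob.glob(os.path.join(directory, "*.csv")))
    # Each file is parsed exactly once; the list holds DataFrames, not paths
    df_list = [pd.read_csv(f) for f in all_files]
    # pd.concat accepts the DataFrames directly; ignore_index renumbers the rows
    return pd.concat(df_list, ignore_index=True)


# Demonstration with two throwaway files (made-up data)
with tempfile.TemporaryDirectory() as tmp_dir:
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(
        os.path.join(tmp_dir, "part1.csv"), index=False
    )
    pd.DataFrame({"a": [5], "b": [6]}).to_csv(
        os.path.join(tmp_dir, "part2.csv"), index=False
    )
    combined = merge_csv_files(tmp_dir)

print(combined.shape)
```

Note that `pd.concat` raises a `ValueError` when given an empty list, so a directory without any CSV files would need its own check before this call.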
