I have some code that reads multiple CSV files into a single pandas DataFrame. The problem is that the first two lines of every file need to be ignored, and I cannot figure out how to do this.
import pandas as pd
import glob
import os
path = r'D:\E\Traficc\migration\Zambia-Mining\DATA\24monthimport' # use your path
all_files = glob.glob(os.path.join(path, "*.csv")) # advisable to use os.path.join as this makes concatenation OS independent
# generator expression: doesn't create a list, nor does it append to one
df_from_each_file = (pd.read_csv(f) for f in all_files)
data = pd.concat(df_from_each_file, ignore_index=True)
print(data.tail())
I have tried to use next(df), but I get an error saying that df is not iterable.
Can I do all this within the existing one-line generator expression, or do I need to break it up? What can I use to accomplish this?
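For reference, one option that might fit inside the existing one-liner is the `skiprows` parameter of `pd.read_csv`. The sketch below assumes that the two lines to ignore are the first two physical lines of each file and that the column header sits on the third line. The helper name `read_csvs_skipping` and its `n_skip` argument are hypothetical, introduced only for illustration.

```python
import glob
import os

import pandas as pd


def read_csvs_skipping(folder, n_skip=2):
    """Hypothetical helper: concatenate every CSV in `folder`,
    ignoring the first `n_skip` lines of each file."""
    all_files = sorted(glob.glob(os.path.join(folder, "*.csv")))
    # skiprows=n_skip discards the first n_skip physical lines of a file;
    # the next remaining line is then read as the header row (assumption
    # about the file layout, see the note above).
    frames = (pd.read_csv(f, skiprows=n_skip) for f in all_files)
    return pd.concat(frames, ignore_index=True)
```

If the header is instead on the first line and the two lines to ignore are the rows right after it, passing a list such as `skiprows=[1, 2]` skips those lines by zero-based index while keeping line 0 as the header. Note that `pd.concat` raises a `ValueError` when the folder contains no matching files.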