Is there a way, without reading the file twice, to check whether a file has a header row and, if it doesn't, use the column names I pass in? I have files that all share the same structure, but some of them are missing the header row for some reason.
Example with header:
Field1 Field2 Field3
data1 data2 data3
Example without header:
data1 data2 data3
When I use the call below on a file that has a header, the header line becomes the first row of data instead of being replaced by the names I passed:
pd.read_csv('filename.csv', names=col_names)
When I use the call below instead, the first row of data is dropped if there is no header in the file:
pd.read_csv('filename.csv', header=0, names=col_names)
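The two behaviours described above can be reproduced with in-memory data. This is only an illustrative sketch: it assumes comma-separated content, and the sample strings, the second data row, and the variable names are made up for the demonstration.

```python
import io

import pandas as pd

col_names = ['Field1', 'Field2', 'Field3']

# Assumed sample contents (comma-separated); a second data row is added
# so the effect of dropping a row is visible.
with_header = "Field1,Field2,Field3\ndata1,data2,data3\n"
without_header = "data1,data2,data3\ndata4,data5,data6\n"

# names= alone: the header line in the file is kept as an ordinary data row.
df_a = pd.read_csv(io.StringIO(with_header), names=col_names)
print(df_a)

# header=0 plus names=: the first line is always treated as a header and
# replaced, so a headerless file loses its first row of data.
df_b = pd.read_csv(io.StringIO(without_header), header=0, names=col_names)
print(df_b)
```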
My current workaround is to load the file, check whether the expected column exists, and if it doesn't, read the file again:
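import pandas as pd

df = pd.read_csv('filename.csv')
# If the expected header is missing, the first data row was used as the header,
# so re-read the file with explicit column names.
if 'Field1' not in df.columns:
    del df
    df = pd.read_csv('filename.csv', names=col_names)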
Is there a better way to handle this data set that doesn't involve potentially reading the file twice?
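One possible direction, sketched here under assumptions rather than as a definitive answer, is to peek at only the first line, decide whether it matches the expected names, and then parse the file once with the appropriate `header` argument. The helper name `read_optional_header` is hypothetical, and the sketch assumes a seekable text handle, a comma separator by default, and that a header line (when present) matches `col_names` exactly.

```python
import csv
import io

import pandas as pd


def read_optional_header(handle, col_names, sep=","):
    """Hypothetical helper: parse `handle` once, detecting an optional header.

    Assumes `handle` is a seekable text file object and that a header line,
    if present, matches `col_names` exactly (ignoring surrounding spaces).
    """
    # Peek at the first line only, then rewind so pandas sees the whole file.
    first_line = handle.readline()
    handle.seek(0)

    # Split the first line with the csv module so quoted fields are handled.
    first_fields = next(csv.reader([first_line], delimiter=sep), [])
    has_header = [field.strip() for field in first_fields] == list(col_names)

    # header=0 replaces the existing header; header=None keeps every line as data.
    return pd.read_csv(
        handle,
        sep=sep,
        header=0 if has_header else None,
        names=col_names,
    )


col_names = ['Field1', 'Field2', 'Field3']
df_with = read_optional_header(
    io.StringIO("Field1,Field2,Field3\ndata1,data2,data3\n"), col_names
)
df_without = read_optional_header(io.StringIO("data1,data2,data3\n"), col_names)
```

With a file on disk, the same helper could be called as `with open('filename.csv', newline='') as f: df = read_optional_header(f, col_names)`. Strictly speaking, the first line is still read twice, but the rest of the file is parsed only once and pandas' normal dtype inference is left untouched.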