I have a CSV file. I want to read most of its values as strings, but one column should be read as bool if a column with the given title exists.
Because the CSV file has a lot of columns, I don't want to specify the datatype for each column explicitly, like this:
data = read_csv('sample.csv', dtype={'A': str, 'B': str, ..., 'X': bool})
Is it possible to set the string type for every column except one, and read that one optional column as a bool at the same time?
My current solution is the following (but it's very inefficient and slow):
data = read_csv('sample.csv', dtype=str)  # reads all columns as strings
if 'X' in data.columns:
    # map 'True'/'False' to booleans, anything else to None (row by row)
    l = lambda row: True if row['X'] == 'True' else False if row['X'] == 'False' else None
    data['X'] = data.apply(l, axis=1)
UPDATE: Sample CSV:
A;B;C;X
a1;b1;c1;True
a2;b2;c2;False
a3;b3;c3;True
Or the same file can come without the 'X' column (because the column is optional):
A;B;C
a1;b1;c1
a2;b2;c2
a3;b3;c3
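For illustration, here is a minimal sketch of one way the row-wise `apply` might be avoided: read everything as strings, then convert only the optional column with `Series.map`, which operates on that single column rather than on every row of the frame. The function name `read_with_optional_bool`, the constant `BOOL_MAP`, and the `sep=';'` default (taken from the sample above) are assumptions made for this example, and the in-memory strings merely stand in for `sample.csv`.

```python
import io

import pandas as pd

# Hypothetical lookup table: the only two strings treated as booleans.
BOOL_MAP = {"True": True, "False": False}


def read_with_optional_bool(source, bool_column="X", sep=";"):
    """Read every column as str; convert `bool_column` if it is present."""
    data = pd.read_csv(source, sep=sep, dtype=str)
    if bool_column in data.columns:
        # Values that are neither 'True' nor 'False' become NaN.
        data[bool_column] = data[bool_column].map(BOOL_MAP)
    return data


# The two sample files from above, held in memory instead of on disk.
csv_with_x = "A;B;C;X\na1;b1;c1;True\na2;b2;c2;False\na3;b3;c3;True\n"
csv_without_x = "A;B;C\na1;b1;c1\na2;b2;c2\na3;b3;c3\n"

with_x = read_with_optional_bool(io.StringIO(csv_with_x))
without_x = read_with_optional_bool(io.StringIO(csv_without_x))

print(with_x)
print(without_x)
```

Note that unrecognised values end up as NaN here rather than None, which differs slightly from the lambda in the current solution.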