Goal
I want to read a CSV file into a Dask DataFrame without getting an "Unnamed: 0" column.
CODE
import dask.dataframe as dd

# Explicit dtypes for the named columns
mydtype = {'col1': 'object',
           'col2': 'object',
           'col3': 'object',
           'col4': 'float32'}

do = dd.read_csv('/folder/somecsvname.csv',
                 dtype=mydtype,
                 low_memory=False,
                 parse_dates=['col3'],
                 )
Result Columns
- Unnamed: 0
- col1
- col2
- col3
- col4
Tried solutions
- 1. Works with pandas, not with Dask: pd.read_csv add column named "Unnamed: 0"
- 2. Works with pandas, not with Dask: How to get rid of "Unnamed: 0" column in a pandas DataFrame?
- CODE added to the read_csv call:
index_col=False
ERROR message: ValueError: Keywords 'index' and 'index_col' not supported. Use dd.read_csv(...).set_index('my-index') instead
- CODE added to the read_csv call:
index_col=0
ERROR message: ValueError: Keywords 'index' and 'index_col' not supported. Use dd.read_csv(...).set_index('my-index') instead
- CODE recommended by the previous two error messages. DOES NOT WORK: this just sets one column as the index but still produces the 'Unnamed: 0' column
do = dd.read_csv('/folder/somecsvname.csv',
                 dtype=mydtype,
                 low_memory=False,
                 parse_dates=['col3'],
                 ).set_index('col3')
- CODE added to the read_csv call:
index_col=None
ERROR message: ValueError: Keywords 'index' and 'index_col' not supported. Use dd.read_csv(...).set_index('my-index') instead
- CODE added to the read_csv call:
index_col=None, header=0
ERROR message: ValueError: Keywords 'index' and 'index_col' not supported. Use dd.read_csv(...).set_index('my-index') instead
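To narrow down where the extra column comes from, here is a minimal sketch that inspects the header row directly. It assumes the CSV was saved with an index, so the header starts with an empty field (pandas-based readers label such a field "Unnamed: 0"); that assumption is not confirmed for my file. The helper name named_columns and the sample file contents are hypothetical, and only the standard library is used.

```python
import csv
import os
import tempfile


def named_columns(path):
    """Return the header fields of a CSV file that are not blank."""
    with open(path, newline="") as f:
        header = next(csv.reader(f))
    # A blank header field is what ends up labelled "Unnamed: N" on read.
    return [name for name in header if name.strip()]


# Hypothetical sample file whose first header field is empty,
# mimicking a CSV that was written together with its index.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write(",col1,col2,col3,col4\n0,a,b,2020-01-01,1.5\n")

cols = named_columns(tmp.name)
print(cols)
os.remove(tmp.name)
```

If the header does contain a blank field, passing the returned list as usecols=cols to dd.read_csv (which, as far as I understand, forwards the keyword to pandas), or calling .drop(columns=['Unnamed: 0']) on the result, might avoid the column. Neither has been verified against the actual file.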