Pandas read_csv dtype=object column contains numbers

Question

I have a DataFrame column with alphanumeric IDs - some numbers, some letters, some both. I am using read_csv to read the data and want to read all the values of this column as strings. I can't change the values in the underlying data.

I have tried setting the dtype for the column to object:

df = pd.read_csv(filename, dtype = {col: object})

I have also tried using a converter to change all the values in the column to strings:

df = pd.read_csv(filename, converters = {i: str for i in col})

However, I still end up with some non-string numbers (12345) and some string numbers ('12345') which mess up my aggregations.

Any suggestions? Thanks!

You may find responses to [this question](https://stackoverflow.com/questions/40095712/when-to-applypd-to-numeric-and-when-to-astypenp-float64-in-python) helpful. — brentertainer, Jul 22 '19 at 04:31
df = pd.read_csv(filename, dtype = {'col': object}). I guess you missed the single quotes around col. Please check this in your code. — Madhur Yadav, Jul 22 '19 at 04:40
col is a variable with a column name, rather than the name of a column. — David Huang, Jul 22 '19 at 05:34

score 0 · Answer 1 · answered Jul 22 '19 at 04:23

You can also try:

df['column'] = df['column'].apply(str)
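
As a minimal sketch of this post-load conversion, the example below builds a small frame with a mixed-type column and normalizes it. The data and the `amount` column are made up for illustration, and `astype(str)` is used here as an alternative that has the same effect on these values as applying `str` element-wise.

```python
import pandas as pd

# Hypothetical data: the same ID appears once as an int and once as a str.
df = pd.DataFrame({"column": [12345, "12345", "A12"], "amount": [1, 2, 3]})

# Inspect the Python types actually stored in the column.
types_before = set(df["column"].map(type))

# Convert every value to a string so that equal IDs compare equal.
df["column"] = df["column"].astype(str)
types_after = set(df["column"].map(type))

# Both 12345 rows now fall into the same group.
totals = df.groupby("column")["amount"].sum()
```

One caveat: a value that was already parsed as a float would become '12345.0' rather than '12345', so converting at read time is likely the safer option when it is available.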

snapcrack


score 0 · Answer 2 · answered Jul 22 '19 at 05:04

Use:

df = pd.read_csv(filename, dtype = {i: str for i in col})

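The only difference between this and your converters attempt is that I pass dtype instead of converters, so it is essentially a merge of your two attempts. This assumes col is a list of column names; if col is a single name, iterating over it would yield its individual characters instead.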
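
A small self-contained sketch of reading the column as strings is shown below. It assumes `col` holds a single column name (as the asker's comment suggests), so the mapping is written as `{col: str}`; the CSV text, the `id` and `amount` names, and the use of `io.StringIO` in place of `filename` are all stand-ins for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file.
csv_text = "id,amount\n12345,10\nA12,20\n00789,30\n12345,5\n"

# Assumption: col is one column name, not a list of names.
col = "id"

# Ask the parser to keep this column as text instead of inferring numbers.
df = pd.read_csv(io.StringIO(csv_text), dtype={col: str})

# Every value in the column should now be a Python str.
types = set(df[col].map(type))

# Aggregations group '12345' consistently, and leading zeros survive.
totals = df.groupby(col)["amount"].sum()
```

If `col` were a list of names instead, the dict comprehension from the answer would build the same kind of mapping for each column.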

U13-Forward

