
I have a DataFrame column with alphanumeric IDs - some numbers, some letters, some both. I am using read_csv to read the data and want to read all the values of this column as strings. I can't change the values in the underlying data.

I have tried setting the dtype for the column to object:

df = pd.read_csv(filename, dtype = {col: object})

I have also tried using a converter to change all the values in the column to strings:

df = pd.read_csv(filename, converters = {i: str for i in col})

However, I still end up with some non-string numbers (12345) and some string numbers ('12345') which mess up my aggregations.

Any suggestions? Thanks!

  • You may find responses to [this question](https://stackoverflow.com/questions/40095712/when-to-applypd-to-numeric-and-when-to-astypenp-float64-in-python) helpful. – brentertainer Jul 22 '19 at 04:31
  • df = pd.read_csv(filename, dtype = {'col': object}). I guess you missed the single quotes around col. Please check it in your code. – Madhur Yadav Jul 22 '19 at 04:40
  • col is a variable with a column name, rather than the name of a column. – David Huang Jul 22 '19 at 05:34

2 Answers


You can also try:

df['column'] = df['column'].apply(lambda x: str(x))
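
One caveat worth checking with this approach: converting after the read only turns whatever pandas already parsed into text, so it cannot bring back formatting that was lost during parsing. Here is a minimal sketch, assuming a made-up all-numeric `id` column (the sample data and column names are hypothetical, not from the question), and using `astype(str)` as a shorter equivalent of the `apply` call:

```python
import io
import pandas as pd

# Hypothetical sample data; the real file and column names will differ.
csv_text = "id,amount\n12345,10\n00777,20\n"

# No dtype given, so pandas infers integers for the all-numeric "id" column.
df = pd.read_csv(io.StringIO(csv_text))

# Convert after the fact; equivalent to .apply(lambda x: str(x)).
df["id"] = df["id"].astype(str)

ids = df["id"].tolist()
print(ids)  # the leading zeros of "00777" were dropped while parsing
```

If leading zeros or the exact original text matter, it is probably safer to ask for strings at read time instead.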
snapcrack

Use:

df = pd.read_csv(filename, dtype = {i: str for i in col})

The only difference between this and your second attempt is that it uses dtype instead of converters; it's basically a merge of your two attempts.
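
Note that the comprehension iterates over `col`, so it assumes `col` is a list of column names. If `col` is a single name held in a string, iterating over it yields its individual characters, and `{col: str}` is the form to use. A minimal sketch under that assumption, with made-up sample data standing in for the real file:

```python
import io
import pandas as pd

# Hypothetical sample data; the real file and column names will differ.
csv_text = "id,amount\n12345,10\nabc12,20\n00777,30\n"

col = "id"  # assumption: a single column name stored in a variable

# Ask for strings at read time so no numeric inference happens for this column.
df = pd.read_csv(io.StringIO(csv_text), dtype={col: str})

# Check that every value in the column really is a Python str.
all_strings = df[col].map(type).eq(str).all()
values = df[col].tolist()
print(all_strings, values)
```

Other columns are still inferred as usual, so `amount` stays numeric here.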

U13-Forward