4

I have a fairly large DataFrame (85 columns and almost 90,000 rows) and I want to apply str.lower() to all of it. However, several columns contain numerical data. Is there an easy way to do this?

> df

    A   B   C
0   10  John    Dog
1   12  Jack    Cat
2   54  Mary    Monkey
3   23  Bob     Horse

Then, after using something like df.applymap(str.lower), I would get:

> df

    A   B   C
0   10  john    dog
1   12  jack    cat
2   54  mary    monkey
3   23  bob     horse

Currently it's showing this error message:

TypeError: descriptor 'lower' requires a 'str' object but received a 'int'
cs95
Joao Donasolo

3 Answers

7

From pandas 1.0 onward, you can efficiently select string-only columns by calling convert_dtypes() and then select_dtypes("string"):

string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())

df
    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse

df.dtypes

A     int64
B    string
C    string
dtype: object

This avoids operating on non-string data.
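
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name `lowercase_string_columns` is made up for illustration, and it assumes a pandas version that provides `convert_dtypes()` and the "string" dtype. It works on a copy, so the caller's frame is left untouched.

```python
import pandas as pd


def lowercase_string_columns(df):
    """Return a copy of df with its text columns lowercased (hypothetical helper)."""
    out = df.copy()
    # convert_dtypes() returns a new frame in which text columns use the "string" dtype
    string_cols = out.convert_dtypes().select_dtypes("string")
    # Lowercase only those columns and write them back into the copy
    out[string_cols.columns] = string_cols.apply(lambda col: col.str.lower())
    return out


df = pd.DataFrame({
    "A": [10, 12, 54, 23],
    "B": ["John", "Jack", "Mary", "Bob"],
    "C": ["Dog", "Cat", "Monkey", "Horse"],
})

result = lowercase_string_columns(df)
print(result)
```

Columns holding a mix of strings and other objects are not converted to "string" by convert_dtypes(), so this sketch leaves them as they are.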

cs95
  • 1
    To make it a bit more efficient I tend to only use 1 row for select_dtypes like: `df.convert_dtypes().head(1).select_dtypes("string")` – Erfan Oct 15 '20 at 15:20
  • 1
    @Erfan Yes, that'd work if you were only using `select_dtypes` to select the column names and then calling `apply` on `df[selected_types.columns]`. In this case I'm applying str.lower on `selected_dtypes` so I can't do that, but good callout. – cs95 Oct 15 '20 at 15:22
0

Sure, select your str columns first using select_dtypes("object"):

str_cols = df.select_dtypes("object").columns
df[str_cols] = df[str_cols].applymap(str.lower)
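
One caveat with mapping the built-in str.lower over every cell: it raises a TypeError on missing values, because NaN is a float. A minimal sketch of the same column selection using the `.str` accessor, which passes NaN through unchanged, might look like this. The sample frame is invented for illustration, and column B is forced to object dtype so the selection behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [10, 12, 54],
    # Explicit object dtype: a text column with a missing value
    "B": pd.Series(["John", np.nan, "Mary"], dtype=object),
})

# Pick the object-dtype columns, then lowercase them column by column
obj_cols = df.select_dtypes("object").columns
df[obj_cols] = df[obj_cols].apply(lambda col: col.str.lower())

print(df)
```

Note also that applymap was deprecated in pandas 2.1 in favor of DataFrame.map, so the column-wise `.str.lower()` form avoids a deprecation warning on recent versions.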
arnaud
0

df = df.apply(lambda x: x.str.lower() if x.dtype == object else x)
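
If an object column holds a mix of strings and other values, the `.str` accessor turns the non-string entries into NaN. An element-wise check is one way around that. The sketch below assumes such a mixed column; `lower_if_str` is a hypothetical helper, not part of pandas.

```python
import pandas as pd


def lower_if_str(value):
    """Lowercase strings; return anything else unchanged (hypothetical helper)."""
    return value.lower() if isinstance(value, str) else value


df = pd.DataFrame({
    "A": [10, 12, 54],
    # Object column mixing strings with a non-string value
    "C": pd.Series(["Dog", 5, "Monkey"], dtype=object),
})

# Series.map applies the helper to each element of each column
result = df.apply(lambda col: col.map(lower_if_str))

print(result)
```

This visits every cell in Python, so on a frame with 85 columns and about 90,000 rows it will likely be slower than the dtype-based selection; it trades speed for tolerance of mixed columns.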

annie