Apply transformation only on string columns with Pandas, ignoring numeric data

Question

I have a fairly large DataFrame with 85 columns and almost 90,000 rows, and I want to apply str.lower() to every column. However, several columns contain numeric data, which makes the call fail. Is there an easy way to lowercase only the string columns?

> df

    A   B   C
0   10  John    Dog
1   12  Jack    Cat
2   54  Mary    Monkey
3   23  Bob     Horse

Then, after using something like df.applymap(str.lower), I would like to get:

> df

    A   B   C
0   10  john    dog
1   12  jack    cat
2   54  mary    monkey
3   23  bob     horse

Currently it's showing this error message:

TypeError: descriptor 'lower' requires a 'str' object but received a 'int'

score 7 · Accepted Answer · answered Oct 15 '20 at 15:15


From pandas 1.x you can select string-only columns with select_dtypes("string"), once convert_dtypes() has converted the object columns that hold text to the dedicated string dtype:

string_dtypes = df.convert_dtypes().select_dtypes("string")
df[string_dtypes.columns] = string_dtypes.apply(lambda x: x.str.lower())

df
    A     B       C
0  10  john     dog
1  12  jack     cat
2  54  mary  monkey
3  23   bob   horse

df.dtypes

A     int64
B    string
C    string
dtype: object

This avoids operating on non-string data.

answered Oct 15 '20 at 15:15

cs95



To make it a bit more efficient I tend to only use 1 row for select_dtypes like: `df.convert_dtypes().head(1).select_dtypes("string")` – Erfan Oct 15 '20 at 15:20

@Erfan Yes, that'd work if you were only using `select_dtypes` to select the column names and then calling `apply` on `df[string_dtypes.columns]`. In this case I'm applying `str.lower` on `string_dtypes` itself, so I can't do that, but good callout. – cs95 Oct 15 '20 at 15:22
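
The variant discussed in these two comments could look roughly like the sketch below: the one-row slice is used only to read off the string column names, and the lowercasing is then applied to the original frame. The sample data mirrors the question, and the name `string_cols` is made up for illustration.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": [10, 12, 54, 23],
    "B": ["John", "Jack", "Mary", "Bob"],
    "C": ["Dog", "Cat", "Monkey", "Horse"],
})

# Use a single row only to find out which columns are strings.
# `string_cols` is a hypothetical name, not part of the original answer.
string_cols = df.convert_dtypes().head(1).select_dtypes("string").columns

# Lowercase those columns on the original frame and assign the result back.
df[string_cols] = df[string_cols].apply(lambda col: col.str.lower())
```

Unlike the accepted answer, this leaves the lowercased columns with whatever dtype they had in the original frame, because the converted frame is never written back.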

score 0 · Answer 2 · answered Oct 15 '20 at 15:15


Sure, select your str columns first using select_dtypes("object"), then assign the result back:

cols = df.select_dtypes("object").columns
df[cols] = df[cols].applymap(str.lower)
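
One caveat worth sketching: a non-numeric column is not guaranteed to contain only strings, and a stray None or integer would raise the same TypeError as in the question. The example below is a minimal sketch of a per-value guard, not part of the original answer. It swaps in select_dtypes(exclude="number") and Series.map so that it does not depend on how a given pandas version labels text columns. The helper `lower_if_str` and the sample data are made up for illustration.

```python
import pandas as pd

# Hypothetical data: B has a missing value, C mixes strings and an integer.
df = pd.DataFrame({
    "A": [10, 12, 54, 23],
    "B": ["John", "Jack", None, "Bob"],
    "C": ["Dog", 7, "Monkey", "Horse"],
})

def lower_if_str(value):
    """Lowercase real strings and pass every other value through unchanged."""
    return value.lower() if isinstance(value, str) else value

# Skip numeric columns entirely, then guard each remaining value.
for col in df.select_dtypes(exclude="number").columns:
    df[col] = df[col].map(lower_if_str)
```

This is slower than the vectorized .str.lower() because it calls a Python function per value, so it is probably only worth using on columns known to be mixed.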

answered Oct 15 '20 at 15:15

arnaud


score 0 · Answer 3 · answered Oct 15 '20 at 15:42


df = df.apply(lambda x: x.str.lower() if x.dtype == object else x)

answered Oct 15 '20 at 15:42

annie

