Python using Pandas - Retrieving the name of all columns that contain numbers

Question

I searched for a solution on the site, but I couldn't find anything relevant, only outdated code. I am new to the Pandas library and I have the following dataframe as an example:

A	B	C	D	E
142	0.4	red	108	front
164	1.3	green	98	rear
71	-1.0	blue	234	front
109	0.2	black	120	front

I would like to extract the name of the columns that contain numbers (integers and floats). It is completely fine to use the first row to achieve this. So the result should look like this: ['A', 'B', 'D']

I tried the following command to get some of the columns that contained numbers:

dataframe.loc[0, dataframe.dtypes == 'int64']

Out:
A 142
D 108

There are two problems with this. First of all, I just need the name of the columns, but not the values. Second, this captures only the integer columns. My next attempt just gave an error:

dataframe.loc[0, dataframe.dtypes == 'int64' or dataframe.dtypes == 'float64']

Does this answer your question? [How to determine whether a column/variable is numeric or not in Pandas/NumPy?](https://stackoverflow.com/questions/19900202/how-to-determine-whether-a-column-variable-is-numeric-or-not-in-pandas-numpy) — Marcelo Paco, Apr 02 '23 at 03:25
It should be `dataframe.loc[0, (dataframe.dtypes == 'int64') | (dataframe.dtypes == 'float64')]`. I don't know why [pandas uses these characters though](https://stackoverflow.com/a/54358361/11235205) — Minh-Long Luu, Apr 02 '23 at 03:29
@MarceloPaco It is one step closer to the solution, but I still don't get the name of the columns that contain numeric values. — Adrian, Apr 02 '23 at 03:30
@Minh-LongLuu Your code did work! However, I still need to retrieve just the column names without any data. — Adrian, Apr 02 '23 at 03:38
@Adrian you should move your comment to here. Btw, you can read only the first row via the parameter nrows: pd.read_csv('your_file.csv', nrows=1) — Minh-Long Luu, Apr 02 '23 at 03:59

score 3 · Accepted Answer · answered Apr 02 '23 at 05:33

Using select_dtypes:

dataframe.select_dtypes('number').columns.tolist()

Output:

['A', 'B', 'D']
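
For anyone who wants to try this without the original file, here is a minimal sketch that rebuilds the example table from the question by hand (the literal values are copied from the question; the variable name `numeric_cols` is just for illustration) and applies the same `select_dtypes` call:

```python
import pandas as pd

# Rebuild the example frame from the question.
dataframe = pd.DataFrame({
    "A": [142, 164, 71, 109],
    "B": [0.4, 1.3, -1.0, 0.2],
    "C": ["red", "green", "blue", "black"],
    "D": [108, 98, 234, 120],
    "E": ["front", "rear", "front", "front"],
})

# Keep only the numeric columns, then take their labels as a plain list.
numeric_cols = dataframe.select_dtypes("number").columns.tolist()
print(numeric_cols)  # ['A', 'B', 'D']
```

Because the selection is based on each column's dtype rather than on the values in one row, a column that was read in as text will not be picked up even if its first row happens to look like a number.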

mozway

This is a very compact solution, thank you. – Adrian Apr 03 '23 at 09:47

Driftr95 · Answer 2 · 2023-04-02T04:00:44.693

2

You can check each column's .dtype.kind while filtering the column names with a list comprehension.

# import pandas as pd
# df = pd.read_html('https://stackoverflow.com/questions/75909965')[0] # scraped your q

[c for c in df.columns if df[c].dtype.kind in 'iufc']

This should return ['A', 'B', 'D']. Note that 'iufc' covers signed integers, unsigned integers, real floating-point numbers, and complex floating-point numbers. Add 'b' if you want to cover Booleans as well, since bool is a subclass of int in Python.
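
To illustrate the effect of adding 'b', here is a small sketch. The `flag` column is a hypothetical addition for demonstration and is not part of the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [142, 164],
    "B": [0.4, 1.3],
    "C": ["red", "green"],
    "flag": [True, False],  # hypothetical Boolean column, not in the question
})

# Kind codes: i = signed int, u = unsigned int, f = float, c = complex, b = bool.
without_bool = [c for c in df.columns if df[c].dtype.kind in "iufc"]
with_bool = [c for c in df.columns if df[c].dtype.kind in "iufcb"]

print(without_bool)  # ['A', 'B']
print(with_bool)     # ['A', 'B', 'flag']
```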

edited Apr 02 '23 at 04:00

answered Apr 02 '23 at 03:53

This is also a very interesting solution, I appreciate it. – Adrian Apr 03 '23 at 09:53

score 1 · Answer 3 · answered Apr 02 '23 at 03:31

Based on Marcelo's comment, you can use:

from pandas.api.types import is_numeric_dtype

numeric_columns = []
for column in df.columns:
    if is_numeric_dtype(df[column]):
        numeric_columns.append(column)
print(numeric_columns)
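
The same check can be written as a list comprehension. One thing worth knowing, shown in the sketch below, is that `is_numeric_dtype` treats Boolean columns as numeric, whereas `select_dtypes("number")` leaves them out. The `flag` column is a hypothetical addition used only to show the difference:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    "A": [142, 164],
    "C": ["red", "green"],
    "flag": [True, False],  # hypothetical Boolean column, not in the question
})

# is_numeric_dtype counts bool as numeric...
via_is_numeric = [c for c in df.columns if is_numeric_dtype(df[c])]
# ...while select_dtypes("number") does not.
via_select = df.select_dtypes("number").columns.tolist()

print(via_is_numeric)  # ['A', 'flag']
print(via_select)      # ['A']
```

If the frame has no Boolean columns, as in the question, both approaches give the same list.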

Minh-Long Luu

@Adrian you should move your comment to here. Btw, you can read only the first row via the parameter `nrows`: `pd.read_csv('your_file.csv', nrows=1)` – Minh-Long Luu Apr 02 '23 at 03:46
Thank you, this code did the job too. – Adrian Apr 03 '23 at 09:50

PaulS · Answer 4 · 2023-04-02T12:32:11.913

1

Another possible solution:

import re

# Keep the columns whose dtype name starts with "int" or "float".
df.columns[
    [re.match(r'^(int|float)', x.name) is not None for x in df.dtypes]].to_list()

Output:

['A', 'B', 'D']
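
A caveat with matching on dtype names: the pattern only catches names that start with "int" or "float", so an unsigned column such as `uint8` is skipped. The sketch below uses a hypothetical `U` column (not in the question) to show this next to `select_dtypes`:

```python
import re
import pandas as pd

df = pd.DataFrame({
    "A": pd.Series([142, 164], dtype="int64"),
    "U": pd.Series([1, 2], dtype="uint8"),  # hypothetical unsigned column
    "C": ["red", "green"],
})

# "uint8" does not start with "int" or "float", so the regex misses column U.
mask = [re.match(r"^(int|float)", t.name) is not None for t in df.dtypes]
via_regex = df.columns[mask].to_list()
via_select = df.select_dtypes("number").columns.to_list()

print(via_regex)   # ['A']
print(via_select)  # ['A', 'U']
```

For the question's data, which only has int64 and float64 columns, the regex version returns the expected result.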

edited Apr 02 '23 at 12:32

answered Apr 02 '23 at 12:20

Sajil Alakkalakath · Answer 5 · 2023-04-05T19:25:59.970

0

Use the expression below. It first selects all the numeric columns, then takes their column labels, and finally converts them into a list.

df.select_dtypes(include="number").columns.to_list()

edited Apr 05 '23 at 19:25

answered Apr 02 '23 at 08:38
