How to get pandas.DataFrame columns containing specific dtype

Question

I'm using df.columns.values to make a list of column names which I then iterate over and make charts, etc... but when I set this up I overlooked the non-numeric columns in the df. Now, I'd much rather not simply drop those columns from the df (or a copy of it). Instead, I would like to find a slick way to eliminate them from the list of column names.

Now I have:

names = df.columns.values

what I'd like to get to is something that behaves like:

names = df.columns.values(column_type=float64)

Is there any slick way to do this? I suppose I could make a copy of the df, and drop those non-numeric columns before doing columns.values, but that strikes me as clunky.

Welcome any inputs/suggestions. Thanks.

https://stackoverflow.com/questions/25039626/how-do-i-find-numeric-columns-in-pandas — Gusev Slava, Feb 08 '19 at 10:05

score 25 · Accepted Answer · answered Jul 23 '14 at 05:09

Someone may give you a better answer than this, but if all my numeric data are int64 or float64, one thing I tend to do is create a dict of the column data types and then use its values to build the list of columns.

So for example, in a dataframe where I have columns of type float64, int64 and object, you can first look at the data types like so:

DF.dtypes

and if they conform to the standard whereby the non-numeric columns of data are all object types (as they are in my dataframes), then you can do the following to get a list of the numeric columns:

[key for key in dict(DF.dtypes) if dict(DF.dtypes)[key] in ['float64', 'int64']]

It's just a simple list comprehension. Nothing fancy. Again, though, whether this works for you will depend on how you set up your dataframe...
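A minimal sketch of the same idea that iterates over the dtypes Series directly instead of building the dict twice. The frame `DF` and its column names below are made up for illustration, and it assumes the numeric columns are exactly float64 or int64, as in the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names are made up for illustration.
DF = pd.DataFrame({
    "price": np.array([1.5, 2.5, 3.5], dtype="float64"),
    "count": np.array([1, 2, 3], dtype="int64"),
    "label": ["x", "y", "z"],
})

# DF.dtypes is a Series mapping column name -> dtype, so .items() yields
# (name, dtype) pairs; str(dtype) gives names such as "float64".
numeric_cols = [
    name for name, dtype in DF.dtypes.items()
    if str(dtype) in ("float64", "int64")
]
print(numeric_cols)
```

Columns stored as other numeric dtypes (int32, float32 and so on) would be skipped by this check, so the tuple of accepted names may need extending.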

I ended up using this because it works and because I'm running 0.14.0 and didn't want to upgrade to 0.14.1 in the middle of my project. Thanks. — Charlie_M, Jul 23 '14 at 16:52

score 24 · Answer 2 · answered Sep 24 '15 at 11:40

dtypes is a Pandas Series. That means it contains index & values attributes. If you only need the column names:

headers = df.dtypes.index

This returns an Index (not a plain Python list) holding the column names of the "df" dataframe; wrap it in list(...) if you need a list.
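As a rough sketch of how the dtypes Series can be used to answer the original question, the index can be filtered with a boolean mask on the values. The frame and the names `all_names` and `float_names` are illustrative assumptions, not part of the answer.

```python
import pandas as pd

# Hypothetical example frame with one float, one integer and one text column.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [1, 2], "c": ["x", "y"]})

dtypes = df.dtypes               # Series: index = column names, values = dtypes
all_names = list(dtypes.index)   # same labels as df.columns

# Comparing the Series with a dtype name gives a boolean mask;
# indexing with it keeps only the float64 columns.
float_names = list(dtypes[dtypes == "float64"].index)
print(float_names)
```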

Arthur Zennig


score 20 · Answer 3 · answered Jul 23 '14 at 10:06

There's a new feature in 0.14.1, select_dtypes to select columns by dtype, by providing a list of dtypes to include or exclude.

For example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(1000),
                   'b': range(1000),
                   'c': ['a'] * 1000,
                   'd': pd.date_range('2000-1-1', periods=1000)})

df.select_dtypes(['float64','int64'])

Out[129]: 
            a    b
0    0.153070    0
1    0.887256    1
2   -1.456037    2
3   -1.147014    3
...

chrisb


`select_dtypes` now also allows selecting more general categories (`df.select_dtypes('number')`, `df.select_dtypes('object')` or `df.select_dtypes('datetime')`, for example). – ayhan Jan 02 '19 at 19:32
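Since the question asks for a list of column names rather than a filtered frame, one possible way to combine the pieces is to take `.columns` of the `select_dtypes` result. This is a small sketch on a shortened version of the example frame; the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Shortened version of the example frame from the answer.
df = pd.DataFrame({
    "a": np.random.randn(5),
    "b": range(5),
    "c": ["a"] * 5,
    "d": pd.date_range("2000-01-01", periods=5),
})

# select_dtypes returns a new frame; its .columns are the matching names.
numeric_names = df.select_dtypes(include="number").columns.tolist()
float_names = df.select_dtypes(include=["float64"]).columns.tolist()
non_numeric = df.select_dtypes(exclude="number").columns.tolist()

print(numeric_names, float_names, non_numeric)
```

The original `df` is left untouched, so the resulting list can be iterated over for charting without dropping any columns.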

score 8 · Answer 4 · answered Jul 19 '18 at 17:25

To get the column names from a pandas DataFrame in Python 3 (here the DataFrame is created from a fileName.csv file):

>>> import pandas as pd
>>> df = pd.read_csv('fileName.csv')
>>> columnNames = list(df.head(0)) 
>>> print(columnNames)

J11


score 0 · Answer 5 · edited Sep 27 '18 at 14:43

You can also get the column names from a pandas DataFrame by printing df.columns, which shows the column names together with the dtype of the index. Here I read a CSV file from https://mlearn.ics.uci.edu/databases/autos/imports-85.data; the file has no header row, so you have to define a list of headers containing the column names yourself.

import pandas as pd

url="https://mlearn.ics.uci.edu/databases/autos/imports-85.data"

df=pd.read_csv(url,header = None)

headers=["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style",
         "drive-wheels","engine-location","wheel-base","length","width","height","curb-weight","engine-type",
         "num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm"
         ,"city-mpg","highway-mpg","price"]

df.columns=headers

print(df.columns)
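A self-contained sketch of the same pattern that does not depend on the remote file. The two data rows are made up, only a hypothetical subset of the headers is used, and the names are passed through the `names` parameter of `read_csv` instead of being assigned afterwards.

```python
import io
import pandas as pd

# Made-up, header-less sample rows standing in for the remote file.
raw = io.StringIO("3,alfa-romero,gas,13495\n1,audi,gas,13950\n")

# Hypothetical subset of the column names, supplied while reading.
headers = ["symboling", "make", "fuel-type", "price"]
df = pd.read_csv(raw, header=None, names=headers)

# df.dtypes lists every column name next to its dtype.
print(df.dtypes)

# Names of the numeric columns only.
numeric_names = df.select_dtypes(include="number").columns.tolist()
print(numeric_names)
```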
