
An elegant expression like

df[~pandas.isnull(df.loc[:,0])]

checks a pandas DataFrame column and returns the entire DataFrame with the rows removed where the selected column is NaN.
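For context, here is a minimal example of that pattern (the column label 0 and the sample values are just illustrative):

import pandas
import numpy as np

df = pandas.DataFrame({0: [1.0, np.nan, 3.0], 'x': ['a', 'b', 'c']})
# keep only the rows where column 0 is not NaN
print(df[~pandas.isnull(df.loc[:, 0])])
#      0  x
# 0  1.0  a
# 2  3.0  c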

I am wondering if there is a similar expression that can check a DataFrame column and return only the rows whose values are of a given type, without using any loops.

I've looked at

.select_dtypes(include=[np.float])

but this only returns columns whose overall dtype is float64; a column with mixed types is stored as object, so individual float values in such a column are missed.
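A minimal sketch of that behaviour (note that np.float is an alias that newer NumPy versions have removed, so np.float64 is the safer spelling):

import pandas
import numpy as np

df = pandas.DataFrame({'a': [1.5, 2.5], 'b': ['ty', 1.98]})
print(df.dtypes)
# a    float64
# b     object
# select_dtypes returns only column a; the float 1.98 in column b is skipped
print(df.select_dtypes(include=[np.float64]))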

Aleks
  • Please try to include a [mcve] with some small sample data and your desired output. Take a look at [how to create good reproducible pandas dataframe examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – pault Jul 12 '18 at 15:01

1 Answer


First, let's set up a DataFrame with two columns; only one value in column b is a float, and we'll try to find that row:

import pandas

# column b mixes a string and a float, so its dtype is object
df = pandas.DataFrame({
    'a': ['qw', 'er'],
    'b': ['ty', 1.98]
})

When printed, this looks like:

    a     b
0  qw    ty
1  er  1.98

Then build a boolean mask to select the rows, using apply() along axis=1 (mask is a better name than map, which would shadow the Python builtin):

def check_if_float(row):
    # True when this row's value in column b is a float
    return isinstance(row['b'], float)

mask = df.apply(check_if_float, axis=1)

This gives a boolean mask of all the rows that have a float in column b:

0    False
1     True
dtype: bool

You can then use this mask to select the rows you want:

filtered_rows = df[mask]

This leaves only the rows that contain a float in column b:

    a     b
1  er  1.98
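If you prefer it inline, the same mask can be built with a lambda applied to the column itself; Series.apply works elementwise, so no axis argument is needed (equivalent logic, just more compact):

filtered_rows = df[df['b'].apply(lambda x: isinstance(x, float))]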
sophiemachin