Reclassification by column name in pandas

Question

I would like to apply a test to a pandas dataframe, and create flags in a corresponding dataframe based on the test results. I've gotten this far:

import numpy as np
import pandas as pd


matrix = pd.DataFrame({'a': [1, 11, 2, 3, 4], 'b': [5, 6, 22, 8, 9]})
flags = pd.DataFrame(np.zeros(matrix.shape), columns=matrix.columns)
flag_values = pd.Series({"a": 100, "b": 200})

flags[matrix > 10] = flag_values

but this raises the error

ValueError: Must specify axis=0 or 1

Where can I specify the axis in this situation? Is there a better way to accomplish this?

Edit:

The result I'm looking for in this example for "flags" is

unutbu · Accepted Answer · 2018-02-02T17:52:38.797

You could define flags = (matrix > 10) * flag_values:

In [35]: (matrix > 10) * flag_values
Out[35]: 
     a    b
0    0    0
1  100    0
2    0  200
3    0    0
4    0    0

This relies on True having numeric value 1 and False having numeric value 0. It also relies on Pandas' nifty automatic alignment of DataFrames (and Series) based on labels before performing arithmetic operations.
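To make those two behaviours visible, here is a minimal sketch using the data from the question. The names mask, reordered and same_flags are mine, added only for illustration. It builds the boolean mask explicitly, multiplies it by the Series, and then repeats the multiplication with the Series in a different order to show that matching happens by label rather than by position.

```python
import pandas as pd

matrix = pd.DataFrame({'a': [1, 11, 2, 3, 4], 'b': [5, 6, 22, 8, 9]})
flag_values = pd.Series({"a": 100, "b": 200})

# Boolean DataFrame: True where the test passes, False elsewhere.
mask = matrix > 10

# True acts as 1 and False as 0. The Series index ("a", "b") is matched
# against the DataFrame's column labels before multiplying.
flags = mask * flag_values

# Same labels in a different order (illustrative): alignment is by label,
# so column 'a' is still multiplied by 100 and column 'b' by 200.
reordered = pd.Series({"b": 200, "a": 100})
same_flags = mask * reordered

print(flags)
```

One consequence of label alignment worth keeping in mind: a column in matrix with no matching label in flag_values would come out as NaN rather than 0.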

edited Feb 02 '18 at 17:52

answered Feb 02 '18 at 16:39

unutbu

score 3 · Answer 2 · answered Feb 02 '18 at 16:35

Use mask with mul:

flags.mask(matrix > 10, 1).mul(flag_values, axis=1)

Out[566]: 
       a      b
0    0.0    0.0
1  100.0    0.0
2    0.0  200.0
3    0.0    0.0
4    0.0    0.0
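The one-liner can be read as two steps. The sketch below splits it up using the question's data; the intermediate names ones and result are mine, not part of the original answer. The values stay floats because flags was created with np.zeros.

```python
import numpy as np
import pandas as pd

matrix = pd.DataFrame({'a': [1, 11, 2, 3, 4], 'b': [5, 6, 22, 8, 9]})
flags = pd.DataFrame(np.zeros(matrix.shape), columns=matrix.columns)
flag_values = pd.Series({"a": 100, "b": 200})

# Step 1: mask replaces cells where the condition is True with 1;
# all other cells keep their original 0.0.
ones = flags.mask(matrix > 10, 1)

# Step 2: mul with axis=1 matches the Series index against the columns,
# so column 'a' is multiplied by 100 and column 'b' by 200.
result = ones.mul(flag_values, axis=1)

print(result)
```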

BENY
