0

How do I return a dataframe that has values in a particular range for all the columns. My dataframe is currently structured like this:

     California    Texas    New York ...
     200000        151000   900000
     50000         160000   250000
     75000         120000   171000
       .              .        .
       .              .        .
       .              .        .

How do I return a new dataframe containing values in the range 150000-200000 for all the columns

Prog101
  • 109
  • 2
  • 10
  • 2
    Possible duplicate of [How to select a range of values in a pandas dataframe column?](https://stackoverflow.com/questions/38884466/how-to-select-a-range-of-values-in-a-pandas-dataframe-column) – wwii Nov 17 '19 at 18:45
  • You only want rows where all column values meet the criteria? Maybe you should include example input and desired result: [mcve] – wwii Nov 17 '19 at 20:30

2 Answers2

0

I hope this might be your solution:

df['California'].between(150000, 200000, inclusive=False)

Here, inclusive determines that whether you want to include the edges or not. True is equal to <= and >= while, False means < or >

theashwanisingla
  • 412
  • 4
  • 13
0
>>> df
     0    1    2    3    4
0   33  131   52  122   40
1  235  146   36    4   97
2   90  227   49   49   58
3  192   61  127  220  254
4  124  234  238  215   34
5   86    3  220  105  129
6   59  234  189  193  190
7  116  131   95   89  102
8   72   90  253  167  203
9   21  111  203   55  118

Define the condition/criteria

>>> gt = df > 100
>>> lt = df < 150
>>> mask = np.logical_and(lt,gt)

Use .any(axis=1) if any column can meet the criteria

>>> mask.any(1)
0     True
1     True
2    False
3     True
4     True
5     True
6    False
7     True
8    False
9     True
dtype: bool
>>> df[mask.any(1)]
     0    1    2    3    4
0   33  131   52  122   40
1  235  146   36    4   97
3  192   61  127  220  254
4  124  234  238  215   34
5   86    3  220  105  129
7  116  131   95   89  102
9   21  111  203   55  118

If all the columns have to meet the criteria use .all(axis=1)

>>> lt = df < 180
>>> gt = df > 30
>>> mask = np.logical_and(lt,gt)
>>> mask.all(1)
0     True
1    False
2    False
3    False
4    False
5    False
6    False
7     True
8    False
9    False
dtype: bool
>>> df[mask.all(1)]
     0    1   2    3    4
0   33  131  52  122   40
7  116  131  95   89  102
>>> 
wwii
  • 23,232
  • 7
  • 37
  • 77