
I have the following pandas dataframe:

+---+-------------+-------------+
|   | Col1        |             |
+   +-------------+-------------+
|   | Sub1 | Sub2 | SubX | SubY |
+---+------+------+------+------+
| 0 | N    | A    | 1    | Z    |
| 1 | N    | B    | 1    | Z    |
| 2 | N    | C    | 2    | Z    |
| 3 | N    | D    | 2    | Z    |
| 4 | N    | E    | 3    | Z    |
| 5 | N    | F    | 3    | Z    |
| 6 | N    | G    | 4    | Z    |
| 7 | N    | H    | 4    | Z    |
+---+------+------+------+------+

I would like to filter the dataframe by column SubX so that only the rows with the value 3 are selected, like this:

+---+-------------+-------------+
|   | Col1        |             |
+   +-------------+-------------+
|   | Sub1 | Sub2 | SubX | SubY |
+---+------+------+------+------+
| 4 | N    | E    | 3    | Z    |
| 5 | N    | F    | 3    | Z    |
+---+------+------+------+------+

Could you help me find the right pandas query? It's pretty hard for me because of the nested column structure. Thanks a lot!

Jules
    In your example, you are probably missing a `Col2` or some other identifier above `SubX` and `SubY`, aren't you? Otherwise, what is the point of the division between the first two and the last two columns? – HerrIvan Jun 20 '18 at 09:28

1 Answer

I extended the MultiIndex hierarchy with a `Col2` label above SubX and SubY, because it wasn't clear to me what the blank space should be.

df

    Col1            Col2
    Sub1    Sub2    SubX    SubY
0   N       A       1       Z
1   N       B       1       Z
2   N       C       2       Z
3   N       D       2       Z
4   N       E       3       Z
5   N       F       3       Z
6   N       G       4       Z
7   N       H       4       Z
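
For anyone who wants to reproduce this, here is a minimal sketch of how such a frame could be built. The `Col2` label is my own assumption (the question leaves that cell blank), and `pd.MultiIndex.from_tuples` is just one of several ways to create two-level column headers.

```python
import pandas as pd

# Two-level column header; "Col2" is an assumed label for the blank cell
columns = pd.MultiIndex.from_tuples(
    [("Col1", "Sub1"), ("Col1", "Sub2"), ("Col2", "SubX"), ("Col2", "SubY")]
)

# Same eight rows as in the question
rows = [
    ["N", letter, number, "Z"]
    for letter, number in zip("ABCDEFGH", [1, 1, 2, 2, 3, 3, 4, 4])
]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

Each column is then addressed by a tuple of its two labels, for example `df[("Col2", "SubX")]`.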

Now do the following:

df[df['Col2','SubX']==3]

Output

    Col1            Col2
    Sub1    Sub2    SubX    SubY
4   N       E       3       Z
5   N       F       3       Z
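
If the top-level label above SubX is unknown or awkward to type, the filter can probably also be written against the second level only. The following is a sketch under the same assumption as above (a `Col2` label that I added myself); it compares the tuple-based mask with one built via `DataFrame.xs`.

```python
import pandas as pd

# Rebuild the example frame; "Col2" is an assumed top-level label
columns = pd.MultiIndex.from_tuples(
    [("Col1", "Sub1"), ("Col1", "Sub2"), ("Col2", "SubX"), ("Col2", "SubY")]
)
rows = [
    ["N", letter, number, "Z"]
    for letter, number in zip("ABCDEFGH", [1, 1, 2, 2, 3, 3, 4, 4])
]
df = pd.DataFrame(rows, columns=columns)

# Variant 1: address the column by its full (top, sub) tuple
filtered = df.loc[df[("Col2", "SubX")] == 3]

# Variant 2: pick every column whose second-level label is "SubX",
# whatever its top-level label is, and keep rows where any of them equals 3
subx = df.xs("SubX", axis=1, level=1)
filtered_by_level = df.loc[(subx == 3).any(axis=1)]

print(filtered)
```

Both variants keep the two-level header in the result. The second one only matches the first when exactly one column is named SubX on the second level.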
Mohamed Thasin ah
pythonic833