Is there a way to get the first value from a filtered dataframe without having to copy and reindex the whole dataframe?

Let's say I have a dataframe df:

index  statement  name
1      True       123
2      True       456
3      True       789
4      False      147
5      False      258
6      True       369

and I want to get the name of the first row where statement is False.

I would do:

filtered_df = df[df.statement == False]
filtered_df = filtered_df.reset_index(drop=True)
name = filtered_df.loc[0, "name"]

but is there an easier/faster solution to this?

Michael Delgado
janbo
  • The first value? Row-wise, or column-wise? If it's about the position (first, second, etc), use `.iloc`. – 9769953 Apr 09 '22 at 11:53
  • I want the name of the first row that has statement = True – janbo Apr 09 '22 at 11:55
  • `df[df.statement].name.iloc[0]`. – 9769953 Apr 09 '22 at 11:57
  • Your question says "statement that is False", your code shows an example for the case of True, and your comment says "that has statement = True". Which one is it? – 9769953 Apr 09 '22 at 11:58
  • One way would be to write a function, which uses a `for` loop to loop over the rows of the df, and returns the name when it finds a row with statement = False. But there is probably a better way – Lecdi Apr 09 '22 at 11:58
  • what does the "~" exactly do here? – janbo Apr 09 '22 at 12:09
  • I mean filtered_df = df[df.statement == False], I edited the question now, sorry. – janbo Apr 09 '22 at 12:11
  • for a boolean (True/False) series, `df[~df.statement]` is equivalent to `df[df.statement == False]`. You never need to write `df[df.statement == True]` because `df.statement == True` would evaluate to `df.statement`. The `~` operator simply flips the `True/False` values. See the [pandas docs on boolean indexing](https://pandas.pydata.org/docs/user_guide/indexing.html#boolean-indexing). – Michael Delgado Apr 09 '22 at 16:56
  • Does this answer your question? [Get first row of dataframe in Python Pandas based on criteria](https://stackoverflow.com/questions/40660088/get-first-row-of-dataframe-in-python-pandas-based-on-criteria) – jjbskir Jul 24 '23 at 19:54

2 Answers

To get the name of the first row where statement is False, use

df[~df.statement].name.iloc[0]

The ~ inverts the selection ("negates"), so only the rows where df.statement equals False are selected.

The name column of that selection is then taken, and .iloc[0] picks its first item by position (not by index label).
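
As a minimal sketch of the steps above, the following rebuilds the question's table (the values are copied from the question; the variable names `mask` and `first_name` are only illustrative) and checks that `~` produces the same mask as comparing against False:

```python
import pandas as pd

# Sample data mirroring the table in the question (index labels 1..6).
df = pd.DataFrame(
    {
        "statement": [True, True, True, False, False, True],
        "name": [123, 456, 789, 147, 258, 369],
    },
    index=[1, 2, 3, 4, 5, 6],
)

# ~ flips each boolean, so the mask is True exactly where statement is False.
mask = ~df.statement
assert mask.equals(df.statement == False)

# Take the name column of the masked rows, then the first item by position.
first_name = df.loc[mask, "name"].iloc[0]
print(first_name)  # 147

# The chained form from the answer gives the same value.
assert df[~df.statement].name.iloc[0] == first_name
```

Note that `.iloc[0]` raises an IndexError when no row matches the filter, so it may be worth checking `mask.any()` first if that can happen in your data.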

9769953

One way to keep the code neat and readable is to filter with .query(), which fits well into a pandas method chain.

df.query('statement == False').name.iloc[0]

The .query() method often makes filtering code easier to read, since the condition is written as a single expression string.
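
A small sketch of the `.query()` variant, assuming the same sample data as in the question (the variable name `first_name` is only illustrative), with a check that it agrees with the boolean-mask approach:

```python
import pandas as pd

# Sample data mirroring the table in the question (index labels 1..6).
df = pd.DataFrame(
    {
        "statement": [True, True, True, False, False, True],
        "name": [123, 456, 789, 147, 258, 369],
    },
    index=[1, 2, 3, 4, 5, 6],
)

# query() filters rows using an expression string and returns a DataFrame,
# so the column selection and .iloc[0] can be chained directly onto it.
first_name = df.query("statement == False")["name"].iloc[0]
print(first_name)  # 147

# Same value as the boolean-mask approach.
assert first_name == df[~df.statement].name.iloc[0]
```

Bracket access (`["name"]`) is used here instead of attribute access; both work for this column, but brackets avoid clashes when a column name collides with an existing DataFrame attribute.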

Aditya Bhatt