df has columns A, B, C, D, E; assume column "A" holds strings and the rest are numbers.

`df["A"].where(df["B"] > 100).dropna()` returns the values of column "A" wherever "B" has a value > 100.

My question is: df["A"] (a view of the original df) does not have a column "B", so how can the where() condition refer to column B? [where() is applied to df["A"], not to the entire df.]

The type of df["A"] is a pandas Series, so it is confusing how a where() condition on column "B" gets applied to it.

intedgar
    Please provide a [reproducible minimal example](https://stackoverflow.com/q/20109391/8107362). Especially, provide some [sample data](https://stackoverflow.com/q/22418895/8107362), e.g. with `print(df.to_dict())`. – mnist Nov 21 '21 at 15:45

1 Answer

It's very easy to get all the columns of the dataframe, instead of just A.

Just remove the ["A"] part:

df["A"].where(df["B"] > 100).dropna()

to

df.where(df["B"] > 100).dropna()

Now you can do something like this:

>>> subset = df.where(df["B"] > 100).dropna()
>>> subset["B"]
...
>>> subset["A"]
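
As a minimal sketch of the session above, here is a runnable version. The sample values are made up for illustration, since the question does not include any data.

```python
import pandas as pd

# Hypothetical sample data; the question does not provide any.
df = pd.DataFrame({
    "A": ["a", "b", "c", "d"],
    "B": [50, 150, 200, 80],
    "C": [1, 2, 3, 4],
})

# Rows where the condition is False are filled with NaN,
# and dropna() then removes those rows.
subset = df.where(df["B"] > 100).dropna()

print(subset["A"].tolist())  # ['b', 'c']
print(subset["B"].tolist())  # [150.0, 200.0] (upcast to float because of the NaN step)
```

Every column of `subset` is available, not just A.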

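Note: instead of using where + dropna, a shorter solution is boolean indexing. It selects the same rows as long as the DataFrame has no other missing values (dropna() would drop those rows as well), and it keeps the original dtypes.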

Instead of

df.where(df["B"] > 100).dropna()

just use

df[df["B"] > 100]
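
On why the condition can mention column B even when where() is called on df["A"]: as far as I understand it, `df["B"] > 100` is evaluated first and produces a standalone boolean Series that shares the index of df. where() never needs a column "B" on df["A"]; it only receives that boolean Series and matches it to df["A"] by index label. A small sketch, again with hypothetical data:

```python
import pandas as pd

# Hypothetical sample data; the question does not provide any.
df = pd.DataFrame({"A": ["a", "b", "c"], "B": [50, 150, 200]})

# The condition is computed on its own, before where() is called.
mask = df["B"] > 100   # boolean Series with index 0, 1, 2
print(mask.tolist())   # [False, True, True]

# where() keeps the values of df["A"] where the mask is True and
# puts NaN elsewhere; the two Series are matched by index label.
result = df["A"].where(mask).dropna()
print(result.tolist())  # ['b', 'c']

# Boolean indexing with the same mask selects the same values.
print(df.loc[mask, "A"].tolist())  # ['b', 'c']
```

The name `mask` is only for illustration; passing the expression inline, as in the question, behaves the same way.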
  • Thanks for the explanation. Maybe I have not framed my question correctly: when where() is applied to df["A"] (a subset of the original df) and the condition inside where() checks another column "B", how can column "B" be referenced from df["A"], since it does not have a column B? – Swami Viswanadh Nov 21 '21 at 16:13
  • You framed your question correctly. If you remove the `["A"]` part, you can do something like `df.where(df["B"] > 100).dropna()["A"]` or `df.where(df["B"] > 100).dropna()["B"]`. –  Nov 21 '21 at 16:16