I have a dataframe from which I want to keep a row only if a value exists in a specific column. If the value in that column is empty or None, the row should be discarded. Below is a sample dataset:
| A | B | C |
| ----- | ----|---|
| Test1 | 111 | y |
| Test2 | 222 | y |
| Test3 | 333 | |
| Test4 | 444 | y |
If the data is missing in column C, then that row should be discarded.
The desired output is:
| A | B | C |
| ----- | ----|---|
| Test1 | 111 | y |
| Test2 | 222 | y |
| Test4 | 444 | y |
I am trying to achieve this with the snippet below, but it also produces duplicates:

    # lists that collect the values from columns A and B
    sample = []
    sample_1 = []
    for x in df["C"]:
        if x == "y":
            for y in df["A"]:
                sample.append(y)
            for z in df["B"]:
                sample_1.append(z)
        else:
            continue
Is it also possible to use `df.dropna()` in this case?
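
For reference, here is a minimal sketch of how the filtering could look. The question does not say whether the blank cell in `C` is `None`/`NaN` or an empty string, so both cases are shown. The dataframe below is rebuilt from the sample table with the blank cell assumed to be `None`, and the names `by_dropna`, `is_blank` and `by_mask` are only illustrative.

```python
import pandas as pd

# Sample data from the table above; the blank cell in C is assumed to be None.
df = pd.DataFrame({
    "A": ["Test1", "Test2", "Test3", "Test4"],
    "B": [111, 222, 333, 444],
    "C": ["y", "y", None, "y"],
})

# Case 1: the blank is a real missing value (None/NaN).
# dropna() restricted to column C removes only the rows where C is missing.
by_dropna = df.dropna(subset=["C"])

# Case 2: the blank might be an empty (or whitespace-only) string,
# which dropna() does not treat as missing.
# Build a boolean mask that flags None/NaN and blank strings alike.
is_blank = df["C"].fillna("").astype(str).str.strip().eq("")
by_mask = df[~is_blank]

print(by_mask)
```

Both results keep each row intact, so the nested loops (which append every value of `A` and `B` once per matching row, hence the duplicates) are not needed. If plain lists are still wanted, something like `by_mask["A"].tolist()` and `by_mask["B"].tolist()` should give them.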