How to filter rows in a dataframe using in and notin

Question

I have the following problem:

import pandas as pd
import numpy as np


data = {
  "ID": [420, 380, 390, 540, 520],
  "duration": [50, 40, 45, 33, 19],
  "previous": [125, 540, 420, 777, 390],
  "next": [390, 880, 520, 380, 810]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df)

#search for IDs where previous are not in IDs

#print(data[~data['previous'].isin(data['ID'])])

#print(data[~data.previous.isin(data.ID)])

print(data[~data.next.isin(data.ID)])

As you can see, I have a dataframe:

    ID  duration  previous  next
0  420        50       125   390
1  380        40       540   880
2  390        45       420   520
3  540        33       777   380
4  520        19       390   810

and I want to find the rows where the "next" (or the "previous") value is not among the IDs.

For example, row 0 has a previous (125) that does not exist in the IDs, and row 4 has a next (810) that does not exist in the IDs.

I tried to apply this solution, but I get the error "AttributeError: 'dict' object has no attribute 'next'".

Use `df` (the `DataFrame`), not `data` (the dictionary): `df[~df['next'].isin(df['ID'])]` works fine — mozway, Dec 17 '22 at 01:43
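
To illustrate the comment above, here is a minimal sketch of the filter applied to `df` rather than `data`, using the same sample values as the question. The names `missing_previous` and `missing_next` are only illustrative and are not part of the original code.

```python
import pandas as pd

data = {
    "ID": [420, 380, 390, 540, 520],
    "duration": [50, 40, 45, 33, 19],
    "previous": [125, 540, 420, 777, 390],
    "next": [390, 880, 520, 380, 810],
}

# Build the DataFrame; the filtering must be done on df, not on the dict
df = pd.DataFrame(data)

# Rows whose "previous" value does not appear in the "ID" column
missing_previous = df[~df["previous"].isin(df["ID"])]

# Rows whose "next" value does not appear in the "ID" column
missing_next = df[~df["next"].isin(df["ID"])]

print(missing_previous)  # rows 0 and 3 (previous = 125, 777)
print(missing_next)      # rows 1 and 4 (next = 880, 810)
```

Bracket access (`df["next"]`) is used here instead of attribute access (`df.next`) because attribute access can be shadowed when a column name matches an existing DataFrame attribute or method, so brackets are the safer habit.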
