Say I have a larger dataframe A and a smaller dataframe B, where B is a subset of A. Both dataframes share a matching key column, say it's called key.
I want to create a new dataframe, say C, which keeps only the rows of A that are not in B. E.g. if A contains 1000 rows and B contains 200 rows, then C should contain 1000 - 200 = 800 rows.
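To make this concrete, here is a minimal sketch of the kind of result I'm after. It assumes pandas dataframes and that key is unique within A. The toy data and the value column are made up for illustration. Filtering with isin is just one possible approach, not necessarily the best one.

```python
import pandas as pd

# Toy data: the "value" column and the chosen keys are made up for illustration.
A = pd.DataFrame({"key": range(10), "value": [k * 10 for k in range(10)]})
B = A[A["key"].isin([2, 5, 7])]  # B is a subset of A (3 rows)

# Keep only the rows of A whose key does not appear in B (an anti-join on "key").
C = A[~A["key"].isin(B["key"])]

print(len(A), len(B), len(C))  # 10 3 7
```

With unique keys, len(C) equals len(A) - len(B). If key could repeat in A, every row sharing a key with B would be dropped, so the row counts would no longer subtract so neatly.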
What is the best way of doing this? Using either dataframes or numpy arrays would work.
Many thanks!