Set-up
I have two pandas data frames, df1 and df2, each containing two columns with observations for id and its respective url,
df1:
| id | url |
------------
| 1 | url |
| 2 | url |
| 3 | url |
| 4 | url |

df2:
| id | url |
------------
| 2 | url |
| 4 | url |
| 3 | url |
| 5 | url |
| 6 | url |
Some observations are in both dfs, which is clear from the id column, e.g. observation 2 and its url are in both dfs.
The position of those 'double' observations does not have to be the same in both dfs, e.g. observation 2 is in the second row of df1 and the first row of df2.
Lastly, the dfs do not necessarily have the same number of observations, e.g. df1 has four observations while df2 has five.
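For reproducibility, the two frames described above can be built as follows. This is only a minimal sketch: the url values are placeholders, since the actual urls are not shown in the tables.

```python
import pandas as pd

# Placeholder urls (assumed); only the id values come from the tables above.
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "url": ["url"] * 4})
df2 = pd.DataFrame({"id": [2, 4, 3, 5, 6], "url": ["url"] * 5})

# df1 has four observations, df2 has five.
print(len(df1), len(df2))
```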
Problem
I want to extract all observations that appear only in df2 (i.e. not in df1) and insert them in a new df (df3), i.e. I want to obtain,
| id | url |
------------
| 5 | url |
| 6 | url |
How do I go about this?
I've tried this answer but cannot get it to work for my two-column dataframes.
I've also tried this other answer, but this gives me an empty common dataframe.
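One possible approach, sketched under the assumption that id alone identifies an observation (so comparing on id is enough): keep the rows of df2 whose id does not occur in df1, using Series.isin. The frames are rebuilt here with placeholder urls so the snippet runs on its own.

```python
import pandas as pd

# Placeholder urls (assumed); only the id values come from the question.
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "url": ["url"] * 4})
df2 = pd.DataFrame({"id": [2, 4, 3, 5, 6], "url": ["url"] * 5})

# Anti-join on id: keep rows of df2 whose id is absent from df1.
only_in_df2 = ~df2["id"].isin(df1["id"])
df3 = df2[only_in_df2].reset_index(drop=True)

print(df3)
```

If the url has to match as well, an alternative would be a left merge of df2 with df1 on both columns with indicator=True, keeping the rows where _merge equals "left_only". In that case an exact match on url is required, so differently formatted urls for the same id would count as different observations.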