
I am fairly new to Python and trying to make sense of the Pythonic/pandas way of doing things.

I have two data frames and I am trying to find the items in one that are not in the other.

import pandas as pd

df1 = pd.DataFrame({'items': ['shoes', 'socks', 'shoes'],
                    'coors': ['brown', 'red', 'black'],
                    'number': [1, 2, 3]})

df2 = pd.DataFrame({'items': ['shoes', 'socks', 'shoes'],
                    'coors': ['brown', 'red', 'pink'],
                    'number': [4, 5, 6]})

i.e.

fancy_subtract(df2, df1) = 3 brown shoes, 3 red socks, 6 pink shoes
fancy_subtract(df1, df2) = 3 black shoes

I've tried subtracting the data frames directly (which didn't work, for obvious reasons). Clearly, you can do it with a for loop, but that doesn't feel elegant, or like it takes advantage of how Python/pandas works.

Abijah
  • Does this answer your question? [Python Pandas - Find difference between two data frames](https://stackoverflow.com/questions/48647534/python-pandas-find-difference-between-two-data-frames) – Michael M. Sep 17 '22 at 16:00
  • This seems like more of a job for a [Counter](https://docs.python.org/3/library/collections.html#collections.Counter) than a dataframe, but I'm not a Pandas expert. – wjandrea Sep 17 '22 at 16:03
  • Something similar to [this answer](/a/54540883/15497888) _e.g._ `new_df = df2.set_index(['items', 'coors']).sub(df1.set_index(['items', 'coors']), fill_value=0)` – Henry Ecker Sep 17 '22 at 16:10
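
For reference, the `sub(..., fill_value=0)` idea from the last comment could be fleshed out roughly as follows. This is only a sketch, not a confirmed answer: the name `fancy_subtract` comes from the question, while the `keys` parameter, the `groupby(...).sum()` step (to tolerate repeated item/color pairs), and the choice to drop zero or negative results (inferred from the expected output in the question) are assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({'items': ['shoes', 'socks', 'shoes'],
                    'coors': ['brown', 'red', 'black'],
                    'number': [1, 2, 3]})
df2 = pd.DataFrame({'items': ['shoes', 'socks', 'shoes'],
                    'coors': ['brown', 'red', 'pink'],
                    'number': [4, 5, 6]})


def fancy_subtract(left, right, keys=('items', 'coors')):
    """Subtract right's quantities from left's, matching rows on `keys`.

    Assumption: only strictly positive differences are kept, which is
    what the expected output in the question implies.
    """
    keys = list(keys)
    # Collapse each frame to one total per (items, coors) pair.
    left_totals = left.groupby(keys)['number'].sum()
    right_totals = right.groupby(keys)['number'].sum()
    # Align on the key pairs; a pair missing from one side counts as 0.
    diff = left_totals.sub(right_totals, fill_value=0)
    # Keep what is left over and restore integer quantities.
    diff = diff[diff > 0].astype(int)
    return diff.reset_index()


print(fancy_subtract(df2, df1))
print(fancy_subtract(df1, df2))
```

Filtering on `diff > 0` gives the same "never go below zero" behaviour as subtracting two `collections.Counter` objects, which is the alternative raised in the second comment.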

0 Answers