I have a superset dataframe and a subset dataframe. The superset has n columns and the subset has m columns (n > m).
The requirement is to compare the m columns of the subset with the superset columns that have matching headings.
Note: the dataframes contain data from CSV files; the subset is the reference file and the superset is the whole tool output file. Both dataframes will vary depending on the requirement under test.
e.g. 1:
Superset columns:
Car_Brand, Car_model, Color, Year, Engine
Subset columns:
Year, Engine
I have to log a failure if the entries of Year and Engine do not match between the two dataframes.
e.g. 2:
Superset columns:
Car_Brand, Car_model, Color, Year, Engine, Country, Rating, Price
Subset columns:
Car_model, Rating, Price
I have to log a failure if the entries of Car_model, Rating, Price do not match between the two dataframes.
There are hundreds of such cases, so I need a generic way to merge the superset and subset based on the column names of the subset.
How can I achieve this?
Something like:
common_df = superset_df.merge(subset_df, on=subset_df.columns[0], how='inner')
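
To make the idea concrete, here is a minimal sketch of what I am picturing. It assumes rows should be matched by value on all of the subset's columns (not by row position), and that the shared columns have the same dtypes in both files. The function name find_mismatches and the logger setup are hypothetical, only for illustration; merge with indicator=True is a standard pandas option.

```python
import logging

import pandas as pd

logger = logging.getLogger(__name__)


def find_mismatches(superset_df: pd.DataFrame, subset_df: pd.DataFrame) -> pd.DataFrame:
    """Return the subset rows that have no matching row in the superset.

    Hypothetical helper: matching is done by value on every column of the
    subset, so row order in the two files does not matter.
    """
    cols = list(subset_df.columns)

    # Fail early if the reference file names a column the tool output lacks.
    missing = [c for c in cols if c not in superset_df.columns]
    if missing:
        raise KeyError(f"Columns not found in superset: {missing}")

    # Keep only the shared columns and drop duplicate rows so the merge
    # cannot multiply subset rows.
    reference = superset_df[cols].drop_duplicates()

    # A left merge with indicator=True adds a "_merge" column that marks
    # subset rows with no counterpart in the superset as "left_only".
    merged = subset_df.merge(reference, on=cols, how="left", indicator=True)
    mismatches = merged.loc[merged["_merge"] == "left_only", cols]

    for row in mismatches.to_dict("records"):
        logger.error("No matching row in superset: %s", row)

    return mismatches.reset_index(drop=True)
```

With this, each case would only need find_mismatches(superset_df, subset_df), and an empty result would mean the case passes. If the comparison has to be row by row in file order instead, this approach would not be the right one.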