0

I have 2 columns of data with the first column being product codes (all filled out) and the second column with product description.

The first column has all the product codes filled out but there are some rows where the product description (second column) is missing.

For example row 200 has a product code of 145 but the description on that row is empty (NaN). However, there are other rows with product code 145 where the description exists, which is "laptop". I would like to have the description of row 200 to be filled with "laptop" because that's the description for that product code.

I want to find a solution where I can fill out all NaN values in the second column (product description) based on the first column (product code).

lazercity
  • 1
  • 1
  • You can look at [`groupby.transform`](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.transform.html). then get the `first` per group and `fillna` missing values. I believe with the previous indications you can find out, else please post a [more complete mre](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) to show some effort in posting your question and attempt in solving it. – Ben.T Dec 04 '21 at 03:13
  • df.loc[:col] = df[col].fillna(df.groupby(groupby_col)[col].transform("median")).values – jwzinserl Dec 04 '21 at 03:27

1 Answers1

0

First, decide on a function that takes descriptions and picks out one of them. You could use min, max, mode, define you own get_desc, etc. Then you can separate the dataframe by product code with groupby and apply whatever function you decided on: df.groupby('product code').apply(get_desc) or df.groupby('product code')['product description'].apply(get_desc) depending on whether get_desc takes a dataframe or column as input. Then you can merge the resulting dataframe with your original dataframe. You can either replace the entire original product description column with the product description column of the groupby output, or have merge create a new column, then fillna the old product description with the new product description.

Acccumulation
  • 3,491
  • 1
  • 8
  • 12