Find quick method for Filling Missing Categorical Values with Categories already in the Same Column using a Second Column as Correspondence

Question

As you can see in the screenshot, each value in the code column corresponds to a value in the name column: code holds the project code and name holds the project name.

How can I fill the missing values in the name column, using the code column to look up the matching name? Is there an existing method that handles this?

      code  name                countryshortname
212   1                         Liberia
2     1     Economic management Tunisia
1211  1     Economic management Tonga
1045  1     Economic management Macedonia, former Yugoslav Republic of
363   1                         Cote d'Ivoire
453   1     Economic management Lesotho
784   1     Economic management Tanzania
648   1     Economic management Lao People's Democratic Republic
647   1     Economic management Lao People's Democratic Republic
204   1     Economic management Kyrgyz Republic
205   1     Economic management Kyrgyz Republic
249   1     Economic management Armenia
1437  1                         Guatemala
1212  1     Economic management Kenya
1114  1                         Honduras

[ScreenShot of Actual Dataframe][1]


  [1]: https://i.stack.imgur.com/eq6xt.png

Please provide samples of your dataframes in the question text, not as images. Additionally, what have you tried so far? Please provide a [mcve] — G. Anderson, Mar 27 '19 at 20:14
i haven't tried anything yet, because the drop methods don't look like they would work. i was trying to write a for loop but i figured i should ask about already existing methods before writing a function. i don't know how to provide a sample without typing out a part of the dataframe. — Carlos Rivas, Mar 27 '19 at 20:25
It's hard to tell from your question, but this looks like it could be done with a merge or join maybe. As for providing the data, typing up or copy-pasting a sample _is_ what you should do. `df.head()` or `df.sample()` are great for creating small dfs to paste in — G. Anderson, Mar 27 '19 at 20:30
i believe merge is for merging several dataframes on a column, that's not what i'm doing. — Carlos Rivas, Mar 27 '19 at 20:51
Sorry, I misunderstood the question, but [here is a question](https://stackoverflow.com/questions/29357379/pandas-fill-missing-values-in-dataframe-from-another-dataframe) with some answers you might find helpful — G. Anderson, Mar 27 '19 at 20:57
i don't see the solution in that question. again, i'm using one dataframe. there is a correspondence between code elements in 'code' and the name elements in 'name'. there should be a function that deals with that correspondence when a value is missing from either column, otherwise i should just go ahead and build one on my own. — Carlos Rivas, Mar 27 '19 at 22:26

score 0 · Answer 1 · answered Mar 28 '19 at 15:45

Because there is only one code/name combination in your question, I couldn't test this more fully. But if you sort by code and then by name, the known names come before the NaN values within each code, so you should be able to forward fill the NaN values:

# NaN sorts last within each code, so ffill copies the known name down; assignment realigns on the index
df['name'] = df.sort_values(['code', 'name'])['name'].ffill()

        code    name    countryshortname
212     1   Economic    Liberia
2       1   Economic    Tunisia
1211    1   Economic    Tonga
1045    1   Economic    Macedonia
363     1   Economic    Cote
453     1   Economic    Lesotho
784     1   Economic    Tanzania
648     1   Economic    Lao
647     1   Economic    Lao
204     1   Economic    Kyrgyz
205     1   Economic    Kyrgyz
249     1   Economic    Armenia
1437    1   Economic    Guatemala
1212    1   Economic    Kenya
1114    1   Economic    Honduras
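
One caveat with sorting and forward filling: if some code has no known name at all, the fill carries over the name from the previous code. A possible alternative, shown here only as a minimal sketch, is to build a code-to-name lookup from the rows where the name is known and map it back onto the missing rows. The sample frame below is made up for illustration: the second code and its name "Public sector governance" are hypothetical, and the sketch assumes blanks may be stored either as NaN or as empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; only code 1 / "Economic management" comes from the question.
df = pd.DataFrame({
    "code": [1, 1, 1, 2, 2],
    "name": ["", "Economic management", "Economic management",
             "Public sector governance", np.nan],
})

# Treat empty strings as missing so they are handled like NaN.
df["name"] = df["name"].mask(df["name"] == "")

# One known name per code, taken from the rows that already have a name.
lookup = (
    df.dropna(subset=["name"])
      .drop_duplicates("code")
      .set_index("code")["name"]
)

# Fill only the missing names, looking each one up by its code.
df["name"] = df["name"].fillna(df["code"].map(lookup))
```

A code that never appears with a name is left as NaN rather than inheriting a neighbouring code's name.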
