I am a novice at R programming and am trying to use it for my data handling.
I am trying to create a new data frame by replacing some elements with the most frequently occurring element in my data frame.
My original data frame looks like this:
df:
id | first_name | last_name | info_1 | infor_2
---|------------|-----------|--------|-------
1 | Hillary | Clinton | 2 | 3
1 | Hillary | Clinton | 10 | 2
2 | Donald | Trump | 5 | 6
2 | Donald | Trump | 3 | 8
4 | Hillary | Clinton | 9 | 5
3 | Bernie | Sanders | 5 | 0
3 | Donald | Trump | 4 | 9
3 | Bernie | Sanders | 24 | 9
6 | Bernie | Sanders | 24 | 9
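For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a sketch: it assumes `id`, `info_1` and `infor_2` are numeric and that the name columns are plain character strings rather than factors.

```r
# Reproducible version of the original table.
# Assumption: numeric id/info columns, character name columns.
df <- data.frame(
  id         = c(1, 1, 2, 2, 4, 3, 3, 3, 6),
  first_name = c("Hillary", "Hillary", "Donald", "Donald", "Hillary",
                 "Bernie", "Donald", "Bernie", "Bernie"),
  last_name  = c("Clinton", "Clinton", "Trump", "Trump", "Clinton",
                 "Sanders", "Trump", "Sanders", "Sanders"),
  info_1     = c(2, 10, 5, 3, 9, 5, 4, 24, 24),
  infor_2    = c(3, 2, 6, 8, 5, 0, 9, 9, 9),
  stringsAsFactors = FALSE
)
```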
The new data frame should look like this:
new_df:
id | first_name | last_name | info_1 | infor_2
---|------------|-----------|--------|-------
1 | Hillary | Clinton | 2 | 3
1 | Hillary | Clinton | 10 | 2
2 | Donald | Trump | 5 | 6
2 | Donald | Trump | 3 | 8
1 | Hillary | Clinton | 9 | 5
3 | Bernie | Sanders | 5 | 0
2 | Donald | Trump | 4 | 9
3 | Bernie | Sanders | 24 | 9
3 | Bernie | Sanders | 24 | 9
As you can see in the first data frame, "1" is the most frequently occurring id for Hillary Clinton, but "4" appears in the 5th row. So, I want to replace every id for Hillary Clinton with "1". The same operation should be applied to all the other names (Bernie Sanders and Donald Trump).
To my understanding, it can be done with "if" and "for", but I couldn't find a clear solution.
Any help would be appreciated!
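The rule described above could be sketched in base R along these lines, using grouping instead of explicit `if`/`for`. This is an illustration under assumptions, not a definitive answer: `most_frequent` is a hypothetical helper name, a person is identified by first and last name together, and ties are broken in favour of the id that appears first (the question does not say what should happen on a tie).

```r
# Original data, rebuilt here so the sketch runs on its own.
df <- data.frame(
  id         = c(1, 1, 2, 2, 4, 3, 3, 3, 6),
  first_name = c("Hillary", "Hillary", "Donald", "Donald", "Hillary",
                 "Bernie", "Donald", "Bernie", "Bernie"),
  last_name  = c("Clinton", "Clinton", "Trump", "Trump", "Clinton",
                 "Sanders", "Trump", "Sanders", "Sanders"),
  info_1     = c(2, 10, 5, 3, 9, 5, 4, 24, 24),
  infor_2    = c(3, 2, 6, 8, 5, 0, 9, 9, 9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the most frequent value of x.
# Assumption: on a tie, the value that appears first wins.
most_frequent <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# One grouping key per person (first and last name combined).
person <- paste(df$first_name, df$last_name)

# ave() applies most_frequent() within each person and
# recycles the single result over that person's rows.
new_df <- df
new_df$id <- ave(df$id, person, FUN = most_frequent)

print(new_df)
```

`ave()` returns a vector of the same length as its input, so the other columns keep their original row order.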
Joseph