Disclaimer: I'm not entirely clear on what your expected output is. My solution below is based on the assumption that you want to replace NA
values with either the mean of all values for every year or with the mean value of all values for every Index
.
Here is a tidyverse
option first spreading from wide to long, replacing NA
s with the mean value per year, and finally converting back from long to wide.
library(tidyverse)
df %>%
gather(year, value, -Index) %>%
group_by(year) %>%
mutate(value = ifelse(is.na(value), mean(value, na.rm = T), value)) %>%
spread(year, value)
## A tibble: 8 x 4
# Index `2015` `2016` `2017`
# <int> <dbl> <dbl> <dbl>
#1 1 115507293. 6355698. 10107023.
#2 2 13000000. 223472356. 186197372.
#3 4 115507293. 223472356. 186197372.
#4 5 115507293. 223472356. 8000000.
#5 6 331000000. 659000000. 1040000000.
#6 7 115507293. 223472356. 32032920.
#7 8 115507293. 223472356. 20000000.
#8 9 2521880. 5061370. 7044288.
Note that here we replace NA
s with mean value per year. If instead you want to replace NA
s with the mean value per Index
value, simply replace group_by(year)
with group_by(Index)
:
df %>%
gather(year, value, -Index) %>%
group_by(Index) %>%
mutate(value = ifelse(is.na(value), mean(value, na.rm = T), value)) %>%
spread(year, value)
## A tibble: 8 x 4
## Groups: Index [8]
# Index `2015` `2016` `2017`
# <int> <dbl> <dbl> <dbl>
#1 1 8231360. 6355698. 10107023.
#2 2 13000000. 13000000. 13000000.
#3 4 NaN NaN NaN
#4 5 8000000. 8000000. 8000000.
#5 6 331000000. 659000000. 1040000000.
#6 7 32032920. 32032920. 32032920.
#7 8 20000000. 20000000. 20000000.
#8 9 2521880. 5061370. 7044288.
Update
To only replace NA
s in column 2017
with the row average based on the 2015
,2016
values you can do
df <- read_table("Index 2015 2016 2017
1 NA 6355698 10107023
2 13000000 73050000 NA
4 NA NA NA
5 10500000 NA 8000000
6 331000000 659000000 1040000000
7 55500000 NA 32032920
8 NA NA 20000000
9 2521880 5061370 7044288")
df %>%
mutate(`2017` = ifelse(is.na(`2017`), 0.5 * (`2015` + `2016`), `2017`))
## A tibble: 8 x 4
# Index `2015` `2016` `2017`
# <int> <int> <int> <dbl>
#1 1 NA 6355698 10107023.
#2 2 13000000 73050000 43025000.
#3 4 NA NA NA
#4 5 10500000 NA 8000000.
#5 6 331000000 659000000 1040000000.
#6 7 55500000 NA 32032920.
#7 8 NA NA 20000000.
#8 9 2521880 5061370 7044288.
Sample data
df <- read_table("Index 2015 2016 2017
1 NA 6355698 10107023
2 13000000 NA NA
4 NA NA NA
5 NA NA 8000000
6 331000000 659000000 1040000000
7 NA NA 32032920
8 NA NA 20000000
9 2521880 5061370 7044288")