1

In case 2017 is NA and columns of 2015 and 2016 have value, I want to assign average of them to 2017 based on the same row.

Index   2015            2016            2017
1       NA              6355698         10107023
2       13000000        73050000        NA
4       NA              NA              NA
5       10500000        NA              8000000
6       331000000       659000000       1040000000
7       55500000        NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288
...

Here is that I tried, didn't work!

ind <- which(is.na(df), arr.ind=TRUE)
df[ind] <- rowMeans(df,  na.rm = TRUE)[ind[,1]]

Also if we have values in 2015 and 2017 columns and 2016 is NA, I want to assign average of them to the column of 2016 based on the same row. Any help would be appreciated!

1 Answers1

1

Disclaimer: I'm not entirely clear on what your expected output is. My solution below is based on the assumption that you want to replace NA values with either the mean of all values for every year or with the mean value of all values for every Index.

Here is a tidyverse option first spreading from wide to long, replacing NAs with the mean value per year, and finally converting back from long to wide.

library(tidyverse)
df %>%
    gather(year, value, -Index) %>%
    group_by(year) %>%
    mutate(value = ifelse(is.na(value), mean(value, na.rm = T), value)) %>%
    spread(year, value)
## A tibble: 8 x 4
#  Index     `2015`     `2016`      `2017`
#  <int>      <dbl>      <dbl>       <dbl>
#1     1 115507293.   6355698.   10107023.
#2     2  13000000. 223472356.  186197372.
#3     4 115507293. 223472356.  186197372.
#4     5 115507293. 223472356.    8000000.
#5     6 331000000. 659000000. 1040000000.
#6     7 115507293. 223472356.   32032920.
#7     8 115507293. 223472356.   20000000.
#8     9   2521880.   5061370.    7044288.

Note that here we replace NAs with mean value per year. If instead you want to replace NAs with the mean value per Index value, simply replace group_by(year) with group_by(Index):

df %>%
    gather(year, value, -Index) %>%
    group_by(Index) %>%
    mutate(value = ifelse(is.na(value), mean(value, na.rm = T), value)) %>%
    spread(year, value)
## A tibble: 8 x 4
## Groups:   Index [8]
#  Index     `2015`     `2016`      `2017`
#  <int>      <dbl>      <dbl>       <dbl>
#1     1   8231360.   6355698.   10107023.
#2     2  13000000.  13000000.   13000000.
#3     4       NaN        NaN         NaN
#4     5   8000000.   8000000.    8000000.
#5     6 331000000. 659000000. 1040000000.
#6     7  32032920.  32032920.   32032920.
#7     8  20000000.  20000000.   20000000.
#8     9   2521880.   5061370.    7044288.

Update

To only replace NAs in column 2017 with the row average based on the 2015,2016 values you can do

df <- read_table("Index   2015            2016            2017
1       NA              6355698         10107023
2       13000000        73050000        NA
4       NA              NA              NA
5       10500000        NA              8000000
6       331000000       659000000       1040000000
7       55500000        NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288")


df %>%
    mutate(`2017` = ifelse(is.na(`2017`), 0.5 * (`2015` + `2016`), `2017`))
## A tibble: 8 x 4
#  Index    `2015`    `2016`      `2017`
#  <int>     <int>     <int>       <dbl>
#1     1        NA   6355698   10107023.
#2     2  13000000  73050000   43025000.
#3     4        NA        NA         NA
#4     5  10500000        NA    8000000.
#5     6 331000000 659000000 1040000000.
#6     7  55500000        NA   32032920.
#7     8        NA        NA   20000000.
#8     9   2521880   5061370    7044288.

Sample data

df <- read_table("Index   2015            2016            2017
1       NA              6355698         10107023
2       13000000        NA              NA
4       NA              NA              NA
5       NA              NA              8000000
6       331000000       659000000       1040000000
7       NA              NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288")
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68