0

I have a df:

df= data.frame(year=c(rep(2018,4),rep(2017,3)),Area=c(1:4,1:3),P=1:7,N=1:7)

I want to split it by years, and then merge everything together again so I can see years as columns for each area. In order to do this, I am splitting and merging:

s=split(df,df$year)
m=merge(s[[1]][,2:4],[s[[2]][,2:4],by='Area',all=1)
colnames(m)=c('area','P2018','C2018','P2017','C2017')

I am sure there is a more efficient way, expecially as the possibility for errors is very high once I include data from other years.

Any suggestions?

frank
  • 3,036
  • 7
  • 33
  • 65

2 Answers2

2

We can gather data to long form excluding year and Area column, unite the year and then spread it to wide format.

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -year, -Area) %>%
  unite(key, key, year, sep = "") %>%
  spread(key, value)

#  Area N2017 N2018 P2017 P2018
#1    1     5     1     5     1
#2    2     6     2     6     2
#3    3     7     3     7     3
#4    4    NA     4    NA     4
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

We can do this with dcast from data.table which can take multiple value.var columns

library(data.table)
dcast(setDT(df), Area ~ year, value.var = c("P", "N"))
#  Area P_2017 P_2018 N_2017 N_2018
#1:    1      5      1      5      1
#2:    2      6      2      6      2
#3:    3      7      3      7      3
#4:    4     NA      4     NA      4
akrun
  • 874,273
  • 37
  • 540
  • 662