0

I have a series of variable-specific data.frames like the following:

Variable 1:

LOCATION 2014.01  2014.02  2014.03  2014.04
902010   7        -3       14       5

Variable 2:

LOCATION 2014.01  2014.02  2014.03  2014.04
902010   9        11       6        3

I'd like to transform transform and combine them into the following format.

LOCATION    DATE     VARIABLE1  VARIABLE2
902010      2014.01  7          9
902010      2014.02  -3         11
902010      2014.03  14         6
902010      2014.04  5          3

I think the right approach is to transform one data.frame then add data from the others where date and location match.

The answers to similar-sounding questions recommend the library reshape2 or the function t(), but I'm having a hard time understanding how to use them for this scenario. I've also seen recommendations for the time-series library zoo.

Thalecress
  • 3,231
  • 9
  • 35
  • 47

4 Answers4

1

You should:

  1. reshape your data in the long format using melt
  2. combine the data.frames by (LOCATION and dates) using merge

Here a complete solution: library(reshape2)

dat1 <- 'LOCATION 2014.01  2014.02  2014.03  2014.04
902010   7        -3       14       5'

dat2 <- 'LOCATION 2014.01  2014.02  2014.03  2014.04
902010   9        11       6        3'

ll <- lapply(list(dat1,dat2),function(x){
dat <- read.table(text=x,header=TRUE)
melt(dat,id.vars ='LOCATION')
})

res <- merge(ll[[1]],ll[[2]],by=1:2)

# LOCATION variable value.x value.y
# 1   902010 X2014.01       7       9
# 2   902010 X2014.02      -3      11
# 3   902010 X2014.03      14       6
# 4   902010 X2014.04       5       3

An extra step is to coerce you variable column to a valid date.

res$variable <- 
as.Date(paste0(sub('X','',res$variable),'.01'),
        format='%Y.%m.%d')
agstudy
  • 119,832
  • 17
  • 199
  • 261
1

1) Define a merge function, merge12 which merges by the first two variables and a melt function, melt1, which uses the first variable as the id. Then define L, a list of data frames, and run Reduce like this:

library(reshape2)

merge12 <- function(..., by = 1:2) merge(..., by = by)
melt1 <- function(..., id = 1) melt(..., id = id)

L <- list(DF1, DF2)
Reduce(merge12, lapply(L, melt1))

For the data frames in the question this gives the following

  LOCATION variable value.x value.y
1   902010  2014.01       7       9
2   902010  2014.02      -3      11
3   902010  2014.03      14       6
4   902010  2014.04       5       3

Note: This works for more than two data frames as well. Just put them all in the list L.

2) fn$ Using fn$ in gsubfn which allows function arguments to be expressed as formulas we can write the above in more compact way:

library(reshape2)
library(gsubfn)

L <- list(DF1, DF2)
fn$Reduce(~ merge(..., by = 1:2), fn$lapply(L, ~ melt(..., id = 1)))

giving the same result.

ADDED (2)

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
0

have you tried merge() function?

You might also find it useful! Merge or combine by rownames

Community
  • 1
  • 1
Alex Kors
  • 182
  • 1
  • 4
  • 10
0

Here's one way to get there using base R functions.

> cb <- cbind(var1[1], names(var1)[-1], sapply(list(var1[-1], var2[-1]), t))
> names(cb)[2:4] <- c("DATE", "VAR1", "VAR2") 
> cb
#   LOCATION     DATE VAR1 VAR2
# 1   902010 X2014.01    7    9
# 2   902010 X2014.02   -3   11
# 3   902010 X2014.03   14    6
# 4   902010 X2014.04    5    3

And if you don't mind the columns being in a different order, it can be shortened to three columns,

> rb <- rbind(var1 = var1, var2 = var2)
> cbind(t(rb[-1]), LOC = var1$LOCATION)
#          var1 var2    LOC
# X2014.01    7    9 902010
# X2014.02   -3   11 902010
# X2014.03   14    6 902010
# X2014.04    5    3 902010
Rich Scriven
  • 97,041
  • 11
  • 181
  • 245