I hope someone can help me. I have been stuck on this for a while, although it does not seem that difficult to solve.

I am having trouble copying values from one data frame to another inside a loop that creates data frames.

I build the object names from strings because I have to create many data frames. The problem occurs in the last line, set(get(paste0(letters[i],"_id")), j = 1L, value = Values_df[,i]): it copies the last column, Values_df[,"C"], into the first column of A_id and B_id as well. What I want instead is for each "id" data frame to receive its own column of Values_df in its first column.

Toy example:

rm(list = ls())
farms<-c("farm1","farm2","farm3","farm4")
qys<-expand.grid(c("Q1","Q2","Q3","Q4"),sprintf("Y%s", seq(1:10)))
qys<-paste(qys$Var2,qys$Var1)
basedata<-data.frame(matrix(NA,nrow=length(farms),ncol=length(qys)))
row.names(basedata)<-farms
colnames(basedata)<-qys
letters<-c("A","B","C")
Values_df<-data.frame(matrix(rexp(12, rate=.1), ncol=length(letters), nrow=length(farms)))
colnames(Values_df)<-letters
rownames(Values_df)<-farms
library(data.table)
for (i in 1:length(letters)){
  assign(paste0(letters[i],"_id"),basedata)
  set(get(paste0(letters[i],"_id")), j = 1L, value = Values_df[,i]) #PROBLEM
}

The desired output is:

A_id[,1]<-Values_df[,1]
B_id[,1]<-Values_df[,2]
C_id[,1]<-Values_df[,3]

but I am getting:

A_id[,1]<-Values_df[,3]
B_id[,1]<-Values_df[,3]
C_id[,1]<-Values_df[,3]
understorey
  • I'm not clear on your desired output. Anyway, cluttering your namespace with loads of tiny tables is probably the wrong way to do this. More natural to me is to stack all of the data into one long data set and keep one column tracking the `id`. – MichaelChirico Jun 29 '18 at 10:29
  • I just edited the question with the desired output and what I am getting with my code. Thanks for the input. – understorey Jun 29 '18 at 10:37
  • `letters` is a built-in vector; probably best not to overwrite it (to avoid confusion). Anyway, I agree with Michael; you can/should do something more like `rbindlist(lapply(setNames(1:3, letters[1:3]), function(x) data.table(basedata, keep.rownames = TRUE)[, (2) := Values_df[[x]]]), idcol = "let")` instead of using `assign` and `get`. If you really want to do it your way, using `copy(basedata)` in place of `basedata` should work, as Emmanuel answered. – Frank Jun 29 '18 at 14:32

1 Answer

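I'm not quite sure why (I suspect assign), but it seems that your data frames (A_id, B_id, ...) are not distinct objects: they are just different names pointing to the same object in RAM. That would explain the result, because set modifies its target by reference, so each iteration overwrites the first column of that one shared object and only the last write, Values_df[,3], survives.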
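To make the aliasing concrete, here is a minimal sketch, assuming a small stand-in data frame (the names `base`, `alias` and `indep` are hypothetical and not part of the question). It is meant only to illustrate that `assign` binds a new name without copying, while `set` writes by reference:

```r
library(data.table)

# Hypothetical two-row stand-in for basedata
base <- data.frame(x = c(NA_real_, NA_real_), y = c(NA_real_, NA_real_))

# assign() binds a second name to the same object; nothing is copied here
assign("alias", base)

# set() updates by reference, so the write should be visible through both names
set(alias, j = 1L, value = c(1, 2))
print(base$x)

# copy() creates an independent object, so writing to it leaves base alone
indep <- copy(base)
set(indep, j = 1L, value = c(10, 20))
print(base$x)
print(indep$x)
```

With ordinary R assignment (`alias[, 1] <- ...`) the object would be duplicated on modification; `set` deliberately skips that step, which is why the shared binding matters here.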

A workaround is to use data.table::copy to make an independent copy of the object in RAM.

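for (i in seq_along(letters)){
  # copy() gives each *_id its own object instead of another name for basedata
  assign(paste0(letters[i],"_id"), copy(basedata))
  # i = NULL selects all rows; column 1 is overwritten by reference
  set(get(paste0(letters[i],"_id")), NULL, j = 1L, value = Values_df[,i])
}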

NB: This will solve your problem, but as @MichaelChirico said, cluttering your namespace with loads of tiny tables is probably the wrong way to do this.
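As a rough sketch of what the commenters suggest, the tables can live in a named list, or be stacked into one long table, instead of being created with `assign`. The data below is a hypothetical miniature of the question's setup (`ids`, `values_df`, `template`, `tables` and `long` are illustrative names, and the two quarter columns stand in for the full `basedata`):

```r
library(data.table)

farms <- c("farm1", "farm2", "farm3", "farm4")
ids <- c("A", "B", "C")  # avoids overwriting the built-in `letters`

# Hypothetical stand-in for Values_df: one column per id, one row per farm
values_df <- data.frame(A = as.numeric(1:4), B = as.numeric(5:8),
                        C = as.numeric(9:12), row.names = farms)

# Hypothetical stand-in for basedata, with only two quarter columns
template <- data.frame(`Y1 Q1` = rep(NA_real_, 4), `Y1 Q2` = rep(NA_real_, 4),
                       row.names = farms, check.names = FALSE)

# Option 1: a named list holding one independent table per id
tables <- lapply(setNames(ids, ids), function(id) {
  out <- copy(template)                       # independent object per id
  set(out, j = 1L, value = values_df[[id]])   # fill the first column
  out
})
print(tables$B)

# Option 2: stack everything into one long table, with `id` tracking the source
long <- rbindlist(lapply(tables, as.data.table, keep.rownames = "farm"),
                  idcol = "id")
print(long)
```

Either form keeps the workspace to a single object, and each table is reachable by name (`tables[["A"]]`) or by filtering (`long[id == "A"]`).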

References: As suggested by @Frank, here is a reference on copy versus reference of data.table objects.

Emmanuel-Lin