I have many data frames that I'm trying to merge. Some of them have the form:

    sites1  AA1 SA
1   13: C   0.360828
2   14: S   0.017064
3   15: I   0.010810

Others are:

    sites2  AA2 Freq
1   1:  X   0.013
2   1:  S   0.987
3   2:  L   1.000

Another data frame, names2, pairs each data frame from the first set with its counterpart in the second set. It looks like this:

    V1  V2
1   1JH6    AT4G18930
2   3MXZ    AT2G30410

The name in the left column (V1) is the name of a data frame from the first set, and the name in the right column (V2) is the name of a data frame from the second set. I'm trying to merge each pair by doing

for (i in 1:n){
  name = paste("1",names2[i,2])
  assign(name,merge(names2[i,1],names2[i,2]))
}

but this just returns a data frame containing the two names. Any help?

  • can you please dput your data structure? http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – agstudy Dec 03 '12 at 04:18
  • So I'm new to this and I'm not really sure how the dput thing works. The file is pretty big though, is there any way to post it? – user1871524 Dec 03 '12 at 04:32
  • it's not clear how the `names2` data frame relates table 1 to table 2. – Ricardo Saporta Dec 03 '12 at 04:32
  • @user1871524, just type in `dput( yourDataFrame )` and copy+paste the output. Please repeat for each relevant data frame. – Ricardo Saporta Dec 03 '12 at 04:33
  • the output is too large for me to post here. for the names in names2, the ones on the left side (names2[i,1]) are the names of tables like table 1 and on the right side are the names of tables like table 2 – user1871524 Dec 03 '12 at 04:35
  • @user1871524 if it's too big then dput( head( yourDataFrame ) ) plz ;) – Anthony Damico Dec 03 '12 at 05:10

1 Answer

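Try replacing the assign statement inside your for loop with the following. Your version passes the two name strings themselves to merge; get() looks up the data frame that each name refers to: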

     assign(name,merge(get(as.character(names2[i,1])), 
                       get(as.character(names2[i,2]))))

Also, consider fixing the name = paste.... statement as follows:

   name = paste("T1",names2[i,2], sep="")
   # added sep="" so the name does not contain a space
   # changed the prefix so that the name does not start with a number
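
For reference, here is a self-contained sketch of the loop with both changes applied. The toy tables reuse the values shown in the question, but pairing them with the names 1JH6 and AT4G18930 is an assumption made only so the example runs. Collecting the results in a list called merged (a hypothetical name) instead of using assign is a variation, not part of the suggestion above.

```r
# Toy stand-ins for the real tables. The values come from the question;
# which table belongs to which name is assumed for illustration.
`1JH6` <- data.frame(sites1 = c("13:", "14:", "15:"),
                     AA1    = c("C", "S", "I"),
                     SA     = c(0.360828, 0.017064, 0.010810),
                     stringsAsFactors = FALSE)
AT4G18930 <- data.frame(sites2 = c("1:", "1:", "2:"),
                        AA2    = c("X", "S", "L"),
                        Freq   = c(0.013, 0.987, 1.000),
                        stringsAsFactors = FALSE)

# Lookup table: each row pairs the name of a first-set table (V1)
# with the name of a second-set table (V2).
names2 <- data.frame(V1 = "1JH6", V2 = "AT4G18930",
                     stringsAsFactors = FALSE)

merged <- list()  # hypothetical container for the results
for (i in seq_len(nrow(names2))) {
  # get() turns a name string into the object it refers to
  left  <- get(as.character(names2[i, 1]))
  right <- get(as.character(names2[i, 2]))
  # paste0() is paste(..., sep = ""), so the new name has no space
  merged[[paste0("T1", names2[i, 2])]] <- merge(left, right)
}

names(merged)
```

A list keeps all merged tables in one place, so they can be inspected with merged[["T1AT4G18930"]] or processed together with lapply.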
Ricardo Saporta
  • That seems to have helped a lot. Some of the new data frames that are made from the mergers though have up to 160000 items in them when the data frames from before max out at around 1000 items. Still though thanks so far – user1871524 Dec 03 '12 at 04:54
  • So I used the by.x and by.y (which weren't working before so I figured I'd abandon them entirely) and now everything seems to be working well. Thanks! – user1871524 Dec 03 '12 at 04:58
  • that's probably an issue with the specific call to merge, and not the for loop. I would recommend tracking down a few offenders, and investigating those specific tables. Then try manually `merge(...)` of just those tables (no for loop). – Ricardo Saporta Dec 03 '12 at 04:59
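
The jump in row counts mentioned in these comments is consistent with how merge behaves when the two data frames share no column names: it then returns every pairing of rows (a Cartesian product), so two tables of a few hundred rows each can produce a result with over a hundred thousand rows. The sketch below illustrates this on toy data. The values and the choice of sites1 and sites2 as key columns are assumptions for illustration; the thread does not say which columns were actually passed to by.x and by.y.

```r
# Toy tables with no column names in common (values are illustrative).
x <- data.frame(sites1 = c("1:", "2:", "3:"),
                AA1    = c("S", "L", "C"),
                SA     = c(0.36, 0.02, 0.01),
                stringsAsFactors = FALSE)
y <- data.frame(sites2 = c("1:", "1:", "2:"),
                AA2    = c("X", "S", "L"),
                Freq   = c(0.013, 0.987, 1.000),
                stringsAsFactors = FALSE)

# No shared column names, so merge() pairs every row of x with every row of y.
no_keys <- merge(x, y)
nrow(no_keys)  # 9, i.e. nrow(x) * nrow(y)

# Naming the key columns keeps only the rows whose keys match.
keyed <- merge(x, y, by.x = "sites1", by.y = "sites2")
nrow(keyed)    # 3: site "1:" matches two rows of y, site "2:" matches one
```

Comparing nrow of the result against nrow(x) * nrow(y) is a quick way to spot tables that were merged without usable keys.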