
Hi everybody. I am working with a list of data frames in R that all have the same variable names. The structure of my list is shown below; I include only five elements to make it reproducible, but the real list can have more than 20 elements:

list

$a1 
ID      G
00001   A
00002   A
00003   B
00004   C
00005   D
00006   A

$a2 
ID      G
00001   A
00002   A
00003   B
00004   C
00005   D
00006   A
00007   A

$a3 
ID      G
00001   A
00002   A
00003   B
00004   C
00005   D
00006   A
00007   A
00008   B

$a4 
ID      G
00001   A
00002   A
00003   B
00004   C
00005   D
00006   A
00007   A
00008   B
00009   C

$a5 
ID      G
00001   A
00002   A
00003   B
00004   C
00005   D
00006   A
00007   A
00008   B
00009   C
00010   D

The elements are named a1, a2, a3, a4 and a5. My problem starts when I merge all the elements, because I can't tell the merged variables apart. For example, I merge the list with this code:

    Merged = Reduce(function(x, y) merge(x, y, all.x = T, by = 1), list)

and I get this for Merged:

ID     G.x  G.y G.x G.y G
00001   A   A   A   A   A
00002   A   A   A   A   A
00003   B   B   B   B   B
00004   C   C   C   C   C
00005   D   D   D   D   D
00006   A   A   A   A   A

And these warnings:

Warnings:
1: In merge.data.frame(x, y, all.x = T, by = 1) :
  column names ‘G.x’, ‘G.y’ are duplicated in the result
2: In merge.data.frame(x, y, all.x = T, by = 1) :
  column names ‘G.x’, ‘G.y’ are duplicated in the result

The merge is fine, but I can't tell the merged variables apart because they have the same names. I would like to distinguish them: for example, the first G.x is the group from a1, the first G.y is the group from a2, the second G.x is the group from a3, the second G.y is the group from a4, and G is the group from a5. I want to label each G by the element it comes from, and I would like a structure like this:

    ID     G.1  G.2 G.3 G.4 G.5
    00001   A   A   A   A   A
    00002   A   A   A   A   A
    00003   B   B   B   B   B
    00004   C   C   C   C   C
    00005   D   D   D   D   D
    00006   A   A   A   A   A

This way I can clearly tell which data frame each G comes from; at the least, I would like something that lets me make this distinction. Thanks for your help.

Duck
  • Please make a reproducible example. http://stackoverflow.com/questions/5963269 Probably `do.call(data.frame,yourlist)` or `do.call(cbind,yourlist)` will get you close to what you're after. – Frank Nov 09 '13 at 17:39
  • Dear @Frank, your solution gives me an error because the elements in `list` have different numbers of rows. Any suggestions? – Duck Nov 09 '13 at 17:45
  • Hm, well, if you really want to throw out the rows beyond the sixth, I guess your current approach is the simplest, followed by Simon's solution. There are other ways, of course. – Frank Nov 09 '13 at 17:51
  • Dear @Frank, Simon's solution works fine and is the simplest way to solve it. Thanks for your help; I was omitting something and got it wrong. One small additional question: does the merge function respect the order of the elements when they are merged, for example a1 merged with a2, then a3, etc.? If so, the names would be in the correct order. – Duck Nov 09 '13 at 20:19
  • Glad you've found a solution. Hm, I don't understand your additional question. You should probably post a separate one on SO if you don't figure it out. I was saying you should `dput` it because you had follow-up questions for Simon that he would've been able to address initially if you had `dput` the five-element list used in your example (with factors or whatever other features are relevant) instead of just pasting what it looks like in the console. – Frank Nov 09 '13 at 20:22
  • No problem, @Frank. My next question will include a `dput()` version of my data; that way more people can help me. – Duck Nov 10 '13 at 00:46
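As a follow-up to the `dput` discussion in the comments, here is a minimal sketch of one way to share a small, representative slice of a large list. The object `big` and its contents are hypothetical stand-ins for the real data, and keeping the first three rows is an arbitrary choice.

```r
# Hypothetical stand-in for a large list of data frames (not the asker's data)
big <- list(
  a1 = data.frame(ID = sprintf("%05d", 1:6),
                  G  = factor(c("A", "A", "B", "C", "D", "A"))),
  a2 = data.frame(ID = sprintf("%05d", 1:7),
                  G  = factor(c("A", "A", "B", "C", "D", "A", "A")))
)

# Keep only the first three rows of every element
small <- lapply(big, head, 3)

# Drop factor levels that no longer occur, so the dput() output stays short
small <- lapply(small, droplevels)

# Print a text representation that can be pasted into a question
dput(small)
```

Because the slice keeps the original column types (factors stay factors), anyone pasting the output back into R should see the same features that matter for the problem.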

1 Answer


setNames will be very handy for this...

setNames( Merged , c( "ID" , names( list ) ) )
#     ID a1 a2 a3 a4 a5
#1 00001  A  A  A  A  A
#2 00002  A  A  A  A  A
#3 00003  B  B  B  B  B
#4 00004  C  C  C  C  C
#5 00005  D  D  D  D  D
#6 00006  A  A  A  A  A
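
A possible alternative, offered only as a sketch: rename `G` in each data frame before merging, so that `merge()` never has to add `.x`/`.y` suffixes. The list `lst` below is a hypothetical reconstruction of the data shown in the question (named `lst` to avoid masking `base::list`), with `G` stored as a factor on purpose; the `G.<position>` naming follows the layout the question asks for.

```r
# Hypothetical reconstruction of the list from the question
ids <- sprintf("%05d", 1:10)
grp <- c("A", "A", "B", "C", "D", "A", "A", "B", "C", "D")
lst <- lapply(6:10, function(n) {
  data.frame(ID = ids[1:n], G = factor(grp[1:n]), stringsAsFactors = FALSE)
})
names(lst) <- paste0("a", 1:5)

# Rename G to G.<position in the list> in every element before merging
renamed <- Map(function(df, i) {
  names(df)[names(df) == "G"] <- paste0("G.", i)
  df
}, lst, seq_along(lst))

# Reduce() folds left to right: a1 with a2, then that result with a3, and so on,
# so the G columns end up in the same order as the list elements
Merged <- Reduce(function(x, y) merge(x, y, by = "ID", all.x = TRUE), renamed)

names(Merged)
```

Since every column already has a unique name when `merge()` sees it, no duplicated-name warnings should appear, and the columns remain factors.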
Simon O'Hanlon
  • Dear @SimonO101, your solution doesn't give me the same result that you show. – Duck Nov 09 '13 at 17:55
  • Dear @SimonO101, your solution only works for data frames whose variables are strings. I applied it to a data frame with factor variables and it doesn't work. Is there any solution for factors? – Duck Nov 09 '13 at 18:19
  • @Duck Really, what is preventing you from `dput`ting your object? – Frank Nov 09 '13 at 18:30
  • Dear @Frank, I am working with a very large list, 90000 rows for the 11 data frames in it, so I couldn't `dput` it. Maybe I could send it to you or add it to Dropbox. – Duck Nov 09 '13 at 18:35