Simplest way to get rbind to ignore column names

Question

This just came up in an answer to another question here. When you rbind two data frames, the columns are matched by name rather than by index, which can lead to unexpected behavior:

> df<-data.frame(x=1:2,y=3:4)
> df
  x y
1 1 3
2 2 4
> rbind(df,df[,2:1])
  x y
1 1 3
2 2 4
3 1 3
4 2 4

Of course, there are workarounds. For example:

rbind(df,rename(df[,2:1],names(df)))
data.frame(rbind(as.matrix(df),as.matrix(df[,2:1])))

On edit: rename from the plyr package doesn't actually work this way (although I thought I had it working when I originally wrote this...). The way to do this by renaming is to use SimonO101's solution:

rbind(df,setNames(df[,2:1],names(df)))

Also, maybe surprisingly,

data.frame(rbindlist(list(df,df[,2:1])))

works by index (and if we don't mind a data table, then it's pretty concise), so this is a difference between rbindlist and do.call(rbind).

The question is, what is the most concise way to rbind two data frames where the names don't match? I know this seems trivial, but this kind of thing can end up cluttering code. And I don't want to have to write a new function called rbindByIndex. Ideally it would be something like rbind(df,df[,2:1],byIndex=T).

Interesting. Though personally I'd rather that `rbind` on dataframes take the care to match named subsets (i.e. columns). When you don't want to "track" things by name, use `matrix` in the first place? — Carl Witthoft, Oct 10 '13 at 13:56
One (maybe the only) reason not to use `matrix` is because it doesn't allow mixing of types. — mrip, Oct 10 '13 at 14:13

score 59 · Accepted Answer · edited Aug 17 '16 at 01:08

You might find setNames handy here...

rbind(df, setNames(rev(df), names(df)))
#  x y
#1 1 3
#2 2 4
#3 3 1
#4 4 2

I suspect your real use-case is somewhat more complex. You can of course reorder the columns in the first argument of setNames however you wish; just pass names(df) as the second argument, so that the reordered columns take on the original names.
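As a rough sketch of the same idea with mismatched names and mixed column types (the `other` data frame and its column names are made up for illustration):

```r
# Two data frames with the same column order and types but different names.
df    <- data.frame(x = 1:2, y = 3:4, z = c("a", "b"), stringsAsFactors = FALSE)
other <- data.frame(b = 5:6, a = 7:8, c = c("c", "d"), stringsAsFactors = FALSE)

# rbind(df, other) would stop with a "names do not match" error.
# Overwriting the names makes rbind line the columns up by position.
out <- rbind(df, setNames(other, names(df)))
```

Because this never goes through a matrix, each column keeps its own type.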

answered Oct 10 '13 at 14:02 by Simon O'Hanlon · edited Aug 17 '16 at 01:08 by metasequoia

Thanks, I had been using `rename` from the `plyr` package, and now when I try rerunning my code from the OP, it doesn't work. – mrip Oct 10 '13 at 14:06
@mrip well I hope this is useful for your real use-case! – Simon O'Hanlon Oct 10 '13 at 14:07
@SimonO101 I like the simplicity of the code inside `setNames`. – Roland Oct 10 '13 at 14:10
This doesn't ignore it, does it? It just sets the names to be the same. – ifly6 Mar 08 '18 at 19:35

Thomas · Answer 2 · 2013-10-10T13:58:12.233

9

This seems pretty easy:

mapply(c,df,df[,2:1])
     x y
[1,] 1 3
[2,] 2 4
[3,] 3 1
[4,] 4 2

For this simple case, though, you have to turn it back into a dataframe (because mapply simplifies it to a matrix):

as.data.frame(mapply(c,df,df[,2:1]))
  x y
1 1 3
2 2 4
3 3 1
4 4 2

Important note 1: There appears to be a downside of type coercion when your dataframe contains vectors of different types:

df<-data.frame(x=1:2,y=3:4,z=c('a','b'))
mapply(c,df,df[,c(2:1,3)])
     x y z
[1,] 1 3 1
[2,] 2 4 2
[3,] 3 1 1
[4,] 4 2 2

Important note 2: It also is terrible if you have factors.

df<-data.frame(x=factor(1:2),y=factor(3:4))
mapply(c,df[,1:2],df[,2:1])
     x y
[1,] 1 1
[2,] 2 2
[3,] 1 1
[4,] 2 2

So, as long as you have all numeric data, it's okay.
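One possible way around the coercion, sketched under the assumption that there are no factor columns (hence the explicit `stringsAsFactors = FALSE`), is to use `Map`, which returns a list instead of simplifying to a matrix:

```r
df <- data.frame(x = 1:2, y = 3:4, z = c("a", "b"), stringsAsFactors = FALSE)

# Map() is mapply() with SIMPLIFY = FALSE: it concatenates the columns
# pairwise by position and returns a named list, one element per column.
cols <- Map(c, df, df[, c(2, 1, 3)])

# Rebuild a data frame; the names come from the first argument (df).
out <- data.frame(cols, stringsAsFactors = FALSE)
```

Here `out$z` stays character rather than being turned into numbers.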

answered Oct 10 '13 at 13:52 by Thomas · edited Oct 10 '13 at 13:58

But you'll get into trouble if you have different data types in the df. – Roland Oct 10 '13 at 13:54
@Roland Yup, I just edited to that effect. Any way around that? – Thomas Oct 10 '13 at 13:59
Yes, write an `rbindByIndex` function, which the OP explicitly doesn't want to do ... – Roland Oct 10 '13 at 14:02
If you don't have factors, you can get this to work with mixed types by `data.frame(mapply(c,df,df[,c(2,1,3)],SIMPLIFY=F))`, but then it's not as nice and concise. – mrip Oct 10 '13 at 17:18
Or use `Map` which is `mapply(...,SIMPLIFY=FALSE)` - `data.frame(Map(c,df,df[,2:1]))` – thelatemail Oct 11 '18 at 02:41
