0

I am trying to join dataframe bar to foo based on a common column and then keep the original column order in foo:

> head(foo)
             city  course1
1         Aalborg    JEMES
2          Aarhus  EM-SANF
3 Aix-en-Provence     EMLE
4          Almaty  IMRCEES
5          Alnarp SUFONAMA
6       Amsterdam   ATOSIM

> colnames(foo)
[1] "city"    "course1"


> head(bar)
      code                                     website
1  4CITIES                      http://www.4cities.eu/
2     ACES          http://www.sams.ac.uk/aces-erasmus
3 ADVANCES          http://www.socialworkadvances.org/
4    AMASE                http://www.amase-master.net/
5  ARCHMAT http://www.erasmusmundus-archmat.uevora.pt/
6      ASC     

             http://www.master-asc.org/
> colnames(bar)
[1] "code"    "website"

The join column is course in foo and code in bar. I have used the following formula:

test <- merge(x = foo, y = bar, by.x = "course1", by.y = "code", all.x=TRUE)[, union(names(foo), names(bar))]

This fails and produces the following error message:

Error in `[.data.frame`(merge(x = foo, y = bar, by.x = "course1", by.y = "code",  : 
  undefined columns selected

I have found this solution here but it is not working, even though none of the column names are duplicated. What could be the problem?

A simple join works (without reordering) but brings the join column to the front:

> head(test)
  course1       city                   website
1   JEMES    Aalborg http://www.jemes-cisu.eu/
2   JEMES     Aveiro http://www.jemes-cisu.eu/
3   JEMES    Hamburg http://www.jemes-cisu.eu/
4 EM-SANF     Aarhus  http://www.emsanf.eu/UK/
5 EM-SANF Wageningen  http://www.emsanf.eu/UK/
6 EM-SANF   Debrecen  http://www.emsanf.eu/UK/

I have tried adding sort = F and removing all.x = TRUE, but this does not work. The problem is that my actual dataframes have many more columns and will be going through multiple joins so I would like to preserve column order in all in one function. Is there a known effective workaround or a package that preserves column order in joins?

> names(test)
[1] "course1" "city"    "website"
> names(foo)
[1] "city"    "course1"
> names(bar)
[1] "code"    "website"
Community
  • 1
  • 1
iskandarblue
  • 7,208
  • 15
  • 60
  • 130

1 Answers1

1

Your reindexing ([,union(names(foo), names(bar))]) is to blame: because names(bar) has "code" which is not present, you'll get an indexing error. Here is the corrected code:

allnames <- union(names(foo), recode(names(bar), code = "course1"))
merge(foo, bar, by.x = "course1", by.y = "code", all.x = TRUE)[,allnames]

Since your example isn't reproducible (the merge here is empty, nothing in common), I'll demonstrate with a modified structure:

foo <- structure(list(city = c("Aalborg", "Aarhus", "Aix-en-Provence", 
    "Almaty", "Alnarp", "Amsterdam"), course1 = c("JEMES", "EM-SANF", 
    "EMLE", "IMRCEES", "SUFONAMA", "ATOSIM")), .Names = c("city", 
    "course1"), class = "data.frame", row.names = c(NA, -6L))
bar <- structure(list(code = c("4CITIES", "ACES", "ADVANCES", "AMASE", 
    "ARCHMAT", "ASC"), website = c("http://www.4cities.eu/", "http://www.sams.ac.uk/aces-erasmus", 
    "http://www.socialworkadvances.org/", "http://www.amase-master.net/", 
    "http://www.erasmusmundus-archmat.uevora.pt/", "http://www.master-asc.org/"
    )), .Names = c("code", "website"), row.names = c(NA, 6L), class = "data.frame")

and now ensure that the values in bar$code are present in foo$course:

set.seed(42)
bar$code <- sample(foo$course1)

The result:

allnames <- union(names(foo), recode(names(bar), code = "course1"))
merge(foo, bar, by.x = "course1", by.y = "code", all.x = TRUE)[,allnames]
#              city  course1                                     website
# 1       Amsterdam   ATOSIM                      http://www.4cities.eu/
# 2          Aarhus  EM-SANF          http://www.socialworkadvances.org/
# 3 Aix-en-Provence     EMLE                http://www.amase-master.net/
# 4          Almaty  IMRCEES http://www.erasmusmundus-archmat.uevora.pt/
# 5         Aalborg    JEMES                  http://www.master-asc.org/
# 6          Alnarp SUFONAMA          http://www.sams.ac.uk/aces-erasmus
r2evans
  • 141,215
  • 6
  • 77
  • 149