
I want to combine these two data frames into one. I want the MCN_IDs stacked vertically and a single "UNS 1930" column, not "UNS 1930.x" and "UNS 1930.y", which is what my code currently produces. Ultimately this will be done at a large scale, with multiple loops performing this merge. Please help!

library(dplyr)  # provides full_join

test1 <- data.frame("AABB", 1)
colnames(test1)[1] <- "MCN_ID"
colnames(test1)[2] <- "UNS 1930"

test2 <- data.frame("BBAA", 23)
colnames(test2)[1] <- "MCN_ID"
colnames(test2)[2] <- "UNS 1930"

# Joining on MCN_ID keeps both "UNS 1930" columns, suffixed .x and .y
test3 <- full_join(test1, test2, by = "MCN_ID")

Gives this result:

MCN_ID   UNS 1930.x   UNS 1930.y
AABB     1            NA  
BBAA     NA           23 

But I want this:

MCN_ID   UNS 1930
AABB     1  
BBAA     23 
r2evans
katiebrown
  • You need `rbind` or `bind_rows(test1, test2)` instead of a join. – akrun Jun 08 '21 at 19:29
  • Try using `?merge` and choosing by.x and by.y as the colnames you want –  Jun 08 '21 at 19:29
  • FYI, `colnames(test1) <- c("MCN_ID", "UNS 1930")`. Perhaps even `test1 <- data.frame(MCN_ID="AABB", "UNS 1930"=1, check.names=FALSE)`. – r2evans Jun 08 '21 at 19:34
  • Does this answer your question? [R - Concatenate two dataframes?](https://stackoverflow.com/questions/8169323/r-concatenate-two-dataframes) – camille Jun 09 '21 at 03:17

2 Answers


Simply use: test3 <- rbind(test1, test2)
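
Since the question mentions doing this inside loops, one way to apply the same idea at scale is to collect each piece in a list and bind once at the end, rather than growing a data frame on every iteration. This is only a minimal sketch: the `ids` and `values` vectors are made-up stand-ins for whatever the real loop produces.

```r
# Hypothetical inputs standing in for the real loop's data
ids    <- c("AABB", "BBAA", "CCDD")
values <- c(1, 23, 5)

# Build one small data frame per iteration and store it in a list
pieces <- vector("list", length(ids))
for (i in seq_along(ids)) {
  pieces[[i]] <- data.frame(MCN_ID = ids[i], "UNS 1930" = values[i],
                            check.names = FALSE)
}

# Stack all pieces vertically in a single call
stacked <- do.call(rbind, pieces)
```

Binding once at the end avoids repeatedly copying the accumulated result, which tends to matter as the number of iterations grows.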

Rory S

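You can use bind_rows from dplyr. Both it and rbind match columns by name, so column order is not a problem for either, but bind_rows also works when the data frames do not share exactly the same columns (missing values are filled with NA), which is a nice extra vs rbind: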

library(dplyr)
testout <- bind_rows(test1,test2)

> testout
  MCN_ID UNS 1930
1   AABB        1
2   BBAA       23
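
As a hedged illustration of how `bind_rows` copes with inputs that do not line up perfectly, the sketch below binds data frames whose columns differ. `test4` and its `Notes` column are invented for the example; `bind_rows` fills the missing cells with `NA`, whereas base `rbind` would stop with an error on these inputs.

```r
library(dplyr)

test1 <- data.frame(MCN_ID = "AABB", "UNS 1930" = 1, check.names = FALSE)
test2 <- data.frame(MCN_ID = "BBAA", "UNS 1930" = 23, check.names = FALSE)
# Hypothetical third table: columns in a different order plus an extra column
test4 <- data.frame("UNS 1930" = 7, MCN_ID = "CCDD", Notes = "extra",
                    check.names = FALSE)

# bind_rows also accepts a list, which is handy when the pieces come from a loop
combined <- bind_rows(list(test1, test2, test4))
```

The result has one row per input and three columns, with `Notes` missing for the rows that came from `test1` and `test2`.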
Neal Barsch
  • This is perfect, thank you! This was just a simple example of a more complicated process and this is exactly what I needed. Thanks so much. – katiebrown Jun 10 '21 at 11:41
  • Okay so I am running multiple for loops through multiple dataframes and it comes out like this: (type this into your R to see the output) `testdf <- data.frame(c("AABB", "AABB","AABB","AACC","AACC","AACC","AADD","AADD","AADD"),c(0,"NA","NA",1,"NA","NA",1,"NA","NA"),c("NA",0,"NA","NA",1,"NA","NA",2,"NA"),c("NA","NA",0,"NA","NA",1,"NA","NA",2))` Instead, I want it to be (type this in) `testdf <- data.frame(c("AABB","AACC","AADD"),c(0,1,1),c(0,1,2),c(0,1,2))` so that there are no duplicates and everything is found in the same row – katiebrown Jun 10 '21 at 13:32
  • Further update: I added the following line to my code `Result_df <- Result_df %>% group_by(MCN_ID) %>% summarise_each(funs(first(na.omit(.))))` which seemed to do the trick – katiebrown Jun 10 '21 at 13:46
  • Just as a note, the way you created your tables turned the columns into factors (the default before R 4.0). Add `stringsAsFactors=FALSE` to the `testdf <- data.frame()` call to keep them as strings and numerics. – Neal Barsch Jun 14 '21 at 19:49
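
The collapsing step from the comments can also be sketched with current dplyr syntax, since `summarise_each()` and `funs()` are deprecated in recent dplyr releases. This is one possible equivalent rather than the asker's actual code: it assumes the missing cells are real `NA` values (not the string "NA"), and the column names `v1`, `v2`, `v3` are placeholders.

```r
library(dplyr)

# Example data shaped like the comment's testdf, with real NA values
Result_df <- data.frame(
  MCN_ID = rep(c("AABB", "AACC", "AADD"), each = 3),
  v1 = c(0, NA, NA, 1, NA, NA, 1, NA, NA),
  v2 = c(NA, 0, NA, NA, 1, NA, NA, 2, NA),
  v3 = c(NA, NA, 0, NA, NA, 1, NA, NA, 2)
)

# For each MCN_ID, keep the first non-missing value in every other column
collapsed <- Result_df %>%
  group_by(MCN_ID) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]))
```

Here `across(everything(), ...)` applies the function to every non-grouping column, and the base subsetting `.x[!is.na(.x)][1]` plays the role of `first(na.omit(.))`, returning `NA` when a group has no non-missing value.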