
I have two data frames with exactly the same column names, but the data in those columns differs between them. I am trying to combine the two frames with a full join (see below). The hard part is that I need to rename the columns so that those from the first dataset get one suffix and those from the second dataset get a different suffix.

combined_df <- full_join(any.drinking, binge.drinking, by = ?)

A look at one of my df's:

FrankRoss
  • Let me know if any more info is needed, really lost here... – FrankRoss May 06 '21 at 06:13
  • Hi :) have you tried the `suffix` argument of the `dplyr::full_join()` function? There is also something similar with `merge()` – Paul May 06 '21 at 06:15
  • @Paul Hey there, I tried `combined_df <- full_join(any.drinking, binge.drinking, suffix = c("_any", "_binge"))` and saw little change. It just stacks the two df's on top of each other with no added suffix. – FrankRoss May 06 '21 at 06:20
  • oh ok, maybe `full_join` or `merge()` are not the way to go. Maybe `dplyr::bind_cols()` or `cbind()` will do the trick – Paul May 06 '21 at 06:24
  • @Paul I used cbind() and it produced the correct output (two dataframes side by side). However, the issue is that I have to add the suffix to each column so they can be differentiated between frames, but I'm not sure how to do that at a mass level. – FrankRoss May 06 '21 at 06:28
  • I am not sure what you expect to get in the end. I suppose you have an ID column that is used to merge your data. Could you please provide a [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) example and the expected output? It could help a lot :) – Paul May 06 '21 at 06:34
  • Is [this](https://stackoverflow.com/a/30613714/10264278) close to what you want? – Paul May 06 '21 at 06:45
  • @Paul Not necessarily. If I use cbind I can't add the "_any" and "_binge" suffixes to the duplicate columns in the two dataframes that I combined. Unless the rename() function is able to rename several columns at once? – FrankRoss May 06 '21 at 07:01
  • nothing to do with IDE so Rstudio tag removed – AnilGoyal May 06 '21 at 07:13
  • Please see an idea posted as a possible solution. – Paul May 06 '21 at 07:39

1 Answer


A shorter version without a custom function:

df <- cbind(cars, cars)
colnames(df) <- c(paste0(colnames(cars), "_any"), paste0(colnames(cars), "_binge"))

Output:

> head(df)
  speed_any dist_any speed_binge dist_binge
1         4        2           4          2
2         4       10           4         10
3         7        4           7          4
4         7       22           7         22
5         8       16           8         16
6         9       10           9         10
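
If the two frames share a key column that should keep its name (the comments suggest there may be an ID column, but the question does not show one), the same `paste0()` idea can be limited to the non-key columns. A minimal sketch in base R; the helper `add_suffix()`, the `state` column, and the toy data are all hypothetical stand-ins, not taken from the question:

```r
# Hypothetical helper: append `suffix` to every column name except those in `keep`.
add_suffix <- function(df, suffix, keep = character()) {
  rename_me <- !(colnames(df) %in% keep)
  colnames(df)[rename_me] <- paste0(colnames(df)[rename_me], suffix)
  df
}

# Toy stand-in for one of the data frames (made-up values).
any_toy <- data.frame(state = c("AL", "AK"), females = c(1, 2), males = c(3, 4))

# Only the non-key columns receive the suffix; `state` keeps its name.
renamed <- add_suffix(any_toy, "_any", keep = "state")
colnames(renamed)
```

With dplyr 1.0.0 or later, `rename_with()` should do the same job in one call, for example `rename_with(any_toy, ~ paste0(.x, "_any"), -state)`.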

Certainly not the most elegant way, but maybe it is what you want:

custom_bind <- function(df1, suffix1, df2, suffix2){
  
  colnames(df1) <- paste(colnames(df1), suffix1, sep = "_")
  colnames(df2) <- paste(colnames(df2), suffix2, sep = "_")
  
  df <- cbind(df1, df2)
  return(df)
}

custom_bind(cars, "any", cars, "binge")

I wrote it as a function in case you want to do the same with other tables. If not, the function is not necessary.

Output:

> head(custom_bind(cars, "any", cars, "binge"))
  speed_any dist_any speed_binge dist_binge
1         4        2           4          2
2         4       10           4         10
3         7        4           7          4
4         7       22           7         22
5         8       16           8         16
6         9       10           9         10
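
On why `suffix` appeared to do nothing in the comments: when `by` is not given, `full_join()` (like `merge()`) joins on every column name the two frames share. Here that is all of them, so no non-key column is duplicated and the suffix is never applied. If the frames do share an identifier column, naming it explicitly lets the join add the suffixes itself. A sketch with base `merge()`, assuming a hypothetical key column `state` and made-up values:

```r
# Toy stand-ins for any.drinking and binge.drinking (hypothetical columns and values).
any_toy   <- data.frame(state = c("AL", "AK"), rate = c(50.1, 60.2))
binge_toy <- data.frame(state = c("AK", "AZ"), rate = c(20.3, 18.4))

# all = TRUE keeps unmatched rows from both sides (a full outer join);
# suffixes are applied only to non-key columns that appear in both frames.
combined <- merge(any_toy, binge_toy, by = "state", all = TRUE,
                  suffixes = c("_any", "_binge"))
colnames(combined)
```

The dplyr equivalent would be `full_join(any_toy, binge_toy, by = "state", suffix = c("_any", "_binge"))`.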
Paul