I have two dataframes. I want to replace the ids in dataframe1 with generic ids. In dataframe2 I have mapped the ids from dataframe1 with the generic ids.

Do I have to merge the two dataframes and after it is merged do I delete the column I don't want?

Thanks.

Ajaff
  • Please take a look at [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and modify your question to include a small sample of your data (check `?dput()`). Posting images of your data, or no data at all, makes it difficult or impossible for us to help you! – massisenergy Mar 24 '20 at 18:24
  • Thanks. I just answered my own question. I made sure the ID column had the same name in dataframe1 and dataframe2, so that `merge(df1, df2, by="ID")` could match the rows, and merged the two dataframes. Once the merge was correct, I deleted the column I no longer needed. – Ajaff Mar 24 '20 at 18:30

3 Answers

We can use `merge` and then drop the original ids column.

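# Example data: original ids with a measurement, and the id-to-generic mapping
dataframe1 <- data.frame(ids = 1001:1010, variable = runif(n = 10, min = 100, max = 500))
dataframe2 <- data.frame(ids = 1001:1010, generics = 1:10)
# Merge on ids, then drop the first column (the original ids)
result <- merge(dataframe1, dataframe2, by = "ids")[, -1]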
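
If the key column is named differently in the two data frames, `merge` also accepts `by.x` and `by.y`. The following is only a sketch with made-up values and hypothetical column names (`ID` in one frame, `ids` in the other). It also drops the key by name, which is less fragile than relying on its position with `[, -1]`.

```r
# Hypothetical example data: the key is "ID" in df1 and "ids" in df2
df1 <- data.frame(ID = c(1003L, 1001L, 1002L), variable = c(30, 10, 20))
df2 <- data.frame(ids = 1001:1003, generics = 1:3)

# Join on differently named keys; the result keeps the key under the by.x name ("ID")
# and, by default, is sorted by that key
merged <- merge(df1, df2, by.x = "ID", by.y = "ids")

# Drop the original key by name instead of by position
result <- merged[, setdiff(names(merged), "ID")]
```

Here `result` contains only `variable` and `generics`, with rows ordered by the original key.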

Alternatively, we can use `match` and replace the ids by assignment.

dataframe1$ids <- dataframe2$generics[match(dataframe1$ids,dataframe2$ids)]
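
For each id in the first data frame, `match` returns the row position of that id in the second one, so the two data frames do not need to be in the same order. A small sketch with made-up values (the names `df1`, `df2` and `pos` are illustrative only), including an id that has no mapping and therefore becomes `NA`:

```r
# Hypothetical data: df1 is unordered and contains an id (1004) missing from df2
df1 <- data.frame(ids = c(1003L, 1001L, 1004L), variable = c(30, 10, 40))
df2 <- data.frame(ids = c(1001L, 1002L, 1003L), generics = 1:3)

# Position of each df1 id within df2$ids; unmatched ids give NA
pos <- match(df1$ids, df2$ids)   # 3 1 NA

# Look up the generic id at those positions and overwrite the original ids
df1$ids <- df2$generics[pos]     # 3 1 NA
```

Checking `anyNA(df1$ids)` afterwards is a cheap way to spot ids that were missing from the mapping.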
Ian Campbell
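With dplyr, a left join adds the generic ids to df1 as a new column:

library(dplyr)
# df1 holds the original ids, df2 maps ids to generic ids
left_join(df1, df2, by = 'ids')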
akrun
Subsetting data frames isn't very difficult in R. You didn't provide much code, so I hope this example helps:

# Create 4 random columns (vectors) of data and combine them into two data frames:
a <- rnorm(n = 100, mean = 0, sd = 1)
b <- rnorm(n = 100, mean = 0, sd = 1)
c <- rnorm(n = 100, mean = 0, sd = 1)
d <- rnorm(n = 100, mean = 0, sd = 1)

df_ab <- as.data.frame(cbind(a, b))
df_cd <- as.data.frame(cbind(c, d))

# If you want column d in df_cd to equal column a in df_ab, simply use the assignment operator
# (values are copied by row position, so both data frames must have the same row order)
df_cd$d <- df_ab$a
# You can also subset with square brackets:
df_cd[, "d"] <- df_ab[, "a"]
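
Square-bracket subsetting can also do the id replacement itself when the mapping is stored as a named vector and indexed by name. This is only a sketch with hypothetical data and names (`df1`, `df2`, `lookup`), not taken from the question:

```r
# Hypothetical data: df1 has repeated, unordered ids; df2 maps each id to a generic id
df1 <- data.frame(ids = c(1002L, 1001L, 1002L), variable = c(20, 10, 25))
df2 <- data.frame(ids = c(1001L, 1002L), generics = c(1L, 2L))

# Named vector: names are the original ids (as strings), values are the generic ids
lookup <- setNames(df2$generics, df2$ids)

# Index by name; unname() drops the names carried over from the lookup
df1$ids <- unname(lookup[as.character(df1$ids)])
```

Unlike plain column assignment, this matches on the id values, so row order and repeated ids are handled.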
Peter