How to add a column with values corresponding to another column?

Question

I will summarize how I got the dataframes I work with:

     name abundance 
1    joe  1
2    tim  1
3    bob  1
4    joe  1 
5    bob  1

First I created a new dataframe by aggregating by name and calculating the relative frequency of each name:

     name  abundance  relative_ab
1    joe   2          0.4
2    tim   1          0.2
3    bob   2          0.4

But I want to add a relative_ab column to the first dataframe, so that rows with the same name repeat the same value, like so (the actual data set has other columns that I would lose by aggregating):

     name abundance relative_ab
1    joe  1         0.4
2    tim  1         0.2
3    bob  1         0.4
4    joe  1         0.4
5    bob  1         0.4

I think I could brute force this but I am relatively new to R and wondering what slick ways you guys might come up with.

Thanks!

`merge(df1, df2, by="name")` – HubertL Jun 07 '17 at 00:50
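To expand on the comment above, here is a minimal base R sketch of the merge approach. The names df1 and df2 come from the comment; the data is rebuilt from the question, and the use of aggregate to build df2 is an assumption about how the asker did it. Note that merge sorts the result by the key column by default, so the row order differs from the original.

```r
# Example data rebuilt from the question's first table.
df1 <- data.frame(name = c("joe", "tim", "bob", "joe", "bob"),
                  abundance = c(1L, 1L, 1L, 1L, 1L))

# Assumed aggregation step: total abundance per name, then each name's share.
df2 <- aggregate(abundance ~ name, data = df1, FUN = sum)
df2$relative_ab <- df2$abundance / sum(df2$abundance)

# Keep only the key and the new column from df2; otherwise the result
# would carry both abundance.x and abundance.y.
merged <- merge(df1, df2[, c("name", "relative_ab")], by = "name")
merged
```

If the original row order matters, the match approach in one of the answers below keeps it.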

score 2 · Answer 1 · answered Jun 07 '17 at 00:55

If you can use dplyr:

library(dplyr)
df %>% 
  mutate(s=sum(abundance)) %>%
  group_by(name) %>%
  mutate(relative_ab=sum(abundance)/s, s=NULL)

    name abundance relative_ab
  <fctr>     <int>       <dbl>
1    joe         1         0.4
2    tim         1         0.2
3    bob         1         0.4
4    joe         1         0.4
5    bob         1         0.4

score 0 · Answer 2 · answered Jun 07 '17 at 00:52

You can do this with match. Assuming your first data.frame is df1 and the second one is df2, you can use:

df1$relative_ab = df2$relative_ab[match(df1$name, df2$name)]
df1
  name abundance relative_ab
1  joe         1         0.4
2  tim         1         0.2
3  bob         1         0.4
4  joe         1         0.4
5  bob         1         0.4

match returns, for each name in df1, the index of the first row in df2 with that name. Those indices are then used to pick the relative_ab values:

match(df1$name, df2$name)
[1] 1 2 3 1 3
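One thing to keep in mind: match returns NA for a name that has no counterpart in the lookup table, so that row gets NA in the new column. A small sketch, where the extra name "ann" and the hand-typed df2 are hypothetical and only there to show the behaviour:

```r
# "ann" is a made-up name that does not appear in df2.
df1 <- data.frame(name = c("joe", "tim", "bob", "joe", "bob", "ann"),
                  abundance = 1L)
# Hand-typed lookup table for illustration only.
df2 <- data.frame(name = c("joe", "tim", "bob"),
                  relative_ab = c(0.4, 0.2, 0.4))

idx <- match(df1$name, df2$name)         # NA where no match is found
df1$relative_ab <- df2$relative_ab[idx]  # indexing with NA yields NA
df1
```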

score 0 · Answer 3 · answered Jun 07 '17 at 01:52

We can do this in base R with ave, grouping by name and dividing each group's abundance sum by the total abundance.

df$relative_ab <- with(df, ave(abundance, name,
                               FUN = function(x) sum(x) / sum(abundance)))
df
#  name abundance relative_ab
#1  joe         1         0.4
#2  tim         1         0.2
#3  bob         1         0.4
#4  joe         1         0.4
#5  bob         1         0.4
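In case it helps to see what ave does on its own, here is a small sketch on plain vectors (the vector names are just for illustration). ave returns a vector of the same length as its input, with every element replaced by the result of FUN for its group:

```r
abundance <- c(1, 1, 1, 1, 1)
name <- c("joe", "tim", "bob", "joe", "bob")

# Per-row group total: each position holds the sum for that row's name.
group_total <- ave(abundance, name, FUN = sum)

# Divide by the overall total to get the relative abundance per row.
relative_ab <- group_total / sum(abundance)
```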

score 0 · Answer 4 · answered Jun 07 '17 at 04:00

We can do this with data.table, adding the column by reference and grouping by name:

library(data.table)
setDT(df)[, relative_ab := sum(abundance) / sum(df$abundance), by = name]
df
#   name abundance relative_ab
#1:  joe         1         0.4
#2:  tim         1         0.2
#3:  bob         1         0.4
#4:  joe         1         0.4
#5:  bob         1         0.4
