I want to combine these two column vectors so that every row of df1 gets the smallest df2 value found for its ID. That is probably not very clear, so hopefully the example below shows what I mean.

df1    df2            
ID     Age         
1      10
1      9
1      50 
2      24
3      30
3      2

I want to use these to create a new data frame that looks like this:

new_df            
ID     (Youngest) Age          
1      9
1      9
1      9
2      24
3      2
3      2

Put more clearly: I want to pick out the smallest Age for each unique ID in df1 and apply it to every row with that ID. I am stuck on how to do this with my limited R knowledge, and merge() is not doing much for me.

Werner Hertzog
lukey

3 Answers

library(tidyverse)

df1 <- data.frame(ID = c(1, 1, 1, 2, 3, 3))
df2 <- data.frame(Age = c(10, 9, 50, 24, 30, 2))

df1 %>%
  cbind(., df2) %>%             # put ID and Age side by side
  group_by(ID) %>%              # one group per ID
  mutate(Age_new = min(Age))    # repeat each group's minimum on every row

which gives:

# A tibble: 6 x 3
# Groups:   ID [3]
     ID   Age Age_new
  <dbl> <dbl>   <dbl>
1     1    10       9
2     1     9       9
3     1    50       9
4     2    24      24
5     3    30       2
6     3     2       2
deschen

An option with base R

df3 <- cbind(df1, df2)
df3$Age_new <- with(df3, ave(Age, ID, FUN = min))

data

df1 <- data.frame(ID = c(1, 1, 1, 2, 3, 3))
df2 <- data.frame(Age = c(10, 9, 50, 24, 30, 2))
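
To illustrate why ave() fits here, the following is a minimal sketch using the same values as plain vectors. The names ids, ages, youngest and the column name Youngest_Age are made up for this example and do not come from the question. ave() applies FUN within each group and returns a vector as long as its input, so no separate merge step is needed.

```r
# Same values as in the question, held as plain vectors (hypothetical names).
ids  <- c(1, 1, 1, 2, 3, 3)
ages <- c(10, 9, 50, 24, 30, 2)

# ave() computes FUN per group of ids and writes the result back to
# every position belonging to that group, keeping the original length.
youngest <- ave(ages, ids, FUN = min)

# Assemble the layout requested in the question.
new_df <- data.frame(ID = ids, Youngest_Age = youngest)
print(new_df)
```

Because the result has one value per input row, it can be assigned straight to a new column, which is what the df3$Age_new line above does.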
akrun

It doesn't look like you actually have two data frames, but rather two vectors, or two columns of one data frame.
One way to do what you appear to be after is to use the summarize() function from the dplyr package.

library(dplyr)

df <- cbind(df1, df2)  ## combine your two vectors into one data frame

new_df <- df %>%  ## pipe the combined data frame in as the first argument of group_by()
  group_by(ID) %>%
  summarize("(Youngest) Age" = min(Age))

new_df

Brian Fisher
    Technically, this returns only a subset of the data, i.e. one row per ID holding its minimum age. However, the OP wants the minimum age added to every row in the data. – deschen Nov 30 '20 at 22:13
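
One possible way to get from a per-ID summary back to all rows, sketched here in base R since the question mentions merge(): compute the minimum per ID with aggregate(), then join it back onto the full data. The column name Youngest_Age and the object name youngest are assumptions made for this example. Note that merge() sorts the result by ID by default, so the row order may differ from the input if the IDs are not already sorted.

```r
df1 <- data.frame(ID = c(1, 1, 1, 2, 3, 3))
df2 <- data.frame(Age = c(10, 9, 50, 24, 30, 2))
df  <- cbind(df1, df2)

# One row per ID holding the minimum age (the same shape summarize() returns).
youngest <- aggregate(Age ~ ID, data = df, FUN = min)
names(youngest)[names(youngest) == "Age"] <- "Youngest_Age"  # hypothetical name

# Join the per-ID minimum back onto every original row.
new_df <- merge(df, youngest, by = "ID")
print(new_df)
```

This keeps all six rows, each carrying the minimum age for its ID, which is the result the comment above says the summarize() approach alone does not give.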