2

I have a grouped df and I would like to add additional rows to the top of the groups that match with a variable (item_code) from the df. The additional rows do not have an id column. The additional rows should not be duplicated within the groups of df.

Example data:

df <- as.tibble(data.frame(id=rep(1:3,each=2),
                     item_code=c("A","A","B","B","B","Z"),
                     score=rep(1,6)))



additional_rows <- as.tibble(data.frame(item_code=c("A","Z"),
                                        score=c(6,6)))

What I tried

I found this post and tried to apply it: Add row in each group using dplyr and add_row()

df %>% group_by(id) %>% do(add_row(additional_rows %>%
                                     filter(item_code %in% .$item_code)))

What I get:

# A tibble: 9 x 3
# Groups:   id [3]
     id item_code score
  <int> <fct>     <dbl>
1     1 A             6
2     1 Z             6
3     1 NA           NA
4     2 A             6
5     2 Z             6
6     2 NA           NA
7     3 A             6
8     3 Z             6
9     3 NA           NA

What I am looking for:

# A tibble: 6 x 3
     id item_code score
  <int> <fct>     <dbl>
1     1 A             6
2     1 A             1
3     1 A             1
4     2 B             1
5     2 B             1
6     3 B             1
7     3 Z             6
8     3 Z             1 
captcoma
  • 1,768
  • 13
  • 29

1 Answers1

1

This should do the trick:

 library(plyr)

 df %>%
   join(subset(df, item_code %in% additional_rows$item_code, select = c(id, item_code)) %>%
        join(additional_rows) %>% 
        subset(!duplicated(.)), type = "full") %>%
   arrange(id, item_code, -score)

Not sure if its the best way, but it works

Edit: to get the score in the same order added the other arrange terms

Edit 2: alright, there should now be no duplicated rows added from the additional rows as per your comment

morgan121
  • 2,213
  • 1
  • 15
  • 33
  • it works for the sample data true. however, the real data `additional_rows` has more than only `item_code` `A` that matches the `df`. – captcoma Nov 27 '18 at 12:10
  • Can you add the full data so I can see if I can modify it? – morgan121 Nov 27 '18 at 12:13
  • Also do you always want to exclude Z from the output? That is a bit limiting, and I'm not sure what the purpose of this is – morgan121 Nov 27 '18 at 12:15
  • I modified the example data. Only rows with matching `item_code` should be added – captcoma Nov 27 '18 at 12:21
  • 1
    Edited to make sure all matching ones are kept. Does that work with the full data? – morgan121 Nov 27 '18 at 12:27
  • Almost. I guess my example was not good enough and I edited it a last time. join created duplicates of the `additional_rows`, I was actually looking for a solution that only takes the available rows from `additional_rows` and adds them to the groups of the `df`, without generating duplicate rows. – captcoma Nov 27 '18 at 13:10
  • 1
    OK, I have made one more adjustment. Can you check if it works now? – morgan121 Nov 27 '18 at 21:48