Adding/duplicating rows to a dataframe based on the value in a column

Question

I am trying to add rows to a dataframe based on the value in a particular column. The added rows will be duplicates of rows already in the dataframe, with the number of copies determined by the value of "cat_count" for that row (a value of 3 means the row appears three times in total). Here's a small-scale example:

factor1 = c("1", "2", "3", "4")
group = c("Group_A", "Group_B", "Group_C", "Group_D")
cat_count = c(3,5,1,8)

df <- data.frame(factor1, group, cat_count)

view(df)
  factor1   group cat_count
1       1 Group_A         3
2       2 Group_B         5
3       3 Group_C         1
4       4 Group_D         8

The output should look like this:

view(df)
   factor1   group cat_count
1        1 Group_A         3
2        1 Group_A         3
3        1 Group_A         3
4        2 Group_B         5
5        2 Group_B         5
6        2 Group_B         5
7        2 Group_B         5
8        2 Group_B         5
9        3 Group_C         1
10       4 Group_D         8
11       4 Group_D         8
12       4 Group_D         8
13       4 Group_D         8
14       4 Group_D         8
15       4 Group_D         8
16       4 Group_D         8
17       4 Group_D         8

I've looked at this answer, R: how to add rows based on the value in a column. It's not quite what I need, but I think it's heading in the right direction.

Does anyone have suggestions? Thanks.

akrun · Accepted Answer · 2021-08-19T21:50:38.320


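We may use uncount() from tidyr, which repeats each row according to the value in a weights column. Setting .remove = FALSE keeps the cat_count column in the result.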

library(tidyr)
library(dplyr)
df %>%
    uncount(cat_count, .remove = FALSE)

-output

   factor1   group cat_count
1        1 Group_A         3
2        1 Group_A         3
3        1 Group_A         3
4        2 Group_B         5
5        2 Group_B         5
6        2 Group_B         5
7        2 Group_B         5
8        2 Group_B         5
9        3 Group_C         1
10       4 Group_D         8
11       4 Group_D         8
12       4 Group_D         8
13       4 Group_D         8
14       4 Group_D         8
15       4 Group_D         8
16       4 Group_D         8
17       4 Group_D         8
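
For comparison, here is a minimal base R sketch of the same expansion that does not need tidyr. It is not part of the answer above; it assumes cat_count holds non-negative whole numbers and repeats each row index with rep() before subsetting. The names idx and expanded are only illustrative.

```r
# Rebuild the example data from the question
df <- data.frame(
  factor1   = c("1", "2", "3", "4"),
  group     = c("Group_A", "Group_B", "Group_C", "Group_D"),
  cat_count = c(3, 5, 1, 8)
)

# Repeat each row index cat_count times, then subset by those indices
idx <- rep(seq_len(nrow(df)), times = df$cat_count)
expanded <- df[idx, ]

# Reset the row names, which otherwise come out as 1, 1.1, 1.2, ...
rownames(expanded) <- NULL

nrow(expanded)
```

As with uncount(), a row whose cat_count is 0 contributes no indices here and so disappears from the result.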

uncount() drops rows whose count is 0. If the intention is to keep such a row once, we can copy the counts to a new column, replace the zeros with 1, and expand on that column instead

df %>%
   mutate(cat_count2 = replace(cat_count, cat_count == 0, 1)) %>%   
   uncount(cat_count2)
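
To make the difference concrete, here is a small sketch comparing the two calls on data that contains a zero. The data frame df0 and the names dropped and kept are hypothetical and are not taken from the question.

```r
library(tidyr)
library(dplyr)

# Hypothetical data with one zero count (not from the question)
df0 <- data.frame(
  group     = c("Group_A", "Group_B", "Group_C"),
  cat_count = c(2, 0, 1)
)

# Plain uncount(): the Group_B row is dropped because its weight is 0
dropped <- df0 %>%
  uncount(cat_count, .remove = FALSE)

# Copy the counts, bump zeros to 1, and expand on the copy.
# uncount() removes the helper column by default, so cat_count
# still shows the original 0 for Group_B.
kept <- df0 %>%
  mutate(cat_count2 = replace(cat_count, cat_count == 0, 1)) %>%
  uncount(cat_count2)

nrow(dropped)
nrow(kept)
```

Here dropped should contain only the Group_A and Group_C rows, while kept also retains a single Group_B row with its original cat_count of 0.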

edited Aug 19 '21 at 21:50

answered Aug 19 '21 at 21:41


Perfect. Can this work in cases where the value in "cat_count" is 0? If not, can you suggest a fix? – CJF Aug 19 '21 at 21:48
@CJF For a 0 value, what do you expect as output? – akrun Aug 19 '21 at 21:48
@CJF Try the updated solution – akrun Aug 19 '21 at 21:52

Thanks. I can simply remove the rows with 0 as necessary; otherwise, your solution above is spot on. – CJF Aug 19 '21 at 22:36
