
I am trying to add rows to a dataframe based on the value in a particular column. The added rows will be duplicates of rows already in the dataframe, and the number of copies of each row is given by that row's "cat_count" value. Here's a small-scale example:

factor1 = c("1", "2", "3", "4")
group = c("Group_A", "Group_B", "Group_C", "Group_D")
cat_count = c(3,5,1,8)

df <- data.frame(factor1, group, cat_count)

df
  factor1   group cat_count
1       1 Group_A         3
2       2 Group_B         5
3       3 Group_C         1
4       4 Group_D         8

The output should look like this:

df
   factor1   group cat_count
1        1 Group_A         3
2        1 Group_A         3
3        1 Group_A         3
4        2 Group_B         5
5        2 Group_B         5
6        2 Group_B         5
7        2 Group_B         5
8        2 Group_B         5
9        3 Group_C         1
10       4 Group_D         8
11       4 Group_D         8
12       4 Group_D         8
13       4 Group_D         8
14       4 Group_D         8
15       4 Group_D         8
16       4 Group_D         8
17       4 Group_D         8

I've looked at this answer, R: how to add rows based on the value in a column. It's not quite what I need, but I think it's heading in the right direction.

Does anyone have suggestions? Thanks.

CJF

1 Answer


We may use uncount from tidyr

library(tidyr)
library(dplyr)
df %>%
    uncount(cat_count, .remove = FALSE)

-output

   factor1   group cat_count
1        1 Group_A         3
2        1 Group_A         3
3        1 Group_A         3
4        2 Group_B         5
5        2 Group_B         5
6        2 Group_B         5
7        2 Group_B         5
8        2 Group_B         5
9        3 Group_C         1
10       4 Group_D         8
11       4 Group_D         8
12       4 Group_D         8
13       4 Group_D         8
14       4 Group_D         8
15       4 Group_D         8
16       4 Group_D         8
17       4 Group_D         8
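
For anyone who would rather not add a package dependency, the same expansion can be sketched in base R by repeating each row index cat_count times. This is only an illustrative alternative to uncount, not something the tidyr approach requires; the name `expanded` is made up for the example.

```r
# Rebuild the example data from the question
df <- data.frame(
  factor1   = c("1", "2", "3", "4"),
  group     = c("Group_A", "Group_B", "Group_C", "Group_D"),
  cat_count = c(3, 5, 1, 8)
)

# Repeat each row index cat_count times, then select those rows
expanded <- df[rep(seq_len(nrow(df)), times = df$cat_count), ]

# Reset the row names so they run 1..n, as in the uncount() output
rownames(expanded) <- NULL

nrow(expanded)  # 17, i.e. 3 + 5 + 1 + 8
```

Because the repetition happens on row indices, every column is carried along unchanged, including cat_count itself.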

If cat_count contains 0 values, uncount drops those rows entirely. If the intention is to keep such a row, change the 0 to 1 first:

df %>%
   mutate(cat_count2 = replace(cat_count, cat_count == 0, 1)) %>%   
   uncount(cat_count2)
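
To see why the zero case needs attention, here is a small base R sketch of the same idea: a count of 0 repeats a row zero times, so the row vanishes unless the count is raised to 1. The data frame `df0` and its zero count are hypothetical, made up only for this illustration, and `pmax()` stands in for the `replace()` call.

```r
# Hypothetical data with a zero count (not from the question)
df0 <- data.frame(
  group     = c("Group_A", "Group_B", "Group_C"),
  cat_count = c(2, 0, 1)
)

# Without adjustment, the zero-count row (Group_B) is dropped
dropped <- df0[rep(seq_len(nrow(df0)), times = df0$cat_count), ]

# Treat 0 as 1 so every original row appears at least once
kept <- df0[rep(seq_len(nrow(df0)), times = pmax(df0$cat_count, 1)), ]

nrow(dropped)  # 3 rows, Group_B missing
nrow(kept)     # 4 rows, Group_B kept once
```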
akrun