I'm trying to duplicate a row within a dataset and also retain rows that do not need duplicating.

Here is sample data:

library(tidyverse)
df <- data.frame(id = c('2292','2293','2294'), var1 = c('a', 'b', 'c'),
                   freq = c(1, NA, NA))

Before:

    id var1 freq
1 2292    a    1
2 2293    b    0
3 2294    c    0    

After:

      id var1 freq
1   2292    a    1
2 2292.1    a    1
3   2293    b    0
4   2294    c    0

I have looked at the following questions:

Repeat each row of data.frame the number of times specified in a column

However, when following the examples there:

df %>% uncount(freq, .remove = FALSE)

I get:

    id var1 freq
1 2292    a    1
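
For reference, a minimal sketch of why the other rows disappear: `uncount()` repeats each row `weights` times in total, so a weight of 0 drops the row. This assumes a variant of the sample data in which `freq` holds 0 rather than NA, as in the "Before" table; the names `df0` and `out` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Assumption: freq holds 0 (not NA) for rows that need no duplicate
df0 <- data.frame(id = c('2292', '2293', '2294'),
                  var1 = c('a', 'b', 'c'),
                  freq = c(1, 0, 0),
                  stringsAsFactors = FALSE)

# A weight of freq + 1 keeps every original row and adds `freq` extra copies
out <- df0 %>% uncount(freq + 1)
```

Under that assumption, `out` has four rows, with id 2292 appearing twice. The duplicated id still lacks the `.1` suffix.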

It would be very helpful to select by id and then duplicate the selected ids, whilst retaining the rows that do not need to be duplicated.

I have also tried:

df %>% map_df(., rep, .$freq)

This comes close:

df %>% 
    filter(row_number() %in% c(1)) %>% 
    plyr::rbind.fill(df) %>%  # rbind.fill() comes from plyr, not the tidyverse
    arrange(id)

Result:

    id var1 freq
1 2292    a    1
2 2292    a    1
3 2293    b    0
4 2294    c    0

But ideally I'd like to select rows by id instead of row_number and then update the id of the duplicate so that it becomes 2292.1. I can delete the freq column afterwards. I am using tidyverse.

Ronak Shah
JL_sey

1 Answer

Why not filter the rows you want to duplicate first, and then bind them back onto the original data? A suggested approach:

# store the ids to duplicate in a vector (as character, to match the id column)

ids_v <- c('2292')

# now keep only these rows, suffix their id, and bind them back onto df

df %>% filter(id %in% ids_v) %>%
  mutate(id = paste0(id, '.1')) %>%
  rbind(df) %>% arrange(id)

      id var1 freq
1   2292    a    1
2 2292.1    a    1
3   2293    b   NA
4   2294    c   NA
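
If you would rather drive the duplication from the freq column, here is a minimal sketch using `uncount()` with its `.id` argument. It assumes that freq means "number of extra copies" and that NA means none; the helper columns `copies` and `copy` and the name `out` are hypothetical.

```r
library(dplyr)
library(tidyr)

df <- data.frame(id = c('2292', '2293', '2294'),
                 var1 = c('a', 'b', 'c'),
                 freq = c(1, NA, NA),
                 stringsAsFactors = FALSE)

out <- df %>%
  # total rows wanted per id: the original plus `freq` extras (NA treated as 0)
  mutate(copies = coalesce(freq, 0) + 1) %>%
  # .id adds a within-row counter (1, 2, ...) for each repeated row
  uncount(copies, .id = "copy") %>%
  # leave the first copy untouched, suffix later copies with .1, .2, ...
  mutate(id = if_else(copy > 1, paste0(id, ".", copy - 1), id)) %>%
  select(-copy)
```

This keeps the original row order, so no `arrange()` is needed, and it extends to ids that need more than one duplicate.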

I hope this serves the purpose.

AnilGoyal