I know from this answer how to duplicate rows of a dataframe. That's fine if you want to repeat the rows n times.

I want to do something similar but to also add a prefix within a column of newly added rows. Note: I am not basing the repetition count on the number of rows. It needs to be provided by a parameter (call it k).

So, assuming I want to repeat the data frame k = 3 times, the operation should look like this:

Input:

data.frame(a = c("1","2","3", "4"),b = c(1,2,3, 4))
  a b
1 "1" 1
2 "2" 2
3 "3" 3
4 "4" 4

Output:

  a b
1 "1" 1
2 "2" 2
3 "3" 3
4 "4" 4
5 "1_1" 1
6 "1_2" 2
7 "1_3" 3
8 "1_4" 4
9 "2_1" 1
10 "2_2" 2
11 "2_3" 3
12 "2_4" 4

What's a nice R way to do this?

Union find
  • Should row 9 be 2_3? I take it you don't want 3_1,3_2,3_3? – pluke Dec 29 '22 at 22:56
  • df %>% mutate(n = N) %>% uncount(n) %>% group_by(a) %>% mutate(n = row_number() - 1) %>% ungroup() %>% mutate(a = paste0(a, "_", n)) (no idea if this works, as I'm typing this on my phone; it might give you some ideas though) – pluke Dec 29 '22 at 23:04
  • @pluke I updated the question; there was some confusion. – Union find Dec 30 '22 at 05:30

2 Answers


You could use expand_grid (assuming your data.frame is named df1):

library(dplyr)
library(tidyr)

expand_grid(a = df1$a, b = df1$b) %>% 
  mutate(a = paste(a, b, sep = "_")) %>% 
  bind_rows(df1, .)

With the original three-row example (a = c("1","2","3"), b = c(1,2,3)), this returns

     a b
1    1 1
2    2 2
3    3 3
4  1_1 1
5  1_2 2
6  1_3 3
7  2_1 1
8  2_2 2
9  2_3 3
10 3_1 1
11 3_2 2
12 3_3 3
Martin Gal

Using tidyverse with crossing

library(tidyr)
library(dplyr)
data.frame(a = c("1","2","3"), b = c(1,2,3)) %>%
  add_row(crossing(!!! .) %>%
    unite(a, a, b, remove = FALSE))

-output

     a b
1    1 1
2    2 2
3    3 3
4  1_1 1
5  1_2 2
6  1_3 3
7  2_1 1
8  2_2 2
9  2_3 3
10 3_1 1
11 3_2 2
12 3_3 3

With the updated dataset and criteria

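library(dplyr)
library(purrr) # list_rbind() requires purrr >= 1.0.0
library(stringr)

k <- 3
data.frame(a = c("1","2","3", "4"), b = c(1,2,3, 4)) %>%
   replicate(k, ., simplify = FALSE) %>%  # list of k copies of the data
   setNames(seq_len(k) - 1) %>%           # name the copies "0", "1", ..., k - 1
   imap(~ .x %>%
     # leave copy "0" untouched, prefix the others with their copy number
     mutate(a = if(.y == 0) as.character(a) else str_c(.y, '_', a))) %>% 
   list_rbind()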

-output

     a b
1    1 1
2    2 2
3    3 3
4    4 4
5  1_1 1
6  1_2 2
7  1_3 3
8  1_4 4
9  2_1 1
10 2_2 2
11 2_3 3
12 2_4 4

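Or with slice and make.unique. Here make.unique appends "_1", "_2", ... to repeated values of a, and str_replace swaps the two parts so that the copy number comes first (the pattern assumes the values of a consist of digits only):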

data.frame(a = c("1","2","3", "4"),b = c(1,2,3, 4)) %>% 
   slice(rep(row_number(), times = k)) %>% 
   mutate(a = str_replace(make.unique(a, sep = "_"), 
        "^(\\d+)_(\\d+)", "\\2_\\1"))

-output

     a b
1    1 1
2    2 2
3    3 3
4    4 4
5  1_1 1
6  1_2 2
7  1_3 3
8  1_4 4
9  2_1 1
10 2_2 2
11 2_3 3
12 2_4 4
akrun