Copy rows based on the value of a variable (R)

Question

I have a dataset with two columns, X and Y. Y is an integer between 0 and 5. I need to change the level of detail of the dataset.

I want to copy each row the number of times Y indicates. As an example:

X | Y
______
a | 1
b | 0
c | 2

Becomes

X | 
___
a | 
c | 
c |

a remains once, b disappears, and c now appears twice. I do not need to keep the Y column; it only needs to be reflected in the number of rows for each X.

My first thought was to do

df4 <- df %>% filter(Y == 4)
df4 <- rbind(df4, df4, df4, df4) %>% select(-Y)

but that seems ugly, and it does not generalize to, say, Y = 20.

Thank you!

score 3 · Accepted Answer · answered Mar 25 '21 at 18:33

We could use uncount from tidyr:

library(dplyr)
library(tidyr)
df %>%
   uncount(Y) %>%
   as_tibble

-output

# A tibble: 3 x 1
#  X    
#  <chr>
#1 a    
#2 c    
#3 c

or in base R with rep

df[rep(seq_len(nrow(df)), df$Y),'X', drop = FALSE]

Data:

df <- data.frame(X = c('a', 'b', 'c'), Y = c(1, 0, 2))
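To see why the base R one-liner works, it can help to look at the intermediate row index. This is only a sketch using the same example data; the names idx and expanded are introduced here for illustration and are not part of the answer above.

```r
# Same example data as above (stringsAsFactors set explicitly for older R versions)
df <- data.frame(X = c('a', 'b', 'c'), Y = c(1, 0, 2), stringsAsFactors = FALSE)

# Row i is repeated df$Y[i] times; a count of 0 drops that row
idx <- rep(seq_len(nrow(df)), times = df$Y)
idx
# [1] 1 3 3

# Index the data frame by those row numbers and keep only column X
expanded <- df[idx, 'X', drop = FALSE]
expanded$X
# [1] "a" "c" "c"
```

Because the index is built from row numbers rather than from the values of X, the same pattern should also work when other columns need to be kept.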

akrun

uncount was the perfect function for the job. Thanks! – Neoleogeo Mar 25 '21 at 19:50

PKumar · Answer 2 · 2021-03-25T18:50:17.663

2

Maybe this:

df <- data.frame(x = c('a', 'b', 'c'), y = c(1, 0, 2))
rep(df$x, df$y)

or, for a data frame:

df[match(rep(df$x, df$y), df$x), 'x', drop = FALSE]

Output:

R>rep(df$x, df$y)
[1] "a" "c" "c"

edited Mar 25 '21 at 18:50

answered Mar 25 '21 at 18:35

PKumar

score 2 · Answer 3 · answered Mar 25 '21 at 20:00

What about this?

data.frame(
  X = with(
    df,
    rep(X, Y)
  )
)

which gives

  X
1 a
2 c
3 c
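The question also asked whether this generalizes to larger counts such as Y = 20. As a rough check, here is the same rep approach on made-up data; df_big and out are hypothetical names, and the counts are not from the question.

```r
# Hypothetical data: counts well beyond the 0-5 range mentioned in the question
df_big <- data.frame(X = c('a', 'b', 'c'), Y = c(20, 0, 3), stringsAsFactors = FALSE)

# Repeat each X value Y times, exactly as in the answer above
out <- data.frame(X = with(df_big, rep(X, Y)))

nrow(out)      # 23 rows: 20 + 0 + 3
table(out$X)   # counts per value of X
```

No per-value filter or rbind call is needed, since rep takes the whole vector of counts at once.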

ThomasIsCoding
