I am not sure the title of my question makes sense. I am trying to write code that replaces each count with the name of its column, repeated that many times. For example, if an ID has a count of 2 in a column, that ID should appear in two rows, each carrying the column name instead of the count. The tables below show what I want, in case my explanation is unclear.

This is my table (code):

df <- structure(list(ID = c("P40", "P41", "P43"), 
                     Fruit = c(2, 2, 1),
                     Snack = c(2, 1, 1)),
                class = "data.frame", row.names = c(NA, -3L))

Table:

ID    Fruit Snack
P40     2     2
P41     2     1
P43     1     1

This is what I want to achieve:

ID    Items
P40   Fruit
P40   Fruit
P40   Snack
P40   Snack
P41   Fruit
P41   Fruit
P41   Snack
P43   Fruit
P43   Snack
Sotos
ranaz

3 Answers

One option is to reshape the data to long format with gather() and then repeat each row by its count with uncount():

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -ID) %>%
  uncount(value)

#     ID   key
#1   P40 Fruit
#1.1 P40 Fruit
#2   P41 Fruit
#2.1 P41 Fruit
#3   P43 Fruit
#4   P40 Snack
#4.1 P40 Snack
#5   P41 Snack
#6   P43 Snack
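
If the rows should come out grouped by ID, as in the desired table in the question, a possible variant swaps gather() for pivot_longer(), which emits the long rows in the original row order. This is only a sketch: it assumes tidyr 1.0.0 or later, and the names `long`, `result`, and `n` are made up for illustration.

```r
library(tidyr)

# Sample data from the question
df <- structure(list(ID = c("P40", "P41", "P43"),
                     Fruit = c(2, 2, 1),
                     Snack = c(2, 1, 1)),
                class = "data.frame", row.names = c(NA, -3L))

# One row per ID/column pair; the counts go into a temporary column `n`
long <- pivot_longer(df, cols = -ID, names_to = "Items", values_to = "n")

# Repeat each row `n` times; uncount() drops the weight column by default
result <- uncount(long, n)
result
```

Because pivot_longer() walks the data row by row, all rows for P40 come first, then P41, then P43, so no extra sorting step should be needed.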
Ronak Shah
  • @Ronak, this works perfectly on the sample (example), but when I use it on a different dataset it gives me an error: Error in rep(seq_nrow(data), w) : invalid 'times' argument – ranaz Jul 26 '19 at 11:58
  • @ranaz How many columns do you have? Are they all numeric except `ID`? – Ronak Shah Jul 26 '19 at 12:36
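
One plausible cause of the error in the comment above is a count column that is not numeric or that contains NA or negative values, since rep() rejects such values in its `times` argument. The base R sketch below checks for both cases and then expands a cleaned copy. The data frame `df2` and every helper name are hypothetical, and treating a missing or negative count as zero is an assumption that may not suit the real dataset.

```r
# Hypothetical data with the kinds of values that can break the expansion
df2 <- data.frame(ID = c("P40", "P41", "P43"),
                  Fruit = c(2, NA, 1),
                  Snack = c("2", "1", "1"),
                  stringsAsFactors = FALSE)

count_cols <- setdiff(names(df2), "ID")

# Count columns that are not stored as numbers
non_numeric <- count_cols[!vapply(df2[count_cols], is.numeric, logical(1))]

# Count columns holding NA or negative values once coerced to numeric
bad_values <- count_cols[vapply(df2[count_cols], function(x) {
  x <- suppressWarnings(as.numeric(x))
  any(is.na(x) | x < 0)
}, logical(1))]

# Cleaned copy: coerce to numeric and treat NA/negative counts as 0 (assumption)
clean <- df2
clean[count_cols] <- lapply(clean[count_cols], function(x) {
  x <- suppressWarnings(as.numeric(x))
  x[is.na(x) | x < 0] <- 0
  x
})

# Pair every ID with every column name, then repeat rows by the cleaned counts
long <- data.frame(ID = rep(clean$ID, times = length(count_cols)),
                   Items = rep(count_cols, each = nrow(clean)),
                   stringsAsFactors = FALSE)
result <- long[rep(seq_len(nrow(long)),
                   unlist(clean[count_cols], use.names = FALSE)), ]
```

Inspecting `non_numeric` and `bad_values` first shows which columns need attention before any rows are silently dropped by the zero replacement.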
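We can do this in base R: build a long data frame that pairs each ID with each column name, then replicate its rows according to the unlisted counts from every column except the first.

# Counts from all columns except ID, flattened column by column
counts <- unlist(df[-1], use.names = FALSE)
# ID is recycled so that each column name is paired with every ID
df1 <- data.frame(ID = df[[1]], Items = rep(names(df)[-1], each = nrow(df)))
# Repeat each row as many times as its count
df1[rep(seq_len(nrow(df1)), counts), ]
#     ID Items
#1   P40 Fruit
#1.1 P40 Fruit
#2   P41 Fruit
#2.1 P41 Fruit
#3   P43 Fruit
#4   P40 Snack
#4.1 P40 Snack
#5   P41 Snack
#6   P43 Snack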
akrun
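A short version with reshape2:

library(reshape2)

# Melt once and reuse the result; id.vars avoids the "Using ID as id variables" message
m <- melt(df, id.vars = "ID")
dd <- data.frame(ID = rep(m$ID, m$value),
                 Items = rep(m$variable, m$value))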
Vitali Avagyan