
I am looking for a way to transform the following data frame using dplyr:

item <- c('A','B','C')
one <- c(2, 1, 2)
two <- c(1,1,2)
data <- data.frame(item,one,two)
data
item one two
A 2 1
B 1 1
C 2 2

Now, the column "one" contains the number of ratings of the value 1, the column "two" the number of ratings of the value 2. My ideal data frame after transformation would look like this:

item rating
A 1
A 1
A 2
B 1
B 2
C 1
C 1
C 2
C 2

Any idea how I could get to this output (it doesn't have to be dplyr)? I know how to use pivot_longer from the tidyr package, but that doesn't solve the problem of repeating each row according to its count.

Tom

2 Answers

library(dplyr)
library(tidyr) # pivot_longer
nums <- c(one = 1, two = 2, three = 3)
data %>%
  pivot_longer(-item) %>%
  group_by(item) %>%
  # dplyr >= 1.1.0 deprecates multi-row results from summarize(); use reframe() there instead
  summarize(rating = rep(name, times = value)) %>%
  ungroup() %>%
  mutate(rating = nums[rating])
# # A tibble: 9 x 2
#   item  rating
#   <chr>  <dbl>
# 1 A          1
# 2 A          1
# 3 A          2
# 4 B          1
# 5 B          2
# 6 C          1
# 7 C          1
# 8 C          2
# 9 C          2

I had to define nums because I couldn't find (in my haste) an easy way to convert "one" to 1 programmatically. Make sure it extends at least as far as you need; I added three = 3 for demonstration. If you truly only have one and two, it works as-is.

(Related to that topic: Convert written number to number in R)
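
If the grouped step feels heavy, tidyr::uncount() may be a shorter route: it repeats each row according to a count column and then drops that column. A minimal sketch, assuming the same data and a two-entry nums lookup (the column names rating and n are just my choices here):

```r
library(dplyr)
library(tidyr) # pivot_longer, uncount

data <- data.frame(item = c("A", "B", "C"), one = c(2, 1, 2), two = c(1, 1, 2))

# Lookup from column name to rating value (assumed mapping)
nums <- c(one = 1, two = 2)

result <- data %>%
  # one row per item/column pair, with the count stored in n
  pivot_longer(-item, names_to = "rating", values_to = "n") %>%
  # turn "one"/"two" into 1/2
  mutate(rating = unname(nums[rating])) %>%
  # repeat each row n times; the n column is removed afterwards
  uncount(n)

result
```

Rows with a count of 0 simply disappear, which is probably what you want for a rating nobody gave.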

r2evans

Maybe you could convert it from wide to long format with the gather() function and then replace the string values "one" and "two" with integers:

library(tidyverse)
item <- c('A','B','C')
one <- c(2, 1, 2)
two <- c(1,1,2)
data <- data.frame(item,one,two)
long_df <- gather(data, rating, count, one:two)
new_df <- tibble()

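# Repeat each row of long_df as many times as its count says.
# seq_len() iterates over every row; range() in R returns only c(min, max).
for (i in seq_len(nrow(long_df))) {
  new_df <- rbind(new_df, do.call("rbind", replicate(long_df[i, "count"], long_df[i, ], simplify = FALSE)))
}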

new_df <- new_df %>% select(-c("count"))
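
For what it's worth, the same expansion can probably be done without a loop in base R by repeating values with rep(). A rough sketch, assuming the column names map to ratings as one = 1 and two = 2 (rating_values, count_cols, and expanded are names made up for this example):

```r
data <- data.frame(item = c("A", "B", "C"), one = c(2, 1, 2), two = c(1, 1, 2))

# Rating value carried by each count column (assumed mapping)
rating_values <- c(one = 1, two = 2)
count_cols <- names(rating_values)

# Flatten the counts item by item: A-one, A-two, B-one, B-two, ...
counts <- as.vector(t(as.matrix(data[count_cols])))

expanded <- data.frame(
  # each item appears once per count column, then is repeated by its count
  item   = rep(rep(data$item, each = length(count_cols)), times = counts),
  # the rating values cycle once per item, then are repeated by the same counts
  rating = rep(rep(unname(rating_values), times = nrow(data)), times = counts)
)

expanded
```

This also takes care of replacing "one" and "two" with numbers, and the rows come out grouped by item.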
intedgar
  • 1
    (1) `gather`/`spread` are soft-deprecated, replaced by `pivot_longer`/`*_wider`, which are much more powerful for their intended use. (2) Iteratively `rbind`ing rows to a new frame works in concept but scales horribly and is discouraged in principle. See [The R Inferno](https://www.burns-stat.com/pages/Tutor/R_inferno.pdf), chapter 2 on "Growing Objects". – r2evans Nov 02 '21 at 21:35
  • 1
    Thx, for the update! – intedgar Nov 02 '21 at 21:38