
I'm having trouble transforming my data frame from wide to long. I'm well aware that there are plenty of excellent vignettes out there that explain gather() or pivot_longer() very precisely (e.g. https://www.storybench.org/pivoting-data-from-columns-to-rows-and-back-in-the-tidyverse/). Nevertheless, I've been stuck for days now and it's driving me crazy. Thus, I decided to ask the internet. You.

I have a data frame that looks like this:

id     <- c(1,2,3)
year   <- c(2018,2003,2011)
lvl    <- c("A","B","C")
item.1 <- factor(c("A","A","C"),levels = lvl)
item.2 <- factor(c("C","B","A"),levels = lvl)
item.3 <- factor(c("B","B","C"),levels = lvl)
df     <- data.frame(id,year,item.1,item.2,item.3)

So we have an id variable for each observation (e.g. movies). We have a year variable, indicating when the observation took place (e.g. when the movie was released). And we have three factor variables that assessed different characteristics of the observation (e.g. cast, storyline and film music). Those three factor variables share the same factor levels "A","B" or "C" (e.g. cast of the movie was "excellent", "okay" or "shitty").

But in my wildest dreams, the data would look more like this:

id.II     <- c(rep(1, 9), rep(2, 9), rep(3,9))
year.II   <- c(rep(2018, 9), rep(2003, 9), rep(2011,9))
item.II   <- rep(c(c(1,1,1),c(2,2,2),c(3,3,3)),3)
rating.II <- rep(c("A", "B", "C"), 9)
number.II  <- c(1,0,0,0,0,1,0,1,0,1,0,0,0,1,0,0,1,0,0,0,1,1,0,0,0,0,1)
df.II     <- data.frame(id.II,year.II,item.II,rating.II,number.II)

So now the data frame would be way more useable for further analysis. For example, the next step would be to calculate for each year the number (or even better percentage) of movies that were rated as "excellent".

year.III   <- factor(c(rep(2018, 3), rep(2003, 3), rep(2011,3)))
item.III   <- factor(rep(c(1, 2, 3), 3))
number.A.III <- c(1,0,0,1,0,0,0,1,0)
df.III     <- data.frame(year.III,item.III,number.A.III)

library(ggplot2)

ggplot(data=df.III, aes(x=year.III, y=number.A.III, group=item.III)) +
  geom_line(aes(color=item.III))+
  geom_point(aes(color=item.III))+
  theme(panel.background = element_blank(),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        legend.position = "bottom")+
  labs(colour="Item")

Or even more important to me, show for each item (cast, storytelling, film music) the percentage of being rated as "excellent", "okay" and "shitty".

item.IV   <- factor(c(c(1,1,1),c(2,2,2),c(3,3,3)))
rating.IV <- factor(rep(c("A", "B", "C"), 3))
number.IV <- c(2,0,1,1,1,1,0,2,1)
df.IV     <- data.frame(item.IV,rating.IV,number.IV)
df.IV

ggplot(df.IV,aes(fill=rating.IV,y=number.IV,x=item.IV))+
  geom_bar(position= position_fill(reverse = TRUE), stat="identity")+
  theme(axis.title.y = element_text(size = rel(1.2), angle = 0),
        axis.title.x = element_blank(),
        panel.background = element_blank(),
        legend.title = element_blank(),
        legend.position = "bottom")+
  labs(x = "Item")+
  coord_flip()+
  scale_x_discrete(limits = rev(levels(df.IV$item.IV)))+
  scale_y_continuous(labels = scales::percent)

My primary question is: How do I transform the data frame df into df.II? That would make my day. Wrong. My weekend.

And if you could then also give a hint how to proceed from df.II to df.III and df.IV that would be absolutely mindblowing. However, I don't want to burden you too much with my problems.

Best wishes Jascha

Jascha
  • Have you looked at [Reshaping multiple sets of measurement columns (wide format) into single columns (long format)](https://stackoverflow.com/q/12466493/4752675) ? – G5W Feb 05 '21 at 15:46

1 Answer


Does this achieve what you need?

library(tidyverse)

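# One row per id and item; the item number is taken from the column name
df_long <- df %>%
  pivot_longer(cols = item.1:item.3, names_to = "item", values_to = "rating") %>%
  mutate(
    item = str_remove(item, fixed("item."))  # fixed(): match the dot literally
  )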


# Pair every row with every rating, then flag the rating that was actually given
df2 <- crossing(
  df_long,
  rating_all = unique(df_long$rating)
) %>%
  mutate(n = rating_all == rating) %>%
  group_by(id, year, item, rating_all) %>%
  summarise(n = sum(n), .groups = "drop")

df3 <- df2 %>%
  filter(item == "3")
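
If the goal is the two summary tables from the question (df.III and df.IV), they can probably be computed straight from the long format without the 0/1 indicator table in between. The following is only a sketch under assumptions: it rebuilds the example data from the question so that it runs on its own, and the names by_year, by_item, number_A, share_A, number and share are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question
lvl <- c("A", "B", "C")
df <- data.frame(
  id     = c(1, 2, 3),
  year   = c(2018, 2003, 2011),
  item.1 = factor(c("A", "A", "C"), levels = lvl),
  item.2 = factor(c("C", "B", "A"), levels = lvl),
  item.3 = factor(c("B", "B", "C"), levels = lvl)
)

# Long format: one row per id and item; names_prefix strips "item."
df_long <- df %>%
  pivot_longer(cols = item.1:item.3, names_to = "item",
               values_to = "rating", names_prefix = "item\\.")

# Like df.III: per year and item, number and share of movies rated "A"
by_year <- df_long %>%
  group_by(year, item) %>%
  summarise(number_A = sum(rating == "A"),
            share_A  = mean(rating == "A"),
            .groups  = "drop")

# Like df.IV: per item, how often each rating was given.
# complete() adds the item/rating combinations that never occur, with count 0.
by_item <- df_long %>%
  count(item, rating, name = "number") %>%
  complete(item, rating, fill = list(number = 0L)) %>%
  group_by(item) %>%
  mutate(share = number / sum(number)) %>%
  ungroup()
```

With the example data, by_item$number should match number.IV from the question (2, 0, 1, 1, 1, 1, 0, 2, 1). Note that by_year is sorted by year, so its rows come in a different order than in the hand-built df.III. Both tables can be passed to the ggplot() calls from the question after renaming the columns in aes().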
Jakub.Novotny
  • Dear Jakub, many thanks for the quick and helpful response. Yes, that helped me a lot and saved quite some time! From here on I can take the next steps. Cheers! – Jascha Feb 08 '21 at 09:54
  • Happy to help. Feel free to upvote/mark the answer as accepted. – Jakub.Novotny Feb 08 '21 at 10:45