
I have a dataframe, the head of which looks like this:

|trackName            | week| sum|
|:--------------------|----:|---:|
|New Slang            |    1| 493|
|You're Somebody Else |    1| 300|
|Mushaboom            |    1| 297|
|San Luis             |    1| 296|

I am interested in plotting a line graph for each of the 346 unique trackNames in the dataframe, with week on the x-axis and sum on the y-axis. To automate this process, I wrote the following function:

charts <- function(df) {
  songs <- df
  lim <- nrow(songs)
  x <- 1
  song_names <- as_tibble(unique(songs$trackName))
  while (x <= lim) {
    song <- song_names[x, 1]
    plot.name <- paste(paste(song), "plot.png", sep = "_")
    songs %>% filter(trackName == paste(song[x, 1])) %>%
      ggplot(., aes(x = week, y = sum), group = 1) +
      geom_line() +
      labs(
        x = "Week",
        y = "Sum of Listens",
        title = paste("Week by Week Listening Interest for", song, sep = " "),
        subtitle = "Calculated by plotting the sum of percentages of the song listened per week, starting from first listen"
      ) +
      ggsave(plot.name,
             width = 20,
             height = 15,
             units = "cm")
    x <- x + 1
  }
}

However, when I run charts(df), only the following message shows up and then the function exits:

> charts(mini)
geom_path: Each group consists of only one observation. Do you need to
adjust the group aesthetic?
> 

What am I doing wrong here and what does this error mean?

A sample of the dataframe, as produced by dput():

structure(list(trackName = c("New Slang", "You're Somebody Else", 
"Mushaboom", "San Luis", "The Trapeze Swinger", "Flightless Bird, American Mouth", 
"tere bina - Acoustic", "Only for a Moment", "Upward Over the Mountain", 
"Virginia May", "Never to Be Forgotten Kinda Year", "Little Talks", 
"Jhak Maar Ke", "Big Rock Candy Mountain", "Sofia", "Aaoge Tum Kabhi", 
"Deathcab", "Dil Mere", "Choke", "Phir Le Aya Dil", "Lucille", 
"tere bina - Acoustic", "Dil Mere", "Only for a Moment", "This Is The Life", 
"San Luis", "Main Bola Hey!", "Choo Lo", "Yeh Zindagi Hai", "Aaftaab", 
"Never to Be Forgotten Kinda Year", "Khudi", "Flightless Bird, American Mouth", 
"Mere Bina", "Simple Song", "Dil Haare", "Dil Hi Toh Hai", "You're Somebody Else", 
"Sofia", "Who's Laughing Now", "Main Bola Hey!", "Lucille", "Eenie Meenie", 
"tere bina - Acoustic", "New Slang", "Aaftaab", "Mamma Mia", 
"July", "Yeh Zindagi Hai", "Someone You Loved"), week = c(1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3), sum = c(493, 300, 297, 296, 292, 234, 214, 
200, 200, 197, 192, 187, 185, 181, 175, 172, 141, 119, 106, 103, 
579, 574, 501, 462, 428, 378, 320, 307, 306, 301, 301, 300, 300, 
300, 300, 300, 296, 294, 251, 242, 3534, 724, 696, 512, 479, 
400, 302, 300, 300, 300)), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"))
Aman
  • What were you intending with `group = 1`? And can you explain what you mean or intend to produce by "a line graph"? – IRTFM May 14 '21 at 03:40
  • @IRTFM It is based on [this answer](https://stackoverflow.com/a/29019102/14425671), which discusses the same error. I would just like to be able to plot the `week` and `sum` using the `geom_line()` function for each of the `trackName`s in the dataframe. – Aman May 14 '21 at 03:46
  • @IanCampbell Sorry, my bad! – Aman May 14 '21 at 03:46
  • See, now you're just making me look silly. =P – Ian Campbell May 14 '21 at 03:57
  • @IanCampbell Aye no such intention! I realize it was bad practice to include more dependencies than required. – Aman May 14 '21 at 04:02

2 Answers


How about using purrr::walk instead?

library(tidyverse)
library(hrbrthemes)
# Build and save one plot per unique track name; walk() is called for its side effects
walk(unique(songs$trackName),
     ~{ggsave(filename = paste0(.x, "_plot.png"),
              # group belongs inside aes(); passed to ggplot() directly it is ignored
              plot = ggplot(filter(songs, trackName == .x), aes(x = week, y = sum, group = 1)) +
                       geom_line(color = ft_cols$yellow) +
                       labs(x = "Week", y = "Sum of Listens", title = paste("Week by Week Listening Interest for", .x),
                            subtitle = "Calculated by plotting the sum of percentages of the song listened per week, starting from first listen") +
                       theme_ft_rc(),
              width = 20, height = 15, units = "cm")})


Note: the question was subsequently edited to remove the hrbrthemes package requirement.
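Since hrbrthemes is no longer required, here is a minimal sketch of the same idea with only ggplot2 and purrr. The toy `songs` data frame holds a few rows copied from the sample data in the question, and `out_dir` is a hypothetical output folder (a temporary directory here). Swap in your own data and path.

```r
library(ggplot2)
library(purrr)

# Toy rows taken from the sample data in the question
songs <- data.frame(
  trackName = c("tere bina - Acoustic", "tere bina - Acoustic",
                "tere bina - Acoustic", "New Slang", "New Slang"),
  week = c(1, 2, 3, 1, 3),
  sum = c(214, 574, 512, 493, 479)
)

# Hypothetical output folder; a temporary directory keeps the example tidy
out_dir <- tempdir()

# walk() calls the function once per track name, purely for its side effect
walk(unique(songs$trackName), function(track) {
  one_track <- songs[songs$trackName == track, ]
  p <- ggplot(one_track, aes(x = week, y = sum)) +
    geom_line() +
    labs(x = "Week", y = "Sum of Listens",
         title = paste("Week by Week Listening Interest for", track))
  # Pass the plot explicitly instead of relying on last_plot()
  ggsave(filename = file.path(out_dir, paste0(track, "_plot.png")),
         plot = p, width = 20, height = 15, units = "cm")
})
```

Passing `plot = p` to ggsave() avoids adding ggsave() to the plot with `+`, which is what the loop in the question does.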

Ian Campbell
  • This works fantastically! I definitely need to learn the `purrr` package better. Thank you so much. – Aman May 14 '21 at 04:01

You can split the dataset for each trackName and create a png file for it.

library(tidyverse)

charts <- function(df) {
  df %>%
    group_split(trackName) %>%
    map(~{
      track <- first(.x$trackName)
      ggplot(.x, aes(x = factor(week), y = sum, group = 1)) +
        geom_line() +
        labs(
          x = "Week",
          y = "Sum of Listens",
          title = paste("Week by Week Listening Interest for", track),
          subtitle = "Calculated by plotting the sum of percentages of the song listened per week, starting from first listen"
        ) -> plt
      
      ggsave(paste0(track,'.png'), plt, width = 20, height = 15, units = "cm")
    })
}

charts(songs) 
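On the message itself: as far as I can tell, "Each group consists of only one observation" appears whenever geom_line() is handed a group with a single row, because there is nothing to connect. Some tracks in the sample data appear in only one week, so their plots will come out empty and trigger that message. Below is a small sketch, in base R to keep it dependency-free, for finding those tracks before plotting. The toy `songs` data frame holds a few rows copied from the sample in the question; `single_week` and `plottable` are names I made up for illustration.

```r
# A few rows taken from the sample data in the question
songs <- data.frame(
  trackName = c("New Slang", "Mushaboom", "tere bina - Acoustic",
                "tere bina - Acoustic", "tere bina - Acoustic", "New Slang"),
  week = c(1, 1, 1, 2, 3, 3),
  sum = c(493, 297, 214, 574, 512, 479)
)

# Number of rows (weeks) available for each track
obs_per_track <- table(songs$trackName)

# Tracks with a single row cannot be drawn as a line
single_week <- names(obs_per_track)[obs_per_track < 2]
single_week

# Keep only the tracks that have at least two weeks of data
plottable <- songs[!songs$trackName %in% single_week, ]
```

You could then call charts(plottable), or add geom_point() to the plot so that single-week tracks still show a dot.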
Ronak Shah
  • Thank you for your answer! This makes sense, I was not aware of the `group_split()` function. I shall look into this and hopefully use it the next time :) – Aman May 14 '21 at 04:03