I have a data frame called songs.merge with the variables songID, songName, year, artist, score1, score2 and score.diff.

I want to sum score.diff by artist, count the number of songs per artist, and then remove any artists with fewer than 4 songs from the data frame.

I am trying to use dplyr in R, but the code below does not work. How should I proceed?

songs.merge %>% 
   group_by(artist) %>% 
      summarise_at(vars(diff),funs(sum(.,na.rm=TRUE)),
                   vars(songID), funs(count()))
Masoud Keshavarz
    Welcome to SO, user18758152! Questions on SO (especially in R) do much better if they are reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Apr 09 '22 at 20:29

1 Answer

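Suggestion: compute both values in a single summarise() call, using n() to count the rows per artist, then filter on that count: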

library(dplyr)

songs.merge %>% 
   group_by(artist) %>% 
   summarise(sum_diff = sum(score.diff, na.rm = TRUE),
             song_count = n()) %>%
   filter(song_count > 3)
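
Since the question has no sample data, here is a minimal sketch of the pipeline above on a made-up data frame. The values are hypothetical and only the column names come from the question. It also shows a variant, assuming you want to keep the individual song rows rather than one summary row per artist.

```r
library(dplyr)

# Hypothetical toy data: column names follow the question, values are made up.
songs.merge <- data.frame(
  songID     = 1:7,
  artist     = c("A", "A", "A", "A", "B", "B", "C"),
  score.diff = c(1, 2, NA, 4, 5, 6, 7)
)

# One row per artist: summed score.diff and number of songs,
# keeping only artists with at least 4 songs.
artist_summary <- songs.merge %>%
  group_by(artist) %>%
  summarise(sum_diff   = sum(score.diff, na.rm = TRUE),
            song_count = n()) %>%
  filter(song_count >= 4)

# Variant: keep the original song rows and drop every row
# belonging to an artist with fewer than 4 songs.
songs_kept <- songs.merge %>%
  group_by(artist) %>%
  filter(n() >= 4) %>%
  ungroup()
```

Note that n() counts rows, not distinct songs. If the merge could have duplicated songs, n_distinct(songID) may be the safer count.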