I'm trying to combine data frames inside a function passed to mapply. Background: my data set is a data frame of movies with a column called genres. This column holds "|"-separated genres for every movie id, e.g. "Horror|Action|Fantasy".

I want to generate a data frame that stores, for every movie, each genre in its own row, so I can compute statistics per genre, e.g.

id  genre
42  Horror
42  Action
42  Fantasy
43  Action

...

After this I could join the movies-dataframe with this generated data frame by movie-id.

Here is what I'm trying:

moviegenres <- data.table(id=integer(), genre=character())


genres <- mapply(function(id, m){
    g <- unlist(str_split(m, "\\|"))
    df <- data.table(id=id, genre=g)
    rbind(df)
},movies$id, movies$genres)

I tried it with the merge function as well. If I put a print inside the function, I can see the correctly generated data table for every movie. But after running this code, the data table (or data frame) moviegenres is still empty!

Thank you! Wolfgang

YOLO

1 Answer

I think you need separate_rows from tidyr:

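library(tidyverse)   # provides tidyr::separate_rows and the %>% pipe
library(data.table)  # provides data.table(), used for the sample data

# Small sample table with "|"-separated genres per id
df <- data.table(id = c(1,2), genre = c( "Horror|Action|Fantasy",  "Horror|Action|Fantasy"))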

df %>% 
    separate_rows(genre, sep = "\\|")

   id   genre
1:  1  Horror
2:  1  Action
3:  1 Fantasy
4:  2  Horror
5:  2  Action
6:  2 Fantasy
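
As for why moviegenres stayed empty in the question: the function never writes to moviegenres, the result of mapply is stored in genres instead, and rbind(df) with a single argument just returns df. If you would rather stay with mapply and data.table, something along these lines might work. This is only a sketch: the movies table below is made-up sample data standing in for yours, and pieces and long are names I picked for illustration.

```r
library(data.table)

# Hypothetical sample data standing in for the asker's movies table
movies <- data.table(
  id     = c(42L, 43L),
  genres = c("Horror|Action|Fantasy", "Action")
)

# SIMPLIFY = FALSE keeps the result as a plain list with one data.table per movie
pieces <- mapply(
  function(id, g) {
    data.table(id = id, genre = strsplit(g, "|", fixed = TRUE)[[1]])
  },
  movies$id, movies$genres,
  SIMPLIFY = FALSE
)

# The stacked result has to be assigned; rbindlist binds the list row-wise
moviegenres <- rbindlist(pieces)

# Alternative without mapply: split per id using data.table grouping
# (assumes each id appears only once in movies)
long <- movies[, .(genre = unlist(strsplit(genres, "|", fixed = TRUE))), by = id]
```

Both moviegenres and long should then hold one row per movie and genre, which you can join back to movies by id.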
YOLO