I am currently trying to run a fixed effects linear regression with the following model:
library(lfe)  # provides felm()

regression1 <- felm(average_score ~ giveaway + giveaway:factor(copytype) +
                      giveaway:winning_odds | movie_id + year_month,
                    data = movielist)
However, I keep getting the following error:
Error in chol.default(mat[ok, ok]) : 'a' must have dims > 0
In addition: Warning message:
In chol.default(mat, pivot = TRUE, tol = tol) :
the matrix is either rank-deficient or indefinite
The movie_id variable is what causes the error, but I do not know how to fix it. It is a unique identifier, and I was told to include it because it accounts for differences between movies, such as romance vs. mystery.
The dataset looks somewhat like this:
> movie_id  title           average_score  giveaway  copytype  winning_odds  year_month
> 259764    Romeo & juliet  4.68           0         1         NA            1689-04
> 58697     James Bond      3.98           1         3         0.0036        2008-07
I tried to solve it by identifying where the error comes from and concluded that it comes from the movie_id variable, but I am unsure what exactly causes it. I removed all the NAs and tried several solutions proposed online, but none of them worked. Every row has an ID, and (as expected) recoding the ID into a factor did not help either.
Here is a reproducible example using dput():
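One further check that may help narrow this down is counting how many rows each movie_id has, since a fixed effect for an ID that occurs only once fits that single row exactly. The snippet below is only a sketch on made-up data: the `toy` data frame and its values are hypothetical and stand in for movielist.

```r
# Hypothetical toy data, NOT the real movielist
toy <- data.frame(
  movie_id      = c(101L, 102L, 103L, 103L),
  average_score = c(4.1, 3.8, 4.5, 4.2),
  giveaway      = c(0L, 1L, 1L, 0L)
)

# Number of rows per movie_id
rows_per_id <- table(toy$movie_id)

# IDs that occur exactly once
singleton_ids <- names(rows_per_id)[rows_per_id == 1]

# Share of rows whose movie_id appears only once
share_singletons <- mean(toy$movie_id %in% as.integer(singleton_ids))

print(rows_per_id)
print(singleton_ids)
print(share_singletons)
```

Running the same three steps on movielist instead of `toy` would show whether any movie_id appears in more than one row.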
structure(list(c(49207064L, 40610763L, 18342451L, 24611562L, 18112455L, 37954232L, 41559097L, 32982694L, 30078050L, 26205520L, 45212165L, 25257581L, 39838159L, 28146924L, 48977227L, 28152663L, 19358557L, 20295281L, 40011514L, 12407923L, 13631830L, 22889883L, 1538505L, 37823518L, 13614942L, 48919775L, 14553840L, 6098580L, 17868537L, 11787536L), c("True Blue Cowboy", "SEVENTEEN: 17", "Translations from Bark Beetle: Poems", "Hard Beat (Driven, #7)", "House of Steel (Unraveled, #1)", "Rx", "Ethic 2", "Black History In Its Own Words", "My Hope Next Door", "The Dragon's Egg", "The Memory Thief", "The Big Old House And The Scary Storm", "Hong Kong Noir", "The Messengers: Discovered", "Trouble", "Altered", "What Goes on Tour", "About Matters of the Hurt: Love Stories - Round the Clock", "The 21: A Journey Into the Land of Coptic Martyrs", "These Arms of Mine", "The Breast of Everything", "The Ghost Network", "Unconventional Flying Objects: A Scientific Analysis: A Scientific Analysis", "The Way of Life: Experiencing the Culture of Heaven on Earth", "Land", "Alan Cumming: Legal Immigrant", "I Could Pee On This: And Other Poems By Cats", "Heaven is Small", "The Wayang at Eight Milestone: Stories & Essays", "Turning Point"), c(2020L, 2018L, 2014L, 2015L, 2013L, 2018L, 2018L, 2017L, 2016L, 2015L, 2019L, 2014L, 2018L, 2016L, 2019L, 2016L, 2014L, 2013L, 2019L, 2012L, 2012L, 2015L, 1995L, 2018L, 2012L, 2019L, 2012L, 2009L, 2013L, 2011L), c(2L, 9L, 4L, 11L, 6L, 9L, 10L, 2L, 9L, 8L, 4L, 10L, 12L, 5L, 11L, 2L, 2L, 12L, 2L, 1L, 4L, 5L, 12L, 9L, 4L, 11L, 8L, 4L, 10L, 10L), c(4.27, 3.71, 4.09, 4.19, 3.48, 3.81, 4.74, 4.16, 4.36, 3.98, 4.08, 5, 3.8, 4.34, 3.72, 3.75, 3.62, 3.91, 4.09, 4.06, 4.67, 3.1, 4.04, 4.63, 3.89, 4.24, 3.96, 3.31, 3.97, 3.95), c(385L, 17L, 22L, 3887L, 132L, 817L, 1792L, 454L, 1139L, 1113L, 1154L, 4L, 127L, 108L, 446L, 49L, 1680L, 11L, 113L, 88L, 12L, 1359L, 34L, 224L, 2490L, 226L, 8182L, 140L, 30L, 51L), c(52L, 0L, 4L, 397L, 27L, 120L, 246L, 103L, 190L, 82L, 213L, 
1L, 27L, 35L, 57L, 39L, 174L, 6L, 50L, 23L, 6L, 273L, 7L, 23L, 246L, 26L, 1243L, 31L, 3L, 17L ), c(2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 1L ), c("2020-2", "2018-9", "2014-04", "2015-11", "2013-06", "2018-09", "2018-10", "2017-02", "2016-9", "2015-8", "2019-4", "2014-10", "2018-12", "2016-05", "2019-11", "2016-02", "2014-2", "2013-12", "2019-02", "2012-01", "2012-04", "2015-05", "1995-12", "2018-9", "2012-4", "2019-11", "2012-08", "2009-4", "2013-10", "2011-10" ), c(0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L ), c(0, 0, 0.0173913043478261, 0.0114285714285714, 0.00134952766531714, 0.00715307582260372, 0, 0.00154798761609907, 0, 0, 0, 0.333333333333333, 0, 0.00201207243460765, 0, 0.00590841949778434, 0, 0.00284900284900285, 0.0109170305676856, 0.00323624595469256, 0.00627615062761506, 0.000873362445414847, 0.00354609929078014, 0, 0, 0, 0.00109289617486339, 0, 0.00148809523809524, 0.08110300081103)), row.names = c(NA, -30L), class = c("data.table", "data.frame"), names = c("movie_id", "title", "publication_year", "publication_month", "average_rating", "ratings_count", "text_reviews_count", "copytype", "year_month", "giveaway", "winning_odds"))