As part of my thesis, I am analyzing the polarity of political parties. After receiving a data dump of Facebook messages in JSON, I parsed it into R. Unfortunately, one list column is nested: I need to extract $sentiment$polarity$score out of a list within a list within a list.
Observations: 63,465
Variables: 5
$ description <chr> "'TEXT'" ...
$ parties <list> ["X", "X", "Y", ...
$ date <date> 2018-03-05, 2018-03-05...
$ title <chr> NA, NA...
$ sentiment <list> [[[0.2998967, "Positief"], ...
Running glimpse(df$sentiment) shows:
$ :List of 2
..$ polarity :List of 2
.. ..$ score : num 0.15
.. ..$ description: chr "Neutraal"
..$ subjectivity:List of 2
.. ..$ score : num 0.65
.. ..$ description: chr "Erg subjectief"
[list output truncated]
EDIT: head(df$sentiment, n=1) gives:
[[1]]
[[1]]$`polarity`
[[1]]$`polarity`$`score`
[1] 0.2998967
[[1]]$`polarity`$description
[1] "Positief"
[[1]]$subjectivity
[[1]]$subjectivity$`score`
[1] 0.5458678
[[1]]$subjectivity$description
[1] "Subjectief"
However, the problematic part of df$sentiment (visible when running head(df$sentiment, n=10)) is as follows:
[[5]]
named list()
So this observation contains an empty named list instead of the usual two sub-lists (polarity and subjectivity).
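A minimal sketch that reproduces this structure, in case it helps. The name sentiment_sample is made up for illustration, and the values are copied from the output above rather than from the full data dump:

```r
# Hypothetical toy data mimicking df$sentiment: two complete entries
# and one empty named list, like observation 5 above.
sentiment_sample <- list(
  list(
    polarity     = list(score = 0.2998967, description = "Positief"),
    subjectivity = list(score = 0.5458678, description = "Subjectief")
  ),
  setNames(list(), character(0)),  # an empty *named* list
  list(
    polarity     = list(score = 0.15, description = "Neutraal"),
    subjectivity = list(score = 0.65, description = "Erg subjectief")
  )
)

# Element lengths show which observations are empty (0 instead of 2).
entry_lengths <- lengths(sentiment_sample)
entry_lengths  # 2 0 2
```

Checking lengths() on the real column should likewise reveal how many observations have an empty sentiment entry.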
I have tried the following:
df %>% unnest(sentiment, .drop = FALSE, .sep = '"')
Unfortunately, this doubled my df, thereby losing the distinction between polarity$score and subjectivity$score.
Also, I tried
matrix(unlist(df$sentiment),ncol=4,byrow=TRUE)
Unfortunately, this cannot cope with the empty entries (i.e. observations where $sentiment is an empty list rather than containing $polarity and $subjectivity). Thus, it creates a flawed matrix.
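To illustrate why the matrix ends up flawed, here is a small sketch on made-up data (the helper entry() and the name sentiment_sample are hypothetical, not part of my real code). As far as I can tell, unlist() contributes nothing for an empty list, so the rows no longer line up with the observations:

```r
# Hypothetical helper that builds one complete sentiment entry.
entry <- function(p, pd, s, sd) {
  list(polarity     = list(score = p, description = pd),
       subjectivity = list(score = s, description = sd))
}

# Toy data: complete entry, empty entry, complete entry.
sentiment_sample <- list(
  entry(0.2998967, "Positief", 0.5458678, "Subjectief"),
  list(),
  entry(0.15, "Neutraal", 0.65, "Erg subjectief")
)

flat <- unlist(sentiment_sample)
length(flat)  # 8 values for 3 observations: the empty entry is dropped

m <- matrix(flat, ncol = 4, byrow = TRUE)
nrow(m)       # 2 rows, although there are 3 observations
```

Note that the scores are also coerced to character here, because unlist() mixes them with the description strings.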
I have also played around with the flatten, unlist and transpose functions, but that did not get me anywhere. I am not that experienced in R, so I hope someone can help me extract the right score and add it as a column to my data frame. I hope I have provided all the needed information.
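To make the goal concrete, here is a rough base R sketch of the result I am after, on toy data rather than the real dump. The names sentiment_sample and get_polarity_score are hypothetical, and treating empty entries as NA is an assumption on my part:

```r
# Hypothetical toy data: complete entry, empty entry, complete entry.
sentiment_sample <- list(
  list(polarity     = list(score = 0.2998967, description = "Positief"),
       subjectivity = list(score = 0.5458678, description = "Subjectief")),
  list(),
  list(polarity     = list(score = 0.15, description = "Neutraal"),
       subjectivity = list(score = 0.65, description = "Erg subjectief"))
)

# Look up polarity$score; `[[` returns NULL for a missing list element,
# which is mapped to NA here (an assumption about the desired behaviour).
get_polarity_score <- function(x) {
  score <- x[["polarity"]][["score"]]
  if (is.null(score)) NA_real_ else as.numeric(score)
}

# One numeric value per observation, so it can be added as a column.
polarity_score <- vapply(sentiment_sample, get_polarity_score, numeric(1))

toy_df <- data.frame(id = seq_along(sentiment_sample))
toy_df$polarity_score <- polarity_score
toy_df
```

I do not know whether this is the idiomatic way to do it, or whether it scales well to 63,465 observations.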