I've parsed a JSON file and now I want to see the mentions for every tweet.

Looking at the structure, I see they are in $entities$user_mentions, but all I get there are NULL values. According to https://dev.twitter.com/overview/api/entities, there are also hashtags, media, urls... and all of them return NULL values.

I've tried several approaches, but none of them have worked for me:

mentions<- subset(data, !is.null(data$entities))

mentions<- sapply(data, function(x) if (is.null(x$entities$user_mentions)) NA else  x$entities$user_mentions$id_str)

I'd appreciate some help on how to handle this information. Thank you.

EDIT: So, after parsing with parse_stream() I want to work with the column mentions_user_id

dput(head(json_data2[, c(5,23:26)]))
structure(list(text = c("RT @SteveBlogs1: Diane Abbott when she heard the #GeneralElection announcement. ", 
"RT @DrHughHarvey: Worried that 18-24 yr olds will vote against you in your snap election? No probs, just hold it in the middle of exam seas…", 
"There goes @JonnElledge taking my inner monologue and publishing it in New @NewStatesman again.. ", 
"RT @Channel4News: Theresa May has repeatedly said that there would be no election until 2020. Today she “reluctantly” decided to call… ", 
"RT @charlottechurch: Theresa May is full of <f0><U+009F><U+0092><U+00A9><f0><U+009F><U+0092><U+00A9><f0><U+009F><U+0092><U+00A9>", 
"RT @Jim_Watford: Here comes the Labour battle bus. #GeneralElection "
), mentions_screen_name = c("SteveBlogs1", "DrHughHarvey", "JonnElledge NewStatesman", 
"Channel4News", "charlottechurch", "Jim_Watford"), mentions_user_id = c("2796723891", 
"2324111390", "21862223 19906615", "14569869", "167711000", "44941453"
), symbols = c(NA, NA, NA, NA, NA, NA), hashtags = c("GeneralElection", 
NA, NA, NA, NA, "GeneralElection")), .Names = c("text", "mentions_screen_name", 
"mentions_user_id", "symbols", "hashtags"), row.names = c(NA, 
6L), class = "data.frame")

The issue now is that I need to run sapply for a later graph, but it fails with the following error:

ment<- sapply(data, function(x) if (is.null(x$mentions_user_id)) NA else x$mentions_user_id)

Error: '$ operator is invalid for atomic vectors'

I've also tried: men_ids<- sapply(data2, function(x) if (is.null(x[mentions_user_id])) NA else x[mentions_user_id])

Error in FUN(X[[i]], ...) : object 'mentions_user_id' not found

That doesn't work either. I know there are several questions about this, but still...

It's a data frame. If I convert it to a list with jdlist <- as.list(data2), then is.recursive(men_ids) gives TRUE, which means it should work, but I get the same error again with sapply.

David Bale
  • Not sure if it's applicable to your work, but there's an `rtweet` package that might be helpful – yeedle Jun 06 '17 at 22:14
  • As yeedle already suggested, use the `rtweet` package to grab the data conveniently. It also contains the user_mentions and other entities. You might have misunderstood what it is, though: it's not the list of users who "mentioned" something, but the @screennames addressed within a tweet. – lukeA Jun 06 '17 at 23:28
  • Thanks both for the quick answers. @lukeA, yes, I want the users mentioned in the text, in the tweet itself. So do you think the problem is in the parsing? I've tried the `rtweet` package and the function parse_stream(), and it takes much more time than `rjson`. There is a column called 'mentions_user_id'. Do you think this is what I'm looking for? Luckily there are ids. I tried this and got the error '$ operator is invalid for atomic vectors', but it's a data.frame... `ment<- sapply(data, function(x) if (is.null(x$mentions_user_id)) NA else x$mentions_user_id)` Much appreciated – David Bale Jun 07 '17 at 10:20

1 Answer

Here's a quick example:

library(rtweet)
r <- search_tweets("#rstats")
r[c(3:5,8),c("mentions_screen_name", "text")]
#     mentions_screen_name                                                                                                                                         text
# 3               DataCamp      RT @DataCamp: Algorithmic Trading in R, tutorial &amp; tips here - U5KQzE8rJI #Algotrading #rstats PNxQ8OSquX
# 4                   <NA>                  Cheat Sheet of Machine Learning and Python (and Math) Cheat Sheets | ymj4bu8pCB | #rstats #python #datascience
# 5              Rbloggers                                             RT @Rbloggers: How to Install R Ubuntu 16.04 Xenial xNk51T5E6F #rstats #DataScience
# 8 DataScienceLA earlconf RT @DataScienceLA: Slides for my talk at @earlconf #EARLConf2017 are here xYDWKnR3tM #rstats #machinelearning U9k…
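
As a minimal sketch of how such a column might be prepared for a graph: in the data shown in the question, several mentions in one tweet are stored as a single space-separated string (for example "21862223 19906615"). Assuming that layout holds, `strsplit()` can turn the column into one vector of ids per tweet, which can then be flattened into a tweet-to-user table. The vector below reuses two values from the question plus an `NA` for a tweet without mentions; the `edges` name is only illustrative.

```r
# Values taken from the question's dput() output, plus NA for a tweet without mentions
mentions_user_id <- c("2796723891", "21862223 19906615", NA)

# One character vector of ids per tweet; NA entries stay NA
ids <- strsplit(mentions_user_id, " ", fixed = TRUE)

# Flatten to a long table: one row per (tweet, mentioned user) pair
edges <- data.frame(
  tweet   = rep(seq_along(ids), lengths(ids)),
  user_id = unlist(ids),
  stringsAsFactors = FALSE
)

# Drop tweets that mention nobody
edges <- edges[!is.na(edges$user_id), ]
edges
```

The same approach should apply to `mentions_screen_name`, which uses the same space-separated layout in the question's data.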
lukeA
  • Yes, thank you. I parsed it because I work with a JSON file, and I think it's fine. So I think that answered my initial question. The issue now is that I need to run sapply for a later graph, but it doesn't work because of the error I mentioned before. – David Bale Jun 07 '17 at 11:21
  • @DavidBale Please edit your post and make it a reproducible example [here's a guide](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610), which one can copy-paste-run in R and recreate your error message. – lukeA Jun 07 '17 at 11:26
  • @DavidBale No problem & thanks for editing. I think you could try `sapply(json_data2$mentions_user_id, function(x) if (is.null(x)) NA else x)`? – lukeA Jun 07 '17 at 18:06
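
For context on the error discussed in these comments, here is a small sketch under the assumption that the data is an ordinary data frame like the one in the question. `sapply()` applied to a data frame visits each column in turn, and each column here is an atomic character vector, so `x$mentions_user_id` inside the function is what triggers the message. The toy data frame `df` is made up for illustration.

```r
# Toy data frame shaped like the question's json_data2 (values are illustrative)
df <- data.frame(
  text = c("RT @a: hello", "hi @b and @c", "no mentions here"),
  mentions_user_id = c("111", "222 333", NA),
  stringsAsFactors = FALSE
)

# sapply() on a data frame iterates over columns; each one is an atomic vector
col_is_atomic <- sapply(df, is.atomic)

# Using `$` on such a column fails; tryCatch() captures the error message
err <- tryCatch(
  sapply(df, function(x) x$mentions_user_id),
  error = function(e) conditionMessage(e)
)
err

# Working on the column directly avoids the problem.
# Missing entries are NA rather than NULL, so is.na() is the relevant check.
ment <- df$mentions_user_id
is.na(ment)
```

Because the elements of a character column are never `NULL`, the `is.null()` branch in the suggested `sapply()` call would not be expected to fire; the call should return the column's values unchanged.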