Decoding unicode in tweets in R

Asked Dec 16 '18 at 16:23

Active Dec 16 '18 at 16:56

Viewed 183 times

In my tweets, Unicode characters show up as code points in angle brackets, for example: `<U+0001F602> Loved my flip phones`

I want the Unicode in the format `\U0001F602`. I used the rtweet package to retrieve the tweets. I am new to this area. I would also like to know whether retweets can be filtered out somehow, to reduce redundancy in the dataset.

    library(rtweet)

    tweets <- search_tweets(q = "phones", n = 5000, lang = "en")
    # Searching for tweets...
    # Finished collecting tweets!

    # `filename` is the output path (its value is not shown here)
    write_as_csv(tweets, filename, prepend_ids = TRUE, na = "", fileEncoding = "UTF-8")

I also tried without the `fileEncoding` parameter.
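
On the retweet part of the question, here is a minimal sketch. It assumes the data frame returned by `search_tweets()` has a logical `is_retweet` column, which rtweet releases from around the time of this question provided; check `names(tweets)` in your version. The small data frame below is made up for illustration and stands in for real results.

```r
# Made-up stand-in for the data frame returned by search_tweets().
# Assumption: a logical `is_retweet` column is present.
tweets <- data.frame(
  status_id  = c("1", "2", "3"),
  text       = c("Loved my flip phones",
                 "RT @someone: Loved my flip phones",
                 "Phones were simpler back then"),
  is_retweet = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only the rows that are not retweets.
originals <- tweets[!tweets$is_retweet, ]
nrow(originals)
```

Alternatively, `search_tweets()` accepts `include_rts = FALSE`, which should leave retweets out of the results before they reach the data frame.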

edited Dec 16 '18 at 16:56

user1992989

Please take a look at [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and modify your question to include a smaller sample taken from your data (check `?dput()`). Posting images of your data, or no data at all, makes it difficult or impossible for us to help you! – massisenergy Dec 16 '18 at 16:39
Modified. Thanks – user1992989 Dec 16 '18 at 17:00
It is not clear what you want. Do you want to handle Unicode or get rid of retweets? – Henry Cyranka Dec 16 '18 at 17:56
Main task is to handle unicode. – user1992989 Dec 16 '18 at 18:12
By searching for "Loved my flip phones", I was able to find a tweet with the character `U+0001F602`. The text part of the result is marked as `"UTF-8"` and contains the correct character `U+0001F602`. R in the Windows text console prints that with angle brackets, but RGui and RStudio print it as `"\U0001f602"`. Maybe this is just a printing (non-)issue. – mvkorpel Dec 20 '18 at 12:14
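
If the angle-bracket form has ended up as literal text (for example in an exported file) and is not only a display quirk of the console, it can be rewritten with base R. This is a minimal sketch under that assumption; the helper names `to_backslash_escape()` and `decode_angle_escapes()` are hypothetical and not part of rtweet.

```r
x <- "<U+0001F602> Loved my flip phones"

# Hypothetical helper: rewrite literal "<U+XXXX>" text as "\UXXXX" text.
to_backslash_escape <- function(s) {
  gsub("<U\\+([0-9A-Fa-f]{4,8})>", "\\\\U\\1", s)
}

# Hypothetical helper: replace literal "<U+XXXX>" text with the real character.
decode_angle_escapes <- function(s) {
  m <- gregexpr("<U\\+[0-9A-Fa-f]{4,8}>", s)
  regmatches(s, m) <- lapply(regmatches(s, m), function(hits) {
    # Strip the "<U+" and ">" wrapper, leaving only the hex digits.
    hex <- gsub("[<>U+]", "", hits)
    vapply(hex, function(h) intToUtf8(strtoi(h, 16L)), character(1),
           USE.NAMES = FALSE)
  })
  s
}

cat(to_backslash_escape(x), "\n")

decoded <- decode_angle_escapes(x)
utf8ToInt(decoded)[1]  # code point of the first character, as an integer
```

If `Encoding(tweets$text)` already reports `"UTF-8"` and only the console shows angle brackets, the data itself is probably fine and no conversion is needed.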
