
I'm working with the twitteR package and need a way to store my tweets in a CSV file and read them back when I need them. The idea is to compile the tweets I collect and run my algorithms on them later. So I tried

            write.csv(tweets, file = "newfile", row.names = TRUE)

which only works if I create a data frame. The tweets I collect look like this:

 [[1]]
 [1] "anonymous: boring!"

 [[2]]
 [1] "anonymous: random message !"

.... ......

Any ideas?

Edit: here is the str(tweets) output for 3 tweets I just pulled:

List of 3
 $ :Reference class 'status' [package "twitteR"] with 17 fields
  ..$ text         : chr "damn so many thing to settle @@"
  ..$ favorited    : logi FALSE
  ..$ favoriteCount: num 0
  ..$ replyToSN    : chr(0) 
  ..$ created      : POSIXct[1:1], format: "2013-10-11 14:15:59"
  ..$ truncated    : logi FALSE
  ..$ replyToSID   : chr(0) 
  ..$ id           : chr "388669309028798464"
  ..$ replyToUID   : chr(0) 
  ..$ statusSource : chr "web"
  ..$ screenName   : chr "ThisIsNapmi"
  ..$ retweetCount : num 0
  ..$ isRetweet    : logi FALSE
  ..$ retweeted    : logi FALSE
  ..$ longitude    : chr(0) 
  ..$ latitude     : chr(0) 
  ..$ urls         :'data.frame':   0 obs. of  4 variables:
  .. ..$ url         : chr(0) 
  .. ..$ expanded_url: chr(0) 
  .. ..$ dispaly_url : chr(0) 
  .. ..$ indices     : num(0) 
  ..and 50 methods, of which 38 are possibly relevant:
  ..  getCreated, getFavoriteCount, getFavorited, getId, getIsRetweet, getLatitude,
  ..  getLongitude, getReplyToSID, getReplyToSN, getReplyToUID, getRetweetCount, getRetweeted,
  ..  getRetweets, getScreenName, getStatusSource, getText, getTruncated, getUrls, initialize,
  ..  setCreated, setFavoriteCount, setFavorited, setId, setIsRetweet, setLatitude,
  ..  setLongitude, setReplyToSID, setReplyToSN, setReplyToUID, setRetweetCount, setRetweeted,
  ..  setScreenName, setStatusSource, setText, setTruncated, setUrls, toDataFrame,
  ..  toDataFrame#twitterObj
 $ :Reference class 'status' [package "twitteR"] with 17 fields
  ..$ text         : chr "@Neverush @asmafab http://t.co/TOakKW4kyc"
  ..$ favorited    : logi FALSE
  ..$ favoriteCount: num 0
  ..$ replyToSN    : chr "Neverush"
  ..$ created      : POSIXct[1:1], format: "2013-10-11 12:55:04"
  ..$ truncated    : logi FALSE
  ..$ replyToSID   : chr "388647414808051712"
  ..$ id           : chr "388648948111392770"
  ..$ replyToUID   : chr "44332730"
  ..$ statusSource : chr "web"
  ..$ screenName   : chr "ThisIsNapmi"
  ..$ retweetCount : num 0
  ..$ isRetweet    : logi FALSE
  ..$ retweeted    : logi FALSE
  ..$ longitude    : chr(0) 
  ..$ latitude     : chr(0) 
  ..$ urls         :'data.frame':   1 obs. of  5 variables:
  .. ..$ url         : chr "http://t.co/TOakKW4kyc"
  .. ..$ expanded_url: chr "http://www.youtube.com/watch?v=2mjvfnUAfyo"
  .. ..$ display_url : chr "youtube.com/watch?v=2mjvfn…""| __truncated__
  .. ..$ start_index : num 19
  .. ..$ stop_index  : num 41
  ..and 50 methods, of which 38 are possibly relevant:
  ..  getCreated, getFavoriteCount, getFavorited, getId, getIsRetweet, getLatitude,
  ..  getLongitude, getReplyToSID, getReplyToSN, getReplyToUID, getRetweetCount, getRetweeted,
  ..  getRetweets, getScreenName, getStatusSource, getText, getTruncated, getUrls, initialize,
  ..  setCreated, setFavoriteCount, setFavorited, setId, setIsRetweet, setLatitude,
  ..  setLongitude, setReplyToSID, setReplyToSN, setReplyToUID, setRetweetCount, setRetweeted,
  ..  setScreenName, setStatusSource, setText, setTruncated, setUrls, toDataFrame,
  ..  toDataFrame#twitterObj
 $ :Reference class 'status' [package "twitteR"] with 17 fields
  ..$ text         : chr "@Neverush @asmafab nasi lemak bumbung ? ahahahaha"
  ..$ favorited    : logi FALSE
  ..$ favoriteCount: num 0
  ..$ replyToSN    : chr "Neverush"
  ..$ created      : POSIXct[1:1], format: "2013-10-11 12:34:39"
  ..$ truncated    : logi FALSE
  ..$ replyToSID   : chr "388643321108631552"
  ..$ id           : chr "388643810613264384"
  ..$ replyToUID   : chr "44332730"
  ..$ statusSource : chr "web"
  ..$ screenName   : chr "ThisIsNapmi"
  ..$ retweetCount : num 0
  ..$ isRetweet    : logi FALSE
  ..$ retweeted    : logi FALSE
  ..$ longitude    : chr(0) 
  ..$ latitude     : chr(0) 
  ..$ urls         :'data.frame':   0 obs. of  4 variables:
  .. ..$ url         : chr(0) 
  .. ..$ expanded_url: chr(0) 
  .. ..$ dispaly_url : chr(0) 
  .. ..$ indices     : num(0) 
  ..and 50 methods, of which 38 are possibly relevant:
  ..  getCreated, getFavoriteCount, getFavorited, getId, getIsRetweet, getLatitude,
  ..  getLongitude, getReplyToSID, getReplyToSN, getReplyToUID, getRetweetCount, getRetweeted,
  ..  getRetweets, getScreenName, getStatusSource, getText, getTruncated, getUrls, initialize,
  ..  setCreated, setFavoriteCount, setFavorited, setId, setIsRetweet, setLatitude,
  ..  setLongitude, setReplyToSID, setReplyToSN, setReplyToUID, setRetweetCount, setRetweeted,
  ..  setScreenName, setStatusSource, setText, setTruncated, setUrls, toDataFrame,
  ..  toDataFrame#twitterObj
Napmi
  • Why does it need to be a CSV file? Why not just something like `writeLines(unlist(tweets), "newfile.txt")`? – A5C1D2H2I1M1N2O1R2T1 Oct 12 '13 at 05:42
  • @Ananda Mahto I just tried that method and got `Error in writeLines(unlist(tweets), "newfile.txt") : invalid 'text' argument`. I'm not really sure whether the tweets I collect are really lists. – Napmi Oct 12 '13 at 05:46
  • If you are not going to use the file with any program other than R, then I would highly recommend `saveRDS` and `readRDS`. That is, save the list in its existing R format rather than converting it to CSV. Reading and writing will also likely be faster. – flodel Oct 12 '13 at 11:17
  • Oh I see, thanks. At least I know about this function now, and I will try it too. However, can it concatenate with other saveRDS files? My whole purpose is to compress the tweets that I am going to collect. – Napmi Oct 12 '13 at 15:12
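
On the question of combining saved files: each file written by `saveRDS` holds a single R object, so one way to "concatenate" is to read the files back with `readRDS` and join the lists with `c()`. A minimal sketch, assuming plain character lists stand in for the tweet lists and using temporary files (the names `batch1`, `batch2`, `path1` and `path2` are made up for illustration):

```r
# Stand-ins for two batches of collected tweets (assumption: plain lists;
# a list of twitteR status objects would be passed to saveRDS the same way).
batch1 <- list("anonymous: boring!", "anonymous: random message !")
batch2 <- list("anonymous: another one")

# Save each batch to its own .rds file.
path1 <- tempfile(fileext = ".rds")
path2 <- tempfile(fileext = ".rds")
saveRDS(batch1, path1)
saveRDS(batch2, path2)

# Read both files back and concatenate the lists.
all_tweets <- c(readRDS(path1), readRDS(path2))
length(all_tweets)
```

The combined list can then be saved again with `saveRDS` if a single file is preferred.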

3 Answers


Not tested, but from what I've read online, it seems like the following should work:

  1. Convert the list to a data.frame

    library(plyr) 
    tweets.df = ldply(tweets, function(t) t$toDataFrame())
    
  2. Use write.csv as before, but just on the tweets.df object instead of the tweets object.

    write.csv(tweets.df, file = "newfile.csv")
    

See also: ?"status-class".

A5C1D2H2I1M1N2O1R2T1
  • I just found out I can use twListToDF(tweets) to convert the tweets to table form. But I'll try your method too, thanks! – Napmi Oct 12 '13 at 06:36
  • @EricHeng, I just found that too ([here](http://rfunction.com/archives/2002)), and was about to update. I would suggest that since that is part of the package, it would probably be the way to go. – A5C1D2H2I1M1N2O1R2T1 Oct 12 '13 at 06:38
  • But what if the columns don't match, i.e. some entries have more columns than others? – Ole Petersen Aug 17 '16 at 09:29
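
Regarding mismatched columns: one possible approach in base R is to add the missing columns as NA before stacking the rows. This is only a sketch with made-up one-row data frames, and `fill_missing` is a hypothetical helper, not something from twitteR (plyr also provides `rbind.fill` for the same purpose):

```r
# Hypothetical one-row data frames with different sets of columns.
rows <- list(
  data.frame(text = "boring!", stringsAsFactors = FALSE),
  data.frame(text = "random message !", latitude = "1.3",
             stringsAsFactors = FALSE)
)

# Union of all column names seen in any row.
all_cols <- unique(unlist(lapply(rows, names)))

# Add any missing column as NA and return the columns in a fixed order.
fill_missing <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- NA
  }
  df[cols]
}

# Every data frame now has the same columns, so rbind can stack them.
combined <- do.call("rbind", lapply(rows, fill_missing, cols = all_cols))
```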

You can use the following to convert the tweets into a data frame:

tweets.df <- do.call("rbind", lapply(tweets, as.data.frame)) 

Then use tweets.df in your write.csv function.
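
To see what the `lapply` plus `rbind` pattern does without needing a Twitter connection, here is a small sketch. It assumes plain named lists as stand-ins for the status objects (the name `records` and the chosen fields are made up for illustration):

```r
# Hypothetical stand-ins for tweets: named lists sharing the same fields.
records <- list(
  list(text = "damn so many thing to settle @@",
       screenName = "ThisIsNapmi", retweetCount = 0),
  list(text = "@Neverush @asmafab nasi lemak bumbung ? ahahahaha",
       screenName = "ThisIsNapmi", retweetCount = 0)
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(records, as.data.frame, stringsAsFactors = FALSE)
records.df <- do.call("rbind", rows)
```

Each element becomes one row, and `do.call("rbind", ...)` is just a way of calling `rbind` on all of the rows at once.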

Ali Cirik

Using the twitteR package:

Convert your tweets to a data frame:

tweets2df <- twListToDF(tweets)

Then save it to CSV:

write.csv(tweets2df, file = "tweets.csv")
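
Since the question also asks about pulling the tweets back out later, here is a rough round-trip sketch. It assumes a hand-built data frame with only a few of the columns that twListToDF might return, and writes to a temporary file (`csv_path` and `restored` are illustrative names):

```r
# Hypothetical data frame standing in for the result of twListToDF(tweets).
tweets2df <- data.frame(
  text = c("damn so many thing to settle @@",
           "@Neverush @asmafab nasi lemak bumbung ? ahahahaha"),
  id = c("388669309028798464", "388643810613264384"),
  screenName = c("ThisIsNapmi", "ThisIsNapmi"),
  stringsAsFactors = FALSE
)

# Write without row names so no extra index column appears on reading.
csv_path <- tempfile(fileext = ".csv")
write.csv(tweets2df, file = csv_path, row.names = FALSE)

# Read ids as character so the long numbers are not converted to doubles.
restored <- read.csv(csv_path, stringsAsFactors = FALSE,
                     colClasses = c(id = "character"))
```

Reading the id column as character matters because tweet ids are longer than a double can represent exactly.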
montxe