I am using Rfacebook to extract content from Facebook's API through R. Sometimes a post comes back two or three times even though it appears only once on Facebook, probably because of a problem with my crawler. I have already extracted a lot of data and don't want to rerun the crawl, so I would like to clean the data I have.
Is there any handy way with dplyr to do that?
The data I got looks like the following:
Name      message      created_time              id
Sam       Hello World  2013-03-09T19:52:22+0000  26937808
Nicky     Hello Sam    2013-03-09T19:53:16+0000  26930800
Nicky     Hello Sam    2013-03-09T19:53:16+0000  26930800
Nicky     Hello Sam    2013-03-09T19:53:16+0000  26930800
Sam       Whats Up?    2013-03-09T19:53:22+0000  26937806
Sam       Whats Up?    2013-03-09T19:53:22+0000  26937806
Florence  Hi guys!     2013-03-09T19:55:16+0000  25688232
Steff     How r u?     2013-03-09T19:59:16+0000  64552194
I would now like a new data frame in which every post appears only once, so that the three identical posts from Nicky are reduced to one and the two identical "Whats Up?" posts from Sam are also reduced to one.
Any idea or suggestion how to do this in R? Facebook seems to assign unique ids to posts and comments, and the time stamps are almost unique in my data, so either could serve to identify duplicates. However, I am not sure how best to do the transformation.
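For illustration, here is a minimal sketch of the kind of result I am after. It assumes the id column really does identify a post uniquely. The data frame name `posts` and the result names are made up for this example, and the rows are just the sample shown above. It uses dplyr's `distinct()` keyed on `id`, with a base R `duplicated()` variant for comparison; I am not sure this is the best approach.

```r
library(dplyr)

# Sample rows copied from the table above; `posts` is a hypothetical name.
posts <- data.frame(
  Name = c("Sam", "Nicky", "Nicky", "Nicky", "Sam", "Sam", "Florence", "Steff"),
  message = c("Hello World", "Hello Sam", "Hello Sam", "Hello Sam",
              "Whats Up?", "Whats Up?", "Hi guys!", "How r u?"),
  created_time = c("2013-03-09T19:52:22+0000", "2013-03-09T19:53:16+0000",
                   "2013-03-09T19:53:16+0000", "2013-03-09T19:53:16+0000",
                   "2013-03-09T19:53:22+0000", "2013-03-09T19:53:22+0000",
                   "2013-03-09T19:55:16+0000", "2013-03-09T19:59:16+0000"),
  id = c("26937808", "26930800", "26930800", "26930800",
         "26937806", "26937806", "25688232", "64552194"),
  stringsAsFactors = FALSE
)

# Keep the first row seen for each id; .keep_all = TRUE retains the other columns.
posts_unique <- posts %>% distinct(id, .keep_all = TRUE)

# Base R variant: drop every row whose id has already appeared.
posts_unique_base <- posts[!duplicated(posts$id), ]
```

If the repeated rows are identical in every column, as they are in this sample, calling `distinct(posts)` with no column argument should give the same rows. Keying on `id` would matter only if a repeated post came back with some other field changed.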
Any help with this is highly appreciated!
Thanks!