I am using this approach to get the comments on a page's posts. It works fine, but I need to dump the data into MongoDB. With this approach the data is inserted, but as a single document. I want every comment to be stored as a separate document, with the information I am getting from the API.

from facepy import GraphAPI
import pymongo

connection = pymongo.MongoClient("mongodb://localhost")

facebook = connection.facebook
commen = facebook.comments
access = ''  # page access token
graph = GraphAPI(access)
page_id = 'micromaxinfo'
datas = graph.get(page_id + '/posts?fields=comments,created_time', page=True, retry=5)

for data in datas:
    print(data)
    commen.insert_one(data)  # inserts the whole response as one document
    break
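A minimal sketch of what I think I need instead, assuming each item yielded by `datas` is a post dict shaped like the output below (a `comments` key holding a `data` list); `extract_comments` is a helper name introduced here for illustration:

```python
def extract_comments(post):
    """Return the list of individual comment dicts from one post."""
    return post.get("comments", {}).get("data", [])

# Insertion would then be one MongoDB document per comment, e.g.:
#
# for post in datas:
#     for comment in extract_comments(post):
#         commen.insert_one(comment)
```

With `insert_one` called per comment, each comment lands in the collection as its own document instead of being nested inside one post document.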

Output Stored in MongoDB:

{
            "created_time" : "2015-11-04T08:04:14+0000",
            "id" : "120735417936636_1090909150919253",
            "comments" : {
                "paging" : {
                    "cursors" : {
                        "after" : "WTI5dGJXVnVkRjlqZFhKemIzSTZNVEE1TVRReE5ESTVOelV6TlRRd05Ub3hORFEyTnpFNU5UTTU=",
                        "before" : "WTI5dGJXVnVkRjlqZFhKemIzSTZNVEE1TURrd09UVTRNRGt4T1RJeE1Eb3hORFEyTmpJME16Z3g="
                    }
                },
                "data" : [
                    {
                        "created_time" : "2015-11-04T08:06:21+0000",
                        "message" : "my favorite mobiles on canvas silver",
                        "from" : {
                            "name" : "Velchamy Alagar",
                            "id" : "828304797279948"
                        },
                        "id" : "1090909130919255_1090909580919210"
                    },
                    {
                        "created_time" : "2015-11-04T08:10:13+0000",
                        "message" : "Micromax mob. मैने कुछ दिन पहले Micromax Bolt D321 mob. खरिद लिया | Bt मेरा मोबा. बहुत गरम होता है Without internate. और internate MB कम समय मेँ ज्यादा खर्च होती है | कोई तो help करो.",
                        "from" : {
                            "name" : "Amit Gangurde",
                            "id" : "1637669796485258"
                        },
                        "id" : "1090909130919255_1090910364252465"
                    },
                    {
                        "created_time" : "2015-11-04T08:10:27+0000",
                        "message" : "Nice phones.",
                        "from" : {
                            "name" : "Nayan Chavda",
                            "id" : "1678393592373659"
                        },
                        "id" : "1090909130919255_1090910400919128"
                    },
                    {
                        "created_time" : "2015-11-04T08:10:54+0000",
                        "message" : "sir micromax bolt a089 mobile ki battery price kitna. #micromax mobile",
                        "from" : {
                            "name" : "Arit Singha Roy",
                            "id" : "848776351903695"
                        },

So technically I want to store only the information coming in the data field:

{
                            "created_time" : "2015-11-04T08:10:54+0000",
                            "message" : "sir micromax bolt a089 mobile ki battery price kitna. #micromax mobile",
                            "from" : {
                                "name" : "Arit Singha Roy",
                                "id" : "848776351903695"
                            }

How to get this into my database?

Nikhil Parmar

1 Answer

You can use the Pentaho Data Integration open-source ETL tool for this. I use it to store specific fields from the JSON output for tweets.

Select the fields you want to parse from the JSON, then choose an output such as CSV or a table output in Oracle, etc.

Hope this helps.
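If you would rather stay in Python than set up a Pentaho transformation, the same field selection can be sketched directly, assuming comment dicts shaped like the ones in the question; `select_fields` is a name introduced here for illustration:

```python
def select_fields(comment, fields=("created_time", "message", "from")):
    """Keep only the requested keys from one comment dict."""
    return {k: comment[k] for k in fields if k in comment}

comment = {
    "created_time": "2015-11-04T08:10:27+0000",
    "message": "Nice phones.",
    "from": {"name": "Nayan Chavda", "id": "1678393592373659"},
    "id": "1090909130919255_1090910400919128",
}
print(select_fields(comment))
```

The trimmed dict can then be passed to `insert_one` so only the selected fields are stored per comment.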


Tim Ogilvy
ThePatBan