0

I have Twitter account timeline data, one record per tweet, saved in .json format, but I am unable to save the data into MongoDB.

Example: fetched data of one tweet.

{
  "created_at": "Fri Apr 12 05:13:35 +0000 2019", 
  "id": 1116570031511359489, 
  "id_str": "1116570031511359489", 
  "full_text": "@jurafsky How can i get your video lectures related to   Sentiment Analysis", 
  "truncated": false, 
  "display_text_range": [0, 73], 
  "entities": { 
    "hashtags": [], 
    "symbols": [], 
    "user_mentions": [
      {
        "screen_name": "jurafsky", 
        "name": "Dan Jurafsky", 
        "id": 14968475, 
        "id_str": "14968475", 
        "indices": [0, 9]
      }
    ], 
    "urls": []
  }
}

It also contains URLs and lots of other information.

I have tried the following code.

from pymongo import MongoClient
import json

client=MongoClient('localhost',27107)
db=client.test
coll=db.dataset
with open('tweets.json') as f:
    file_data=json.loads(f.read())
coll.insert(file_data)
client.close()
Matt

2 Answers

1

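Try this. Two things differ from your code: MongoDB listens on port 27017 by default (you have 27107), and insert is deprecated in PyMongo 3 and removed in PyMongo 4, so use insert_one or insert_many instead. This assumes your server runs on the default port:

from pymongo import MongoClient
import json

# MongoDB's default port is 27017
client = MongoClient('localhost', 27017)
db = client.test
coll = db.dataset

with open('tweets.json') as f:
    file_data = json.load(f)

# A JSON array loads as a list of documents; a single object loads as a dict
if isinstance(file_data, list):
    coll.insert_many(file_data)
else:
    coll.insert_one(file_data)

client.close()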
Saxon
0

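My JSON dataset was not valid: the file held more than one top-level value, which is what triggers the "Extra data" error. I had to merge the tweets into a single array object before the file could be parsed.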

Thanks to: Can't parse json file: json.decoder.JSONDecodeError: Extra data.
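If editing the file by hand is not practical, the merge can also be done while reading. The following is a minimal sketch, assuming the file holds tweet objects written one after another with only whitespace between them; the function name parse_concatenated_json and the sample string are made up for illustration and are not part of the original data.

```python
import json


def parse_concatenated_json(text):
    """Parse a string holding several JSON values written back to back."""
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it manually.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        doc, pos = decoder.raw_decode(text, pos)
        docs.append(doc)
    return docs


# Hypothetical sample: two tweet-like objects with no enclosing array.
# Calling json.loads(sample) on this would raise JSONDecodeError: Extra data.
sample = '{"id": 1, "full_text": "first"}\n{"id": 2, "full_text": "second"}\n'
tweets = parse_concatenated_json(sample)
print(len(tweets))
```

The resulting list can then be passed to coll.insert_many(tweets), where coll is the collection from the question.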