My tweets database in MongoDB has the following schema. I want to read it into a pandas DataFrame with each field in a separate column. I also want the inner components of hashtags (text and indices) in their own columns.
{
"_id" : ObjectId("5a11200441f0c41f447ce56c"),
"created" : ISODate("2017-11-19T06:09:06Z"),
"text" : "#Bitcoin Hong Kong's bitcoin businesses suffer after local bank accounts frozen",
"username" : "PennyStocksMomo",
"hashtags" : [
{
"text" : "Bitcoin",
"indices" : [
0,
8
]
}
],
"language" : "en",
"id" : "932128582767296512",
"followers" : 5715
}
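EDIT:
I used the code below. It puts each top-level field in its own column, but hashtags remains a single column holding a list of dicts.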
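import pandas as pd
from pymongo import MongoClient

# Connect to the local MongoDB instance (default host and port)
client = MongoClient()
db = client.BitCoinDatabase
collection = db.tweets

# Load every document in the collection into a DataFrame
data = pd.DataFrame(list(collection.find()))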
   _id                       created              followers  hashtags                                  id                  language  text                                               username
0  5a11200441f0c41f447ce56c  2017-11-19 06:09:06  5715       [{'text': 'Bitcoin', 'indices': [0, 8]}]  932128582767296512  en        #Bitcoin Hong Kong's bitcoin businesses suffer...  PennyStocksMomo
1  5a11200441f0c41f447ce56d  2017-11-19 06:09:06  19526      [{'text': 'Bitcoin', 'indices': [0, 8]}]  932128583077675008  en        #Bitcoin Hong Kong's bitcoin businesses suffer...  CryptoTraderPro
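
One possible way to split the hashtags column, sketched here under some assumptions: `pd.json_normalize` (pandas 1.0 or later) is used with `record_path="hashtags"` so that each hashtag becomes its own row, and the other fields are carried along through `meta`. The `docs` list below is a hypothetical stand-in for `list(collection.find())`, with `_id` written as a plain string and `created` omitted so the snippet runs without a MongoDB connection. The column names `hashtag_start` and `hashtag_end` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for list(collection.find()); _id is a plain string here
docs = [
    {
        "_id": "5a11200441f0c41f447ce56c",
        "text": "#Bitcoin Hong Kong's bitcoin businesses suffer after local bank accounts frozen",
        "username": "PennyStocksMomo",
        "hashtags": [{"text": "Bitcoin", "indices": [0, 8]}],
        "language": "en",
        "id": "932128582767296512",
        "followers": 5715,
    },
    {
        "_id": "5a11200441f0c41f447ce56d",
        "text": "#Bitcoin Hong Kong's bitcoin businesses suffer after local bank accounts frozen",
        "username": "CryptoTraderPro",
        "hashtags": [{"text": "Bitcoin", "indices": [0, 8]}],
        "language": "en",
        "id": "932128583077675008",
        "followers": 19526,
    },
]

# One row per hashtag; record_prefix avoids a name clash between the
# hashtag's "text" and the tweet's own "text" field
flat = pd.json_normalize(
    docs,
    record_path="hashtags",
    meta=["_id", "text", "username", "language", "id", "followers"],
    record_prefix="hashtag_",
)

# Split the two-element indices list into separate columns
flat["hashtag_start"] = flat["hashtag_indices"].str[0]
flat["hashtag_end"] = flat["hashtag_indices"].str[1]
flat = flat.drop(columns="hashtag_indices")

print(flat)
```

Note that with `record_path`, a tweet with an empty hashtags list produces no rows, and a tweet with several hashtags produces one row per hashtag, so this layout may or may not fit what you need downstream.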