I have two columns of data restaurant name and reviewer grades:
name grades
0 Honey'S Thai Pavilion [{u'date': 2014-08-12 00:00:00, u'grade'..
1 Siam Sqaure Thai Cuisine [{u'date': 2014-11-06 00:00:00, u'grade'...
The problem is one column is a list of multiple 'date,' 'grade' and 'score' pairings in JSON (well technically BSON since this is the sample data set from the MongoDB tutorial). I need to break out the grades column so I have a resulting data frame like below:
name Date Grade Score
Honey'S Thai Pavilion 2014-08-12 00:00:00 A 6
Honey'S Thai Pavilion 2015-03-14 00:00:00 B 5
Honey'S Thai Pavilion 2013-07-15 00:00:00 C 6
Siam Sqaure Thai Cuisine 2014-11-06 00:00:00 A 3
Siam Sqaure Thai Cuisine 2015-06-06 00:00:00 B 2
So I need to split out one column but retain the restaurant name. The code below achieves getting the grades column into a nice looking data frame, but I can't figure out how to keep the restaurant name.
from pymongo import MongoClient
import pymongo
import pandas as pd
client = MongoClient()
db = client.test
)
cursor2 = db.restaurants.find().sort([
("borough", pymongo.ASCENDING),
("cuisine", pymongo.DESCENDING)
])
#cursor.sort("cuisine",pymongo.ASCENDING)
data = pd.DataFrame(list(cursor2))[['name', 'grades']]
data_list= []
for i in range(0, len(data.grades)):
g_data = pd.DataFrame(data.grades[i])
data_list.append(g_data)
result = pd.concat(data_list)
print result.head(100)