I am trying to build a scipy.sparse matrix from a JSON file.
The file contains records like this:
{"reviewerID": "A10000012B7CGYKOMPQ4L", "asin": "000100039X", "reviewerName": "Adam", "helpful": [0, 0], "reviewText": "Spiritually and mentally inspiring! A book that allows you to question your morals and will help you discover who you really are!", "overall": 5.0, "summary": "Wonderful!", "unixReviewTime": 1355616000, "reviewTime": "12 16, 2012"}
That is my JSON format; the file has many more records like this one (it is based on the Amazon review file).
I want to build a SciPy sparse matrix that corresponds to this table:
count
object    a    b    c    d
id
him     NaN    1  NaN    1
me        1  NaN  NaN    1
you       1  NaN    1  NaN
This is what I am trying:
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
df= pd.read_json('C:\\Users\\anto-\\Desktop\\university\\Big Data computing\\Ex. Resource\\test2.json',lines=True)
a= df['reviewerID']
b= df['asin']
data= df.groupby(["reviewerID"]).size()
row = df.reviewerID.astype('category', categories=a).cat.codes
col = df.asin.astype('category', categories=b).cat.codes
sparse_matrix = csr_matrix((data, (row, col)), shape=(len(a), len(b)))
I adapted it from this older example:
Efficiently create sparse pivot tables in pandas?
My code triggers a warning about a deprecated argument, and I don't understand how to construct this matrix.
This is the error log:
FutureWarning: specifying 'categories' or 'ordered' in .astype() is deprecated; pass a CategoricalDtype instead
from ipykernel import kernelapp as app
I'm a bit confused. Can anyone give me a suggestion or a similar example?
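
For reference, the warning seems to be asking for something like the sketch below: build a `CategoricalDtype` from the unique IDs and pass it to `.astype()`, then count each (reviewerID, asin) pair so that the data, row, and column arrays all have the same length. The toy DataFrame and its values are made up to mirror the table above, and it assumes `reviewerID` is the row label and `asin` the column label. It is a minimal sketch under those assumptions, not something verified against the full review file.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype
from scipy.sparse import csr_matrix

# Toy stand-in for the review file (hypothetical values, not real data).
df = pd.DataFrame({
    "reviewerID": ["me", "me", "you", "you", "him", "him"],
    "asin": ["a", "d", "a", "c", "b", "d"],
})

# One count per (reviewerID, asin) pair, so data/row/col line up one-to-one.
counts = df.groupby(["reviewerID", "asin"]).size().reset_index(name="count")

# Categories must be unique; sorting only makes the row/column order predictable.
reviewer_type = CategoricalDtype(sorted(counts["reviewerID"].unique()))
asin_type = CategoricalDtype(sorted(counts["asin"].unique()))

# Integer codes give the row and column index of each pair.
row = counts["reviewerID"].astype(reviewer_type).cat.codes.to_numpy()
col = counts["asin"].astype(asin_type).cat.codes.to_numpy()

sparse_matrix = csr_matrix(
    (counts["count"].to_numpy(), (row, col)),
    shape=(len(reviewer_type.categories), len(asin_type.categories)),
)

print(sparse_matrix.toarray())
```

Note that the sparse matrix stores nothing for missing pairs, so they come out as 0 in `toarray()` rather than NaN as in the table.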