import pandas as pd

df_reader = pd.read_json('Clothing_Shoes_and_Jewelry.json', lines=True, chunksize=1000000)

counter = 1
for chunk in df_reader:
    new_df = pd.DataFrame(chunk[['overall', 'reviewText','summary']])
    
    # Sample a fixed number of reviews per star rating (twice as many 3-star reviews)
    new_df1 = new_df[new_df['overall'] == 1].sample(4000)
    new_df2 = new_df[new_df['overall'] == 2].sample(4000)
    new_df3 = new_df[new_df['overall'] == 4].sample(4000)
    new_df4 = new_df[new_df['overall'] == 5].sample(4000)
    new_df5 = new_df[new_df['overall'] == 3].sample(8000)
    
    new_df6 = pd.concat([new_df1, new_df2, new_df3, new_df4, new_df5], axis = 0,ignore_index = True)
    
    new_df6.to_csv(str(counter)+'.csv', index = False)
    counter = counter+1
    
    

from glob import glob
#the glob module is used to retrieve the files
#or pathnames matching a pattern

filenames = glob('*.csv')

#['1.csv','2.csv',..........,'33.csv']

dataframes = []

for f in filenames:
    dataframes.append(pd.read_csv(f))

#[..........]
    

finaldf = pd.concat(dataframes, axis = 0, ignore_index = True)

finaldf.to_csv("balanced_reviews.csv", index = False)


#---------------------------------

df = pd.read_csv('balanced_reviews.csv')

I get a ValueError: Expected object or value when getting a chunk from df_reader

  • Hi and welcome to StackOverflow ! Please write up your code and errors in full and [do not use images](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question). I’ve fixed that for you already, but there’s some more work to make your question answerable: please read through [this help page on making a minimal example](https://stackoverflow.com/help/minimal-reproducible-example). You probably only need 4-5 lines of code and an example json file that causes your issue. – Cimbali Jun 09 '21 at 13:23

1 Answer


The error usually occurs when either the file path is wrong or the JSON itself is malformed. As @Cimbali mentioned above, if you are allowed to share a sample of the JSON, that would help as well. In the meantime, check the answers to this related Stack Overflow question: [(ValueError: Expected object or value when reading json as pandas dataframe)]
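To narrow down which of the two causes applies, a quick check like the following may help. It is only a sketch: the helper name `diagnose_json_lines` and its `max_lines` parameter are made up for illustration, and it assumes the file is UTF-8 encoded JSON Lines (one JSON object per line, as `lines=True` expects). It first checks that the file exists and is not empty, then tries to parse the first few lines with the standard `json` module.

```python
import json
import os


def diagnose_json_lines(path, max_lines=5):
    """Return a list of problems found with the file (hypothetical helper).

    An empty list means the file exists and its first `max_lines`
    non-blank lines each parse as standalone JSON.
    """
    # Cause 1: the file is not where the script thinks it is.
    if not os.path.isfile(path):
        return [f"file not found: {os.path.abspath(path)}"]
    if os.path.getsize(path) == 0:
        return ["file is empty"]

    # Cause 2: the content is not valid JSON Lines.
    problems = []
    with open(path, encoding="utf-8") as fh:  # assumes UTF-8
        for lineno, line in enumerate(fh, start=1):
            if lineno > max_lines:
                break
            if not line.strip():
                continue  # skip blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: {exc.msg}")
    return problems
```

Calling `diagnose_json_lines('Clothing_Shoes_and_Jewelry.json')` from the same working directory as the script should show whether the path resolves and whether the first lines are valid JSON. If it reports "file not found", the printed absolute path shows where Python is actually looking.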

Vivek