I have a CSV file of words and their tf-idf scores, and I am writing a method to normalize the scores so that they fall between 0 and 1. I am using the pandas library for Python, and the data is read into a pandas DataFrame.
When I run the code, I get the exception "ValueError: too many boolean indices". Could you please tell me what is going wrong? I went through several answers on multiple forums, but could not relate them to what I am facing.
This is the line where I get the error: dtm_norm=(dtm-min)/(diffMaxMin)
This is the data format:
   index                        0
0  abbaiah                      0.121030858
1  abbaiah_reddi                0.121030858
2  abbaiah_reddi_kaggadasapura  0.121030858
This is the code:
import glob
import pandas as pd

def normalizeValues(inputpath):
    outputpath=inputpath+'normalized\\'
    allFiles = glob.glob(inputpath+"\\*.csv")
    for file in allFiles:
        fileName=file.split('\\')[-1:][0]
        dtm=pd.read_csv(file)
        min=dtm.min(numeric_only='true')
        max=dtm.max(numeric_only='true')
        diffMaxMin=max-min
        dtm_norm=(dtm-min)/(diffMaxMin)  # the ValueError is raised on this line
        writeToCsv(dtm_norm,outputpath+fileName)
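
To make the goal concrete, here is a minimal sketch of the result I am after, written against a small in-memory frame instead of my real files. The column names "word" and "score", the sample values, and the helper name min_max_normalize are made up for illustration. It only touches the numeric columns and leaves the text column alone; it is not meant as an explanation of why the code above raises the error.

```python
import pandas as pd

def min_max_normalize(df):
    """Scale every numeric column of df to [0, 1]; other columns are left unchanged."""
    result = df.copy()
    # Pick out the numeric columns only, so text columns never enter the arithmetic.
    numeric_cols = result.select_dtypes(include="number").columns
    col_min = result[numeric_cols].min()
    col_range = result[numeric_cols].max() - col_min
    # A constant column has a range of 0; use 1 instead so it maps to 0.0 rather than NaN.
    col_range = col_range.replace(0, 1)
    result[numeric_cols] = (result[numeric_cols] - col_min) / col_range
    return result

# Hypothetical sample data (not taken from my real files).
sample = pd.DataFrame({
    "word": ["abbaiah", "abbaiah_reddi", "zebra"],
    "score": [0.1, 0.3, 0.5],
})
normalized = min_max_normalize(sample)
print(normalized)
```

In this sketch the smallest score in each numeric column maps to 0 and the largest to 1, while the "word" column is passed through as it is.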