
I have an initial Pandas dataframe with 29 columns of interest that I, unfortunately, have to assign to individual variables manually:

import pandas as pd

data = pd.read_csv('data.csv')

Prediction0 = data.loc[:, 'prediction0'].tolist()
Prediction1 = data.loc[:, 'prediction1'].tolist()
.....
Prediction29 = data.loc[:, 'prediction29'].tolist()

Next, I put these variables into a list of dictionaries, one dictionary per row:

self.limit = len(data)
self.history=[]

for i in range(0,self.limit): 
        self.history.append({'Prediction0': Prediction0[i], 'Prediction1': Prediction1[i], 'Prediction2': Prediction2[i], 'Prediction3': Prediction3[i], 'Prediction4': Prediction4[i], 'Prediction5': Prediction5[i], 'Prediction6': Prediction6[i], 'Prediction7': Prediction7[i], 'Prediction8': Prediction8[i], 'Prediction9': Prediction9[i], 'Prediction10': Prediction10[i], 'Prediction11': Prediction11[i], 'Prediction12': Prediction12[i], 'Prediction13': Prediction13[i], 'Prediction14': Prediction14[i], 'Prediction15': Prediction15[i], 'Prediction16': Prediction16[i], 'Prediction17': Prediction17[i], 'Prediction18': Prediction18[i], 'Prediction19': Prediction19[i], 'Prediction20': Prediction20[i], 'Prediction21': Prediction21[i], 'Prediction22': Prediction22[i], 'Prediction23': Prediction23[i], 'Prediction24': Prediction24[i], 'Prediction25': Prediction25[i], 'Prediction26': Prediction26[i], 'Prediction27': Prediction27[i], 'Prediction28': Prediction28[i], 'Prediction29': Prediction29[i]})

Later on, an entry of this list is converted to a NumPy array:

predictionList=numpy.array([list(map(lambda x: ((x["Prediction0"], x["Prediction1"], x["Prediction2"], x["Prediction3"], x["Prediction4"], x["Prediction5"], x["Prediction6"], x["Prediction7"], x["Prediction8"], x["Prediction9"], x["Prediction10"], x["Prediction11"], x["Prediction12"], x["Prediction13"], x["Prediction14"], x["Prediction15"], x["Prediction16"], x["Prediction17"], x["Prediction18"], x["Prediction19"], x["Prediction20"], x["Prediction21"], x["Prediction22"], x["Prediction23"], x["Prediction24"], x["Prediction25"], x["Prediction26"], x["Prediction27"], x["Prediction28"], x["Prediction29"])),self.history[index]))])

As you can see, I have to instantiate and manipulate each of these variables manually one by one.

Now, unfortunately, I have a new Pandas dataframe with 990 columns of interest (I don't want the other 10). As you might imagine, manually instantiating and manipulating that many variables in the code above is infeasible. Is there an efficient way to do these tasks with a very large number of variables in Python?

mad
  • Why didn't you run a loop from 0 to 29? – Austin Mar 30 '20 at 09:42
  • Correct me if I am wrong: you just want to convert the CSV data into a NumPy array? – Chayan Bansal Mar 30 '20 at 09:43
  • How do I add new variables and adapt each part of my code to such a loop? That's what I don't know how to do :-( – mad Mar 30 '20 at 09:43
  • @ChayanBansal I want to convert just some columns of the dataframe and put them in a NumPy array one by one in a for loop (like the last part of my code, which works for 29 columns). – mad Mar 30 '20 at 09:44
  • @mad You seem to be doing a lot of work for no apparent reason? I'm not sure... but if the columns share a common pattern then maybe something like `prediction_list = data.filter(regex='^prediction\d+$').values` might be it? – Jon Clements Mar 30 '20 at 09:52
  • @JonClements The issue here is that some functions that I will use accept only NumPy arrays as input, while others accept only dictionaries. I am trying your suggestion, thank you. – mad Mar 30 '20 at 09:55
  • @mad The `.values` attribute of a dataframe/series gives you the underlying NumPy array(s) – Jon Clements Mar 30 '20 at 09:56
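Building on the `filter(regex=...)` suggestion in the comments, here is a minimal sketch of how both the list of dictionaries and the NumPy array might be produced without naming each column. The small in-memory dataframe is a hypothetical stand-in for `pd.read_csv('data.csv')`, and the `other` column is an invented example of a column to exclude. It uses `to_numpy()`, which the pandas documentation recommends over `.values` since version 0.24.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_csv('data.csv'): three prediction
# columns plus one column that is not of interest.
data = pd.DataFrame({
    "prediction0": [0.1, 0.2],
    "prediction1": [0.3, 0.4],
    "prediction2": [0.5, 0.6],
    "other": ["a", "b"],
})

# Keep only the columns whose names match the pattern, however many there are.
predictions = data.filter(regex=r"^prediction\d+$")

# Optional: capitalize the names so the keys match the original code
# ('prediction0' -> 'Prediction0').
predictions = predictions.rename(columns=str.capitalize)

# One dictionary per row, for functions that only accept dictionaries.
history = predictions.to_dict("records")

# The whole table as a 2-D NumPy array, for functions that only accept arrays.
prediction_array = predictions.to_numpy()

# A single row as a 1-D array, here the row at position `index`.
index = 1
prediction_row = prediction_array[index]
```

If the 990 columns do not share a naming pattern, passing the ten unwanted names to `data.drop(columns=...)` should give the same starting point as the `filter` call.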
