Simulating online decision making in Python

Question

EDIT:

I'm trying to simulate an online decision-making process. In each iteration, I want to read a new row from a known data frame and make a decision based on it. Additionally, I want to keep the last n rows of the data frame that I have used. Unfortunately, even just iterating through the rows is very slow.

How can I do this better?

import pandas as pd
import numpy as np
import time

t0 = time.time()
s1 = np.random.randn(2000000)
s2 = np.random.randn(2000000)
time_series = pd.DataFrame({'s1': s1, 's2': s2})
n = time_series.shape[0]

for t in range(1, n - 1):
    # read one row at a time, as an online process would
    curr_data = time_series.iloc[t]

print(time.time() - t0)

OLD VERSION:

I have a loop in which in every iteration I need to delete the first row of a dataframe, and append another row to the end. What would be the fastest method to use?

There are many. Slicing, `shift`, and so on. Can you please provide a [mcve] and explain what you've tried and why it hasn't worked? — cs95, Jan 03 '18 at 07:34
Also, what you've mentioned so far screams of a bad idea. If you could do a better job describing what you are doing, you are likely to get a suggestion that could improve your overall situation substantially. — piRSquared, Jan 03 '18 at 07:35
@Roy - `make a decision according to it. Additionally, I want to save the last n rows of the dataframe that I used` - do you want to apply some function to each row? And save the df to files? Can you explain more? — jezrael, Jan 03 '18 at 08:08
@jezrael: Saving means that there would be an extra data frame (or another object) that would consist of the last n rows that I've seen. The decision process would be: given the new row, I would calculate its difference from the last row and use some regression model on this difference. — Roy, Jan 03 '18 at 08:14
@Roy - unfortunately it is a problem to avoid loops here. The solution needs, in each iteration, the previous output saved in the last row plus the `regression model`, and then has to save to a new `df`; all of this consumes a lot of time... — jezrael, Jan 03 '18 at 08:55
I still find it difficult to believe you want to loop through one at a time and print the row. That doesn't match up with your description. If iterating through the dataframe is what you want, I'd suggest [this](https://stackoverflow.com/q/16476924/2336654) — piRSquared, Jan 03 '18 at 08:59
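
If row-by-row processing really is required, one possible shape for the loop described in the comments is sketched below. It is only an illustration, not the asker's actual code: the frame is smaller than in the question, `n_keep` is a hypothetical window size, and the regression step is left as a comment. It iterates with `itertuples` instead of `iloc` and keeps the last rows in a `collections.deque` with `maxlen`, which discards the oldest entry automatically.

```python
from collections import deque

import numpy as np
import pandas as pd

# Smaller frame than in the question so the sketch runs quickly (assumption).
rng = np.random.default_rng(0)
time_series = pd.DataFrame({'s1': rng.standard_normal(1000),
                            's2': rng.standard_normal(1000)})

n_keep = 5                     # hypothetical number of rows to remember
window = deque(maxlen=n_keep)  # the oldest row is dropped automatically

for row in time_series.itertuples(index=False):
    if window:
        prev = window[-1]
        # difference between the new row and the previously seen row
        diff = (row.s1 - prev.s1, row.s2 - prev.s2)
        # a decision rule or regression model would consume `diff` here
    window.append(row)

# Turn the remembered rows back into a DataFrame only when it is needed.
last_rows = pd.DataFrame(list(window))
```

Keeping the window outside of pandas avoids rebuilding a DataFrame in every iteration; whether this is fast enough for 2,000,000 rows would need to be measured.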

jezrael · Answer 1 · 2018-01-03T07:48:02.070

1

If you really need it, it is possible to use:

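import pandas as pd

# small example frame with placeholder values
df = pd.DataFrame({'s1': [1.0, 2.0, 3.0, 4.0], 's2': [5.0, 6.0, 7.0, 8.0]})

for i in range(3):
    # remove the first row and renumber the index from 0,
    # so that len(df.index) is always an unused label
    df = df.iloc[1:].reset_index(drop=True)
    # e.g. take the second row as the row to append
    row = df.iloc[1]
    # append the new row at the end
    df.loc[len(df.index)] = row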

But according to this post, it is the slowest solution:

6) updating an empty frame (e.g. using loc one-row-at-a-time)

So I guess there should be better/faster solutions. The first step is to avoid loops if possible.
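
As an illustration of avoiding the loop, here is a minimal sketch based on the difference-from-the-previous-row step described in the comments. It assumes the decision at one step does not change the data seen at later steps, which may not hold for a true online process; the frame size and `n_keep` are placeholders.

```python
import numpy as np
import pandas as pd

# Smaller frame than in the question (assumption).
rng = np.random.default_rng(0)
time_series = pd.DataFrame({'s1': rng.standard_normal(1000),
                            's2': rng.standard_normal(1000)})

# Row-to-row differences for the whole frame in one call.
# The first row has no predecessor, so it becomes NaN and is dropped.
diffs = time_series.diff().dropna()

# The last n rows are a plain slice, with no per-iteration bookkeeping.
n_keep = 5  # hypothetical window size
last_rows = time_series.tail(n_keep)
```

A regression model could then be applied to `diffs` in one batch instead of once per row.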

edited Jan 03 '18 at 07:48

answered Jan 03 '18 at 07:42
