1

I found this question but I couldn't understand the problem and I couldn't adjust it for my case:

Usage example:

let's assume that I want to get the value of competition and market_name from line 5 of my CSV and filter lines 1,2,3 and 4 that have the same values in these columns and sum the column back, if back is > 0, then the new column called invest will show TRUE, otherwise FALSE. I want to do this by going through all the lines of the Dataframe, always calculating only the previous lines. The idea is to know whether or not it would be profitable to invest based on existing records in the history.

In order for the code to become malleable for different types of filters, I did it this way:

My 100% executable code for testing:

import pandas as pd
from functools import reduce

df = pd.read_csv('test.csv')
df = df[df['result'].notnull()].reset_index(drop=True)

options = [
    'country',
    'competition',
    'market_name',
    'runner_name',
    'odds',
    'total_matched_vls',
    'minute_traded_vls'
]

def my_func(cb, i):
    dfilter = df[:i]
    filter = dfilter[reduce(lambda x, y : x & (df[y[0]] == y[1]), zip(cb, map(lambda x: df.iloc[i][x], cb)), True)]
    back_sum = filter['back'].sum()
    aaa = True
    if back_sum <= 0:
        aaa = False
    return aaa

def combs(a):
    if len(a) == 0:
        return [[]]
    cs = []
    for c in combs(a[1:]):
        cs += [c, c+[a[0]]]
    return cs

combinations = combs(options)
combinations = [x for x in combinations if x != []]

for cbnt in combinations:
    df['invest'] = [my_func(cbnt,i) for i in range(len(df))]
    true_backs = df[(df['invest'] == True)]['back']
    if (true_backs.sum() > 0):
        print(cbnt)

CSV sample example:

clock_now,country,competition,match_id,match_name,market_id,market_name,runner_id,runner_name,total_matched,total_matched_vls,one,minute_traded,minute_traded_vls,odds,result,back,lay
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1
2022/05/22 12:16,GB,English Premier League,31459682,Brighton v West Ham,1.199215861,First Half Goals 0.5,5851483,Over 0.5 Goals,2692.37,milhares,['149.12'♫§ '7.039999999999964'♫§ '48.190000000000055'♫§ '24.08999999999992'♫§ '75.65000000000009'],1797.9199999999998,milhares,1.7,WINNER,0.6545,-1
2022/05/22 12:20,HU,Hungarian NB II,31473686,Vasas v Soroksar,1.199438855,Over/Under 6.5 Goals,2542448,Under 6.5 Goals,5801.98,milhares,['125.76'♫§ '150.4699999999998'♫§ '200.79'♫§ '51.43000000000029'♫§ '478.25999999999976'],3011.07,milhares,1.01,LOSER,0.00935000000000001,-1
2022/05/22 12:21,LU,Luxembourg Division Nationale,31473752,FC Differdange 03 v Progres Niedercorn,1.199439863,Over/Under 0.5 Goals,5851483,Over 0.5 Goals,4451.85,milhares,['0.0'♫§ '12.889999999999873'♫§ '22.44000000000005'♫§ '1.2899999999999636'♫§ '219.71000000000004'],1038.0600000000004,milhares,1.71,WINNER,0.66385,-1

But i receive this Warning:

UserWarning: Boolean Series key will be reindexed to match DataFrame index.

filter = dfilter[reduce(lambda x, y : x & (df[y[0]] == y[1]), zip(cb, map(lambda x: df.iloc[i][x], cb)), True)]

If I need the Series to generate the malleable option, how should I proceed to not receive this warning?

Digital Farmer
  • 1,705
  • 5
  • 17
  • 67
  • your code looks like it could be greatly simplified, which would remove the warning, can you summarize what you are trying to achieve? – mozway Dec 10 '22 at 11:49
  • I've updated the question (add example usage) and the code to be even more verbose in terms of how it traverses and creates the column combinations used to filter. @mozway – Digital Farmer Dec 10 '22 at 12:03
  • That's what I thought, you want a `cumsum` per group, see answer below – mozway Dec 10 '22 at 12:09

1 Answers1

2

It looks like you want a groupby.cumsum, and compare is to the current value:

m1 = df.groupby('country')['back'].cumsum().gt(df['back'])
m2 = df.groupby(['competition','market_name'])['back'].cumsum().gt(df['back'])

df['invest'] = m1&m2

For many conditions:

from functools import reduce

combinations = [['country'],['competition','market_name']]

df['invest'] = reduce(np.logical_and,
                      (df.groupby(g)['back'].cumsum().gt(df['back'])
                       for g in combinations)
                     )

Output (invest column only):

0     False
1     False
2     False
3      True
4      True
5      True
6      True
7      True
8      True
9      True
10     True
11     True
12     True
13     True
14     True
Name: back, dtype: bool
mozway
  • 194,879
  • 13
  • 39
  • 75
  • 1
    In addition to solving the problem, it made my code countless times faster, it took hours to go through all the combination options (my CSV has almost 1 million lines), now it ends in a few seconds. – Digital Farmer Dec 10 '22 at 12:21
  • 3
    Glad it works, remember me if you become millionaire with online betting :p – mozway Dec 10 '22 at 12:23
  • 1
    I certainly won't forget @mozway , I couldn't check the investment possibility in real time because it took hours to go through the combinations every time I found an investment option, now I can, and this I can't forget that! If everything goes well, I'll use your ''buymeacoffee''. haha – Digital Farmer Dec 10 '22 at 12:31
  • 1
    I hope this faster code won't trigger two many risky investments, play responsibly ;) – mozway Dec 10 '22 at 12:35