0

I have several dataframes which have a score value. I need to find if any combinations of summed score values exceed a threshold.

import pandas as pd

df1 = pd.DataFrame([
    {'Node': 'A', 'Score': 1},
    {'Node': 'B', 'Score': 2},
    {'Node': 'C', 'Score': 3},
])

df2 = pd.DataFrame([
    {'Node': 'D', 'Score': 4},
    {'Node': 'E', 'Score': 5},
    {'Node': 'F', 'Score': 6},
])

df3 = pd.DataFrame([
    {'Node': 'G', 'Score': 7},
    {'Node': 'H', 'Score': 8},
    {'Node': 'I', 'Score': 9},
])

I need to check if A Score + D Score > threshold, A Score + E Score > threshold, ..., C Score + F Score > threshold.

This will get the number of combinations which exceed the threshold between two dataframes. Is there an efficient way to extend this to find the number of times the threshold has been exceeded for all dataframes and combinations greater than 2 dataframes?

threshold = 10
s1 = df1['Score']
s2 = df2['Score']

s1.apply(lambda x: (x + s2) > threshold).values.sum()

The output should be True if the threshold was exceeded in any combination of dataframes.

Evan Brittain
  • 547
  • 5
  • 15

2 Answers2

1

So if you're looking for just a True or False return then you can sum the maxima of the Dataframes and find if this exceeds your threshold. That is,

from itertools import combinations


def check_thr(threshold, *dfs):
    for df1, df2 in combinations(dfs, 2):
        if sum(df1['Score'].values.max(), df2['Score'].values.max()) > thr:
            return True
    return False

combinations(..., 2) gets all combinations of size 2 from your sequence of dataframes.

benrussell80
  • 297
  • 4
  • 10
1

This so called cross join in sql and here we can achieve it with pandas with merge

threshold = 10
df1.assign(key=1).merge(df2.assign(key=1),on='key').eval('Score_x+Score_y>@threshold').sum()

If more than two : Also please check with reduce merge

df1.assign(key=1).merge(df2.assign(key=1),on='key').merge(df3.assign(key=1),on='key').eval('Score_x+Score_y+Score>@threshold').sum()
BENY
  • 317,841
  • 20
  • 164
  • 234