1

I have the Data frame with two set of values A & B in column 'ID'.

Need to find the smallest value from previous rows in the same columns 'MT' & 'Values'.

If 'MT' & 'Value's column current row value is smaller than its previous values the 'Matched Values' column & 'Date Values' column has to be updated with the conditions that matched.

data = {'ID': ['A', 'A', 'A', 'A','A','B', 'B', 'B', 'B','B'],
        'Date': ['02-05-2023', '03-05-2023', '04-05-2023', '05-05-2023','06-05-2023','02-05-2023', '03-05-2023', '04-05-2023', '05-05-2023','06-05-2023'],
      'MT': [10, 20, 15, 7, 25, 10, 20, 15, 7, 25],
        'Values': [100.5, 200, 150.35, 285,250,100.5, 200, 150.35, 285,250]}

df = pd.DataFrame(data)

print(df)

Available Data frame:
*******************

  ID        Date  MT  Values
0  A  02-05-2023  10  100.50
1  A  03-05-2023  20  200.00
2  A  04-05-2023  15  150.35
3  A  06-05-2023  25  250.00
4  B  02-05-2023  10  100.50
5  B  03-05-2023  20  200.00
6  B  04-05-2023  15  150.35
7  B  06-05-2023  25  250.00
  
Output Required:
*******************
    ID  Date        Mt  Values      Matched Values      Matched Dates
0   A   02-May-23   10  100.5       
1   A   03-May-23   20  200           100.5             02-May-23
2   A   04-May-23   15  150.35        100.5             02-May-23
3   A   06-May-23   25  250       100.5,200,150.35,    02-05-2023,03-05-
                                                      2023,04-05-2023                          
4   B   02-May-23   10  100.5       
5   B   03-May-23   20  200           100.5              02-May-23
6   B   04-May-23   15  150.35        100.5              02-May-23
7   B   06-May-23   25  250     100.5,200,150.35     02-05-2023,03-05-
                                                     2023,04-05-2023

Gopinathan
  • 27
  • 5
  • Please provide a [reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – mozway May 27 '23 at 10:46
  • Hi mozway, this was my first question I never knew there was a procedure. Thanks for sharing the reproducible example link. I have updated my question. – Gopinathan May 29 '23 at 07:46

1 Answers1

0

This is a straightforward solution with df.iterrows():

for i, row in df.iterrows():
    dfa = df.iloc[:i, :]  # slice with rows above current
    dfa = dfa[(dfa.ID==row.ID) & (dfa.MT < row.MT) & (dfa.Values < row.Values)] # matched rows
    df.loc[i, 'matched_values'] = ','.join(map(str, dfa.Values))
    df.loc[i, 'matched_dates'] = ','.join(map(str, dfa.Date))

There may be a more elegant solution using df.expanding().apply(). Not sure if it will be much faster though.

Poe Dator
  • 4,535
  • 2
  • 14
  • 35
  • Thank you! This does exactly what I wanted. I'm trying to add one more filter to this condition, currently the condition pulls all the smallest values. Once the current value reaches the immediate greater value the condition should break and check for the next smallest value. I'm trying to use while condition, getting error like: The truth value of a Data Frame is ambiguous. Use a. empty, a.bool(), a.item(), a.any() or a.all(). could you suggest a solution for this. – Gopinathan May 31 '23 at 08:30
  • look at similar situations discussed elsewhere at SO. example:https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o If still stuck - open a new question. With concise code example. – Poe Dator Jun 01 '23 at 13:42
  • another idea - add a separate variable, similar to ID, to mark the condition search span. – Poe Dator Jun 01 '23 at 13:44