Calculate maximal gapsize in list of dictionaries

Question

My data looks as follows:

data = [
    {'order': 1, 'operation': 'milling', 'duration': 70, 'position': 1, 'start': 0, 'end': 70},
    {'order': 1, 'operation': 'milling', 'duration': 20, 'position': 2, 'start': 200, 'end': 210},
    {'order': 1, 'operation': 'milling', 'duration': 100, 'position': 2, 'start': 500, 'end': 600},
    {'order': 1, 'operation': 'grinding', 'duration': 60, 'position': 3, 'start': 90, 'end': 150},
    {'order': 2, 'operation': 'grinding', 'duration': 20, 'position': 1, 'start': 150, 'end': 170},
    {'order': 3, 'operation': 'grinding', 'duration': 20, 'position': 1, 'start': 400, 'end': 420},
    {'order': 3, 'operation': 'milling', 'duration': 50, 'position': 1, 'start': 610, 'end': 660},
]

Now I want to calculate the maximum gaps of each operation. Operation 'milling' has its maximum gap between 210 and 500. Operation 'grinding' has its maximum gap between 170 and 400.

How can I extract these maximum gaps into a new list of dictionaries?

max_gaps = [
    {'operation': 'milling', 'max_gap': 290, 'start': 210, 'end': 500},
    {'operation': 'grinding', 'max_gap': 230, 'start': 170, 'end': 400},
]

From the tags, it looks like you want to use Pandas and/or NumPy, right? What have you started on? For example, have you put the data into a dataframe? — wjandrea, Jul 20 '23 at 20:07
Should the dictionaries be separated by order number or only by operation type? E.g. would order 3 count towards all the milling operations? — B Remmelzwaal, Jul 20 '23 at 20:09
Do you know how to use groupby? What have you tried, and where are you stuck? — wjandrea, Jul 20 '23 at 20:10
Check out [ask], which has tips like making a [mre]. The other fields don't seem to be relevant (`order`, `position`, maybe `duration`), so please remove them. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). — wjandrea, Jul 20 '23 at 20:12
Is it safe to assume the data is sorted by start? I don't think it's a big deal, just an extra step. — wjandrea, Jul 20 '23 at 20:12

score 1 · Accepted Answer · answered Jul 20 '23 at 20:11

You have tagged your question as pandas, so I'm assuming you have a pandas DataFrame:

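# Pair each row's end with the start of the following row.
# Note: the shift runs over the whole frame, not per operation.
df['start'] = df['start'].shift(-1)
# Per operation, keep the row with the largest (next start - end) difference.
df = df.groupby('operation').apply(lambda x: x.loc[(x['start'] - x['end']).idxmax()])[['operation', 'end', 'start']].reset_index(drop=True)
# The gap runs from a row's end to the next start, so relabel the columns.
df.columns = ['operation', 'start', 'end']
df['max_gap'] = df['end'] - df['start']

print(df)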

Prints:

  operation  start    end  max_gap
0  grinding    170  400.0    230.0
1   milling    210  500.0    290.0
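
Since a plain shift pairs the last row of one operation with the first row of the next, a variant that shifts within each operation may be safer when the rows are not already grouped and sorted. The following is only a sketch under my own assumptions: the column names `next_start` and `gap` are made up for illustration, intervals of one operation are assumed not to overlap, and only the columns needed for the calculation are kept.

```python
import pandas as pd

# Sample rows from the question, reduced to the columns the calculation needs.
df = pd.DataFrame({
    'operation': ['milling', 'milling', 'milling', 'grinding',
                  'grinding', 'grinding', 'milling'],
    'start': [0, 200, 500, 90, 150, 400, 610],
    'end': [70, 210, 600, 150, 170, 420, 660],
})

# Sort so that neighbouring rows of one operation are neighbours in time.
df = df.sort_values(['operation', 'start']).reset_index(drop=True)

# Start of the next interval of the same operation (NaN for the last one).
# 'next_start' and 'gap' are illustrative column names.
df['next_start'] = df.groupby('operation')['start'].shift(-1)
df['gap'] = df['next_start'] - df['end']

# Drop rows without a successor, then keep the widest gap per operation.
gaps = df.dropna(subset=['gap'])
widest = gaps.loc[gaps.groupby('operation')['gap'].idxmax()]

# Convert to the list-of-dicts layout asked for in the question.
max_gaps = [
    {'operation': row.operation, 'max_gap': int(row.gap),
     'start': int(row.end), 'end': int(row.next_start)}
    for row in widest.itertuples(index=False)
]
print(max_gaps)
```

An operation that occurs only once has no successor row, so it is dropped before `idxmax` runs and simply does not appear in the result.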

Initial df:

   order operation  duration  position  start  end
0      1   milling        70         1      0   70
1      1   milling        20         2    200  210
2      1   milling       100         2    500  600
3      1  grinding        60         3     90  150
4      2  grinding        20         1    150  170
5      3  grinding        20         1    400  420
6      3   milling        50         1    610  660
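
If the data stays a list of dictionaries as in the question, the same idea can be sketched without pandas. This is a minimal illustration, not a tested production routine: the function name `max_gaps_per_operation` is hypothetical, only the `operation`, `start` and `end` keys are used, and intervals of one operation are assumed not to overlap.

```python
from collections import defaultdict

# Rows from the question, reduced to the keys the calculation needs.
data = [
    {'operation': 'milling', 'start': 0, 'end': 70},
    {'operation': 'milling', 'start': 200, 'end': 210},
    {'operation': 'milling', 'start': 500, 'end': 600},
    {'operation': 'grinding', 'start': 90, 'end': 150},
    {'operation': 'grinding', 'start': 150, 'end': 170},
    {'operation': 'grinding', 'start': 400, 'end': 420},
    {'operation': 'milling', 'start': 610, 'end': 660},
]


def max_gaps_per_operation(records):
    """Return the widest idle gap per operation as a list of dicts."""
    # Collect (start, end) intervals for each operation.
    by_operation = defaultdict(list)
    for rec in records:
        by_operation[rec['operation']].append((rec['start'], rec['end']))

    result = []
    for operation, intervals in by_operation.items():
        intervals.sort()  # order by start time
        # Pair each interval with its successor and measure the idle time.
        candidates = [
            (next_start - end, end, next_start)
            for (_, end), (next_start, _) in zip(intervals, intervals[1:])
        ]
        if not candidates:  # a single interval has no gap to report
            continue
        gap, gap_start, gap_end = max(candidates)
        result.append({'operation': operation, 'max_gap': gap,
                       'start': gap_start, 'end': gap_end})
    return result


print(max_gaps_per_operation(data))
```

Operations appear in the order they are first seen in the input, and an operation with a single entry is skipped rather than raising an error.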

How can I avoid the following error? See my thread: https://stackoverflow.com/questions/76751868/how-to-avoid-key-error-if-new-operation-is-added-which-only-exists-once Thanks. — question12, Jul 24 '23 at 06:29
