
My data looks as follows:

data = [
    {'order': 1, 'operation': 'milling', 'duration': 70, 'position': 1, 'start': 0, 'end': 70},
    {'order': 1, 'operation': 'milling', 'duration': 20, 'position': 2, 'start': 200, 'end': 210},
    {'order': 1, 'operation': 'milling', 'duration': 100, 'position': 2, 'start': 500, 'end': 600},
    {'order': 1, 'operation': 'grinding', 'duration': 60, 'position': 3, 'start': 90, 'end': 150},
    {'order': 2, 'operation': 'grinding', 'duration': 20, 'position': 1, 'start': 150, 'end': 170},
    {'order': 3, 'operation': 'grinding', 'duration': 20, 'position': 1, 'start': 400, 'end': 420},
    {'order': 3, 'operation': 'milling', 'duration': 50, 'position': 1, 'start': 610, 'end': 660},
]

Now I want to calculate the maximum gap for each operation, where a gap runs from the end of one entry to the start of the next entry of the same operation. Operation 'milling' has its maximum gap between 210 and 500. Operation 'grinding' has its maximum gap between 170 and 400.

How can I extract these maximum gaps into a new list of dictionaries?

max_gaps = [
    {'operation': 'milling', 'max_gap': 290, 'start': 210, 'end': 500},
    {'operation': 'grinding', 'max_gap': 230, 'start': 170, 'end': 400},
]
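
For reference, the expected result above can be reproduced without pandas. The following is only a minimal sketch: it assumes a gap is the next entry's start minus the current entry's end within the same operation, it keeps only the fields needed for that calculation, and the helper name `max_gaps_by_operation` is hypothetical.

```python
from collections import defaultdict

# Same rows as in the question, reduced to the fields the gap calculation needs.
data = [
    {'operation': 'milling', 'start': 0, 'end': 70},
    {'operation': 'milling', 'start': 200, 'end': 210},
    {'operation': 'milling', 'start': 500, 'end': 600},
    {'operation': 'grinding', 'start': 90, 'end': 150},
    {'operation': 'grinding', 'start': 150, 'end': 170},
    {'operation': 'grinding', 'start': 400, 'end': 420},
    {'operation': 'milling', 'start': 610, 'end': 660},
]


def max_gaps_by_operation(rows):
    """Return the largest gap per operation (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row['operation']].append(row)

    result = []
    for operation, entries in grouped.items():
        # Sort within the operation so consecutive entries are neighbours in time.
        entries = sorted(entries, key=lambda r: r['start'])
        # Pair every entry with its successor; a gap runs from end to next start.
        gaps = [
            {
                'operation': operation,
                'max_gap': nxt['start'] - cur['end'],
                'start': cur['end'],
                'end': nxt['start'],
            }
            for cur, nxt in zip(entries, entries[1:])
        ]
        if gaps:  # an operation with a single entry has no gap at all
            result.append(max(gaps, key=lambda g: g['max_gap']))
    return result


max_gaps = max_gaps_by_operation(data)
print(max_gaps)
```

Operations appear in the result in the order they are first seen in `data`, and an operation with only one entry is left out because it has no gap.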
Ro.oT
question12
  • From the tags, it looks like you want to use Pandas and/or NumPy, right? What have you started on? For example, have you put the data into a dataframe? – wjandrea Jul 20 '23 at 20:07
  • Should the dictionaries be separated by order number or only by operation type? E.g. would order 3 count towards all the milling operations? – B Remmelzwaal Jul 20 '23 at 20:09
  • Do you know how to use groupby? What have you tried, and where are you stuck? – wjandrea Jul 20 '23 at 20:10
  • Check out How to Ask, which has tips like making a minimal reproducible example. The other fields don't seem to be relevant (`order`, `position`, maybe `duration`), so please remove them. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Jul 20 '23 at 20:12
  • Is it safe to assume the data is sorted by start? I don't think it's a big deal, just an extra step. – wjandrea Jul 20 '23 at 20:12
  • Yes, the data is sorted by start. – question12 Jul 20 '23 at 20:41
  • Why not have the output be a dict keyed by operation? – Mad Physicist Jul 20 '23 at 22:17

1 Answer


You have tagged your question with pandas, so I'm assuming you have a pandas DataFrame:

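# Pair each row's end with the start of the next row of the same operation
df['start'] = df.groupby('operation')['start'].shift(-1)
df = df.groupby('operation').apply(lambda x: x.loc[(x['start'] - x['end']).idxmax()])[['operation', 'end', 'start']].reset_index(drop=True)
# The gap runs from the old 'end' to the next 'start', so swap the labels
df.columns = ['operation', 'start', 'end']
df['max_gap'] = df['end'] - df['start']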

print(df)

Prints:

  operation  start    end  max_gap
0  grinding    170  400.0    230.0
1   milling    210  500.0    290.0

Initial df:

   order operation  duration  position  start  end
0      1   milling        70         1      0   70
1      1   milling        20         2    200  210
2      1   milling       100         2    500  600
3      1  grinding        60         3     90  150
4      2  grinding        20         1    150  170
5      3  grinding        20         1    400  420
6      3   milling        50         1    610  660
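
A variation that may help when an operation occurs only once (such an operation has no gap, so looking up its row by `idxmax` can fail). This is a sketch under assumptions rather than a definitive implementation: the extra `drilling` row is hypothetical test data that is not part of the question, and only the `operation`, `start` and `end` columns are used.

```python
import pandas as pd

# Question data reduced to the relevant columns, plus one hypothetical
# 'drilling' row (not in the original data) that occurs only once.
df = pd.DataFrame({
    'operation': ['milling', 'milling', 'milling', 'grinding',
                  'grinding', 'grinding', 'milling', 'drilling'],
    'start': [0, 200, 500, 90, 150, 400, 610, 700],
    'end': [70, 210, 600, 150, 170, 420, 660, 720],
})

# Make sure rows are ordered by start within each operation.
df = df.sort_values(['operation', 'start'])

# A gap starts at a row's end and ends at the next start of the same operation.
gaps = pd.DataFrame({
    'operation': df['operation'],
    'start': df['end'],
    'end': df.groupby('operation')['start'].shift(-1),
})
gaps['max_gap'] = gaps['end'] - gaps['start']

# The last row of every operation has no successor; dropping those rows also
# removes operations that occur only once.
gaps = gaps.dropna(subset=['max_gap'])

# Pick the row with the largest gap per operation.
idx = gaps.groupby('operation')['max_gap'].idxmax()
max_gaps = (
    gaps.loc[idx]
    .astype({'end': int, 'max_gap': int})
    .reset_index(drop=True)
)

records = max_gaps.to_dict('records')
print(records)
```

Here `records` is a list of dictionaries in the shape asked for in the question, and `drilling` is simply absent from it because it has no gap.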
Andrej Kesely
  • How can I avoid the error described in my thread? See https://stackoverflow.com/questions/76751868/how-to-avoid-key-error-if-new-operation-is-added-which-only-exists-once Thanks. – question12 Jul 24 '23 at 06:29