This is an example of my data
ID | purchaseDate | numOfItemsPurchased |
---|---|---|
12 | 12-10-2023 | 2 |
12 | 12-01-2023 | 34 |
56 | 24-03-2020 | 12 |
23 | 23-12-2012 | 1 |
23 | 23-05-2012 | 3 |
23 | 23-06-2012 | 4 |
24 | 12-10-2023 | 24 |
38 | 23-02-2012 | 21 |
16 | 12-10-2023 | 34 |
54 | 02-09-2020 | |
54 | 24-03-2020 | 19 |
I want to extract the highest and lowest purchaseNo for each 'ID' and the dates they were made. the expected result should look like:
ID | maxPurchaseDate | maxPurchaseAmount | leastPurchaseDate | leastPurchaseAmount |
---|---|---|---|---|
12 | 12-01-2023 | 34 | 12-10-2023 | 2 |
56 | 24-03-2020 | 12 | 24-03-2020 | 12 |
23 | 23-06-2012 | 4 | 23-12-2012 | 1 |
24 | 12-10-2023 | 24 | 12-10-2023 | 24 |
38 | 23-02-2012 | 21 | 23-02-2012 | 21 |
16 | 12-10-2023 | 34 | 12-10-2023 | 34 |
54 | 24-03-2020 | 19 | 02-09-2020 | 11 |
Please how do I get this done?
I tried this code:
details = data.groupby('ID').min()['purchaseDate']
But I get a KeyError: 'purchaseDate'. I have double checked to be sure the column was not misspelled. and I have made sure to fillna(0)
I also tried doing it this way:
details = data.groupby('ID').min()['numOfItemsPurchased']
It doesn't throw an error but I don't know how to add more columns to it
I have tried this code:
data.groupby(['ID']).agg({'purchaseDate': [np.min,np.max],
'numOfItemsPurchased' : [np.min,np.max]})
then I get a TypeError: '<=' not supported between instances of 'int' and 'str'