0

My pandas dataframe looks like below:

| House | CheckDate | Reading | OrderDate | OrderedQuantity | DeliveryDate | DeliveredQuantity |
------------------------------------------------------------------------------------------------
|   A   | 2020-01-01|  43.21  |    NaT    |      NaN        |    NaT       |        NaN        |
|   A   | 2020-01-02|  43.06  |    NaT    |      NaN        |    NaT       |        NaN        |
|   A   | 2020-01-03|  42.97  |    NaT    |      NaN        |    NaT       |        NaN        |
---
|   A   | 2020-05-01|  9.82   | 2020-05-01|      50         |    NaT       |        NaN        |
|   A   | 2020-05-02|  9.65   |    NaT    |      NaN        |    NaT       |        NaN        |
---
|   A   | 2020-05-15|  0.23   |    NaT    |      NaN        |    NaT       |        NaN        |
|   A   | 2020-05-15|  25.94  |    NaT    |      NaN        |  2020-05-15  |        50         |
|   A   | 2020-05-16|  49.71  |    NaT    |      NaN        |    NaT       |        NaN        |
---
|   A   | 2020-06-17|  6.57   | 2020-06-17|      50         |    NaT       |        NaN        |
---
|   A   | 2020-07-03|  9.65   |    NaT    |      NaN        |    NaT       |        NaN        |

For every house a reading is taken every day which is the amount of a certain material leftover on that particular day. CheckDate is the date on which reading is taken. I'm trying to calculate the amount of material consumed by the house between order dates. For this particular house, there are a total of 4 orders and 4 deliveries. A delivery can happen between two order placements, so the reading value increases whenever a delivery is made.

I need to calculate the material consumption per day between order dates. This isn't just taking the difference between reading on the first order date and last order date.

To explain mathematically, the overall consumption between the first and last order date would be equal to (sum of all delivered quantities + reading on first order date - reading on last order date). This divided by total number of days between first and last order date would give consumption per day.

I'm new to Python. This is how I'd do this in SQL.

SELECT SUM(Reading) As Sum_Of_Del_Quant FROM table
WHERE House = A AND DeliveryDate != "NaT";

SELECT DATEDIFF(MAX(OrderDate), MIN(OrderDate) AS Number_of_days FROM table
WHERE House = A;

SELECT Reading AS Value_on_First_Order_Date FROM table 
WHERE House = A AND OrderDate=(SELECT MIN(OrderDate) FROM table WHERE House = A); 

SELECT Reading AS Value_on_Last_Order_Date FROM table 
WHERE House = A AND OrderDate=(SELECT MAX(OrderDate) FROM table WHERE House = A);

Can someone explain how to do this in Python?

Please see my comment below on how to do the calculation mathematically. I wasn't allowed to include that piece of information in the post here for some weird reason.

f722axzo5d
  • 21
  • 1
  • 8
  • I wanted to add another piece of information on how to calculate the consumption per day: Let's say 4 deliveries of 50 kgs quantity were made; sum of all delivered quantities would be 200 | Reading on first order date = 9.82 | Reading on last order date = 22.11 | Days between first order date and last order date = 161 | Consumption per day = (200 + 9.82 - 22.11) / 161 | I wasn't allowed to include this in the main post for some weird reason. – f722axzo5d Sep 19 '20 at 13:28
  • df[df.OrderedDate>"2020-01-01" & df.OrderedDate<"2020-05-01" ].DelieveredQuantitiy.sum() + df[df.OrderDate == "2020-05-01" ].Reading[0] . I hope you get the hint – Vishesh Mangla Sep 19 '20 at 13:56
  • Also see this answer https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates – Vishesh Mangla Sep 19 '20 at 13:56
  • Thanks for the hint. I have a slightly different question on the same topic. What if I don't know the exact dates? Forget about the sum. Let's say I want to look at the reading values on the first order date, on the day before the first delivery date, and then on the second order date, and on the day before the second delivery date, and so on. Is there any way to call the date automatically as such? – f722axzo5d Sep 19 '20 at 15:47
  • df[df.OrderDate == df.OrderDate.head(1)] .Reading. https://www.youtube.com/watch?v=yzIMircGU5I&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y Learn pandas from here and https://www.youtube.com/watch?v=hl-TGI4550M&list=PL5-da3qGB5IBITZj_dYSFqnd_15JgqwA6 – Vishesh Mangla Sep 19 '20 at 15:55

0 Answers0