My pandas dataframe looks like below:
| House | CheckDate | Reading | OrderDate | OrderedQuantity | DeliveryDate | DeliveredQuantity |
------------------------------------------------------------------------------------------------
| A | 2020-01-01| 43.21 | NaT | NaN | NaT | NaN |
| A | 2020-01-02| 43.06 | NaT | NaN | NaT | NaN |
| A | 2020-01-03| 42.97 | NaT | NaN | NaT | NaN |
---
| A | 2020-05-01| 9.82 | 2020-05-01| 50 | NaT | NaN |
| A | 2020-05-02| 9.65 | NaT | NaN | NaT | NaN |
---
| A | 2020-05-15| 0.23 | NaT | NaN | NaT | NaN |
| A | 2020-05-15| 25.94 | NaT | NaN | 2020-05-15 | 50 |
| A | 2020-05-16| 49.71 | NaT | NaN | NaT | NaN |
---
| A | 2020-06-17| 6.57 | 2020-06-17| 50 | NaT | NaN |
---
| A | 2020-07-03| 9.65 | NaT | NaN | NaT | NaN |
For every house a reading is taken every day which is the amount of a certain material leftover on that particular day. CheckDate is the date on which reading is taken. I'm trying to calculate the amount of material consumed by the house between order dates. For this particular house, there are a total of 4 orders and 4 deliveries. A delivery can happen between two order placements, so the reading value increases whenever a delivery is made.
I need to calculate the material consumption per day between order dates. This isn't just taking the difference between reading on the first order date and last order date.
To explain mathematically, the overall consumption between the first and last order date would be equal to (sum of all delivered quantities + reading on first order date - reading on last order date). This divided by total number of days between first and last order date would give consumption per day.
I'm new to Python. This is how I'd do this in SQL.
SELECT SUM(Reading) As Sum_Of_Del_Quant FROM table
WHERE House = A AND DeliveryDate != "NaT";
SELECT DATEDIFF(MAX(OrderDate), MIN(OrderDate) AS Number_of_days FROM table
WHERE House = A;
SELECT Reading AS Value_on_First_Order_Date FROM table
WHERE House = A AND OrderDate=(SELECT MIN(OrderDate) FROM table WHERE House = A);
SELECT Reading AS Value_on_Last_Order_Date FROM table
WHERE House = A AND OrderDate=(SELECT MAX(OrderDate) FROM table WHERE House = A);
Can someone explain how to do this in Python?
Please see my comment below on how to do the calculation mathematically. I wasn't allowed to include that piece of information in the post here for some weird reason.