0

Here is my dataset.

emp_id|emp_name|work_start_date|work_end_date|days_worked_on|
------+--------+---------------+-------------+--------------+
     1|A       |     2023-03-01|   2023-03-03|             2|
     2|B       |     2023-03-04|   2023-03-04|             0|

Actually i want to calculate how many days the person worked on. so i calculated based on below query.

select 
    emp_id
    ,emp_name
    ,work_start_date
    ,work_end_date   
    ,work_end_date - work_start_date as days_worked_on
from demo.stackquestions;   

Ideally, the person-A worked on 03/01,03/02 and 03/03.Totally 3 days. But when calculate work_end_date - work_start_date , it is giving 2 days.(Technically Yes. But in reality the answer is wrong.) To calculate right days, do i need to add 1 (+1) at the end,

work_end_date - work_start_date +1 as days_worked_on

Is this right approach. Please share your thought on this.

nacho
  • 5,280
  • 2
  • 25
  • 34
Learn Hadoop
  • 2,760
  • 8
  • 28
  • 60
  • Postgresql, mysql, oracle?? Why do you tag so many systems?? – nacho Mar 08 '23 at 19:45
  • 3
    If it suits your needs, why not? Add 1 day to the result. – Littlefoot Mar 08 '23 at 19:45
  • it is just ANSI standard. hence i added all the DB's – Learn Hadoop Mar 08 '23 at 19:46
  • is it right approach.. or anyother built-in functon from postgresql . I tried AGE no luck – Learn Hadoop Mar 08 '23 at 19:51
  • I vote: Add a 1. I can't imagine where that would fail and there is no built in `subtract_these_dates_but_consider_the_inherent_time_dimension_for_the_first_date_to_be_12am_and_the_inherent_time_dimension_for_the_second_date_to_be_11:59:59.999pm(start_date, end_date)`. – JNevill Mar 08 '23 at 19:52
  • If work has been done on `work_end_date`, then the actual end of the work period would be the end of that day: so `+1` is needed. – scivi Mar 08 '23 at 19:53
  • There are some answered posts that relate to yours, hopefully they'll help you solve your problem. Check [this](https://stackoverflow.com/questions/24929735/how-to-calculate-date-difference-in-postgresql) and [this](https://stackoverflow.com/questions/46204351/calculate-difference-between-dates-postgres). – lemon Mar 08 '23 at 19:53
  • `select '2023-03-04'::date - '2023-03-04'::date + 1;` yields `1` not `0`. – Adrian Klaver Mar 08 '23 at 20:35
  • I think +1 is perfectly reasonable since your `work_start_date` entries probably imply a timestamp corresponding to the time the shift starts, `work_end_date` the end of the shift, 8h apart (or some other standard shift length) - comparing only the `date` part of these timestamps yields 0, even though 8h is implied. "+1" therefore means "there's always at least 1 *workday* between minuend and subtrahend". – Zegarek Mar 08 '23 at 21:08

1 Answers1

1

Using generate_series to get a date count over a range of dates. This assumes 2023-03-04, 2023-03-04 equals one day of work.

CREATE TABLE stackquestions (
    emp_id integer,
    emp_name varchar,
    work_start_date date,
    work_end_date date
);

INSERT INTO stackquestions
    VALUES (1, 'A', '2023-03-01', '2023-03-03');

INSERT INTO stackquestions
    VALUES (2, 'B', '2023-03-04', '2023-03-04');

WITH date_ct AS (
    SELECT
        sq.emp_id,
        count(a) AS days_worked_on
    FROM
        stackquestions AS sq,
        generate_series(work_start_date, work_end_date, '1 day') AS t (a)
    GROUP BY
        emp_id
)
SELECT
    stackquestions.*,
    date_ct.days_worked_on
FROM
    stackquestions
    JOIN date_ct ON stackquestions.emp_id = date_ct.emp_id;

emp_id | emp_name | work_start_date | work_end_date | days_worked_on 
--------+----------+-----------------+---------------+----------------
      1 | A        | 03/01/2023      | 03/03/2023    |              3
      2 | B        | 03/04/2023      | 03/04/2023    |              1

Adrian Klaver
  • 15,886
  • 2
  • 17
  • 28