0

I have two tables named author and commit_metrics. Both of them have an id field. Author has author_name and author_email. Commit_metrics has author_id and author_date.

I am trying to write a query that will get the number of commits that each author had in a given week, even if that number is 0. Here's what I have so far:

SELECT a.id, a.author_name, a.author_email, c.week_num, COUNT(c.id)
FROM author AS a
     CROSS JOIN generate_series(1, 610) AS s(n)
     LEFT JOIN  (SELECT c.id,
                        c.author_id,
                        c.author_date,
                        WEEK_NUMBER(c.author_date) AS week_num
                 FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
WHERE c.week_num IS NOT NULL
GROUP BY a.id, a.author_name, a.author_email, c.week_num
ORDER BY c.week_num DESC, a.author_name;

WEEK_NUMBER is a function I wrote for this query:

CREATE OR REPLACE FUNCTION WEEK_NUMBER(date TIMESTAMP) RETURNS INTEGER AS
$$
SELECT TRUNC(DATE_PART('day', date - '2008-01-01') / 7)::INTEGER;
$$ LANGUAGE SQL;

Currently, the query works like a charm with one major caveat. It doesn't properly calculate 0 when the author made no commits in a given week. I'm not sure why it doesn't. When I do the query with just the FROM and CROSS JOIN, it properly prints the many thousand combined authors/weeks. However, when I add the LEFT JOIN, it loses any week where the author did not make a commit.

Any help would be greatly appreciated. I'm open to doing away with the generate_series call if it's unnecessary.

Also, I found this post, but I don't think it's helpful for my case.

Michael Ziluck
  • 599
  • 6
  • 19
  • 2
    BTW you don't need to create a function to calculate week number. PostgreSQL already have a function for that. https://stackoverflow.com/questions/34050103/extract-week-number-from-date-postgres – Juan Carlos Oropeza Sep 12 '19 at 17:48
  • Possible duplicate of [Left Outer Join doesn't return all rows from my left table?](https://stackoverflow.com/questions/4707673/left-outer-join-doesnt-return-all-rows-from-my-left-table) – philipxy Sep 14 '19 at 04:03
  • Learn what LEFT JOIN ON returns: INNER JOIN ON rows UNION ALL unmatched left table rows extended by NULLs. Always know what INNER JOIN you want as part of an OUTER JOIN. A WHERE or INNER JOIN ON that requires a right [sic] table column to be not NULL after an OUTER JOIN ON removes any rows extended by NULLs, ie leaves only INNER JOIN ON rows, ie "turns OUTER JOIN into INNER JOIN". You have that. – philipxy Sep 14 '19 at 04:04
  • @philipxy Despite using SQL for many years, I've never needed a `CROSS JOIN` before. Since it was something new I was using, I had assumed that my bug was something higher-level than me just being a dummy and filtering using the `LEFT JOIN` rather than the `CROSS JOIN`. Simply removing the `WHERE` clause actually doesn't fix my problem as I _also_ needed to change the `SELECT` to use `s.n` rather than `c.week_num`. The link (and your comment) would have helped me get there, but this is not _completely_ a duplicate. Would you agree/disagree? If you agree, can you unflag my question? Cheers :) – Michael Ziluck Sep 16 '19 at 21:25
  • I gave a dupe link for 1 obvious problem. (Moving tests from `where` to `on` is *not* a magic fix, that's just poor answers guessing & not explaining. And we usually *don't know* the fix because of no clear intended spec for the subexpression ending with `where`--eg here.) But this post should still be closed (have answers blocked until clear) for lack of a [mre] that pins down a first point of wrongness that we could say is "the" question. The dupe does apply & is even the deepest/1st error & I can't change (only drop) a close vote & *some* close vote is warranted until you give a MRE. – philipxy Sep 17 '19 at 02:28
  • Don't worry, my close vote will soon fade away unless 4 other people meanwhile think & vote that this is/involves a dupe or lacks MRE or is too broad (has what we could call multiple problems) or has another appropriate problem. People *should not answer* & should downvote a closeworthy question but they answer anyway. In this case your overall spec is clear enough & simple enough that they can just write correct code for you. How "useful" (or "researched") is a question that is just wrong code with no 1st point of error or even test framework? (See [ask] & the vote arrow mouseover texts.) – philipxy Sep 17 '19 at 02:46

3 Answers3

2

Although you are using a left join, "WHERE c.week_num IS NOT NULL" filters out all of the cases where there is no post. Try this:

SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id) as post_count
FROM author AS a
     CROSS JOIN generate_series(1, 610) AS s(n)
     LEFT JOIN  (SELECT c.id,
                        c.author_id,
                        c.author_date,
                        WEEK_NUMBER(c.author_date) AS week_num
                 FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, s.n
ORDER BY s.n DESC, a.author_name;
Charles Fox
  • 261
  • 2
  • 10
0

Your WHERE clause is excluding the records on commit_metrics that are null, which is the case when the author has no commits during the week selected. You should just remove this from the WHERE clause to get your desired output.

If you need the WHERE clause to eliminate some of the CROSS JOIN records based on your data, you will need that CROSS JOIN and WHERE to be in a sub-select that you LEFT JOIN to, or create some more complicated logic in the current WHERE clause.

0

Remove the filtering condition. Also a subquery is not needed and you want to select s.n instead of c.week_num:

SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id)
FROM author a CROSS JOIN
     generate_series(1, 610) AS s(n) LEFT JOIN
     commit_metrics c
     ON s.n = WEEK_NUMBER(c.author_date) AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, c.week_num
ORDER BY c.week_num DESC, a.author_name;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786