0

I have query like this

SELECT cus.id, cus.crawler_url_id, cus.created_at, cus.status
FROM crawler_url_stats cus
INNER JOIN (
    SELECT id, crawler_url_id, max( created_at ) latest
    FROM crawler_url_stats
    GROUP BY crawler_url_id
) cus2 ON ( cus.crawler_url_id = cus2.crawler_url_id
AND cus.created_at = cus2.latest )
ORDER BY `cus`.`crawler_url_id` ASC 

It is a little modified version of query proposed in one of other topics, I just changed the GROUP BY in inner query to crawler_url_id(it is not id) and changed matching of join to crawler_url_id too.

Everything seems to work good, but there is a problem when the row created_at is empty for one of the crawler_url_id, for example I have 4 rows with crawler_url_id equal to 5 and if just one of them has created_at equal to null then the results won't contain the one with this 5 crawler_url_id.

Seems like strange bahaviour for me, first I tried the code from this post https://stackoverflow.com/a/7745635/1951693, but it returned wrong results(multiple results per crawler_url_id) so I modified it and now it just struggles when there is null there.

Community
  • 1
  • 1
Maksym
  • 3,276
  • 3
  • 22
  • 27

2 Answers2

1

Try to exclude rows with NULL from the subquery:

SELECT cus.id, cus.crawler_url_id, cus.created_at, cus.status
FROM crawler_url_stats cus
INNER JOIN (
    SELECT id, crawler_url_id, max( created_at ) latest
    FROM crawler_url_stats
    WHERE created_at IS NOT NULL
    GROUP BY crawler_url_id
) cus2 ON ( cus.crawler_url_id = cus2.crawler_url_id
AND cus.created_at = cus2.latest )
ORDER BY `cus`.`crawler_url_id` ASC 
Bulat
  • 6,869
  • 1
  • 29
  • 52
0

Setting max(ifnull(created_at, 0)) in both subquery and instead of cus.created_at works great. Solved.

Maksym
  • 3,276
  • 3
  • 22
  • 27