5
+-------------------------------------------------+-----------------+---------------------+
| landing_page                                    | all_impressions | dates               |
+-------------------------------------------------+-----------------+---------------------+
| https://www.example.co.uk/url-1                 |           53977 | 2018-08-19 13:59:40 |
| https://www.example.co.uk/url-1                 |             610 | 2018-09-19 13:59:40 |
| https://www.example.co.uk/url-1                 |            555  | 2018-10-19 13:59:40 |
| https://www.example.co.uk/url-1                 |            23   | 2018-11-19 13:59:40 |
| https://www.example.co.uk/                      |            1000 | 2018-06-19 13:59:40 |
| https://www.example.co.uk/                      |               2 | 2018-07-19 13:59:40 |
| https://www.example.co.uk/                      |               4 | 2018-08-19 13:59:40 |
| https://www.example.co.uk/                      |            1563 | 2018-09-19 13:59:40 |
| https://www.example.co.uk/                      |               1 | 2018-10-19 13:59:40 |
| https://www.example.co.uk/                      |            9812 | 2018-11-19 13:59:40 |
+-------------------------------------------------+-----------------+---------------------+

With the above database table, I only want to select the landing_page if the impression count is the max for the current date - For example, from this, the select would return https://www.example.co.uk/ only as the current month it's all_impressions value is it's highest for November (https://www.example.co.uk/url-1  would not be selected as it's highest value was in August)

How might I do this with SQL?

index info:

mysql> show indexes from landing_pages_client_v3;
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                   | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| landing_pages_client_v3 |          0 | PRIMARY      |            1 | id          | A         |    24279939 |     NULL | NULL   |      | BTREE      |         |               |
| landing_pages_client_v3 |          1 | profile_id   |            1 | profile_id  | A         |          17 |     NULL | NULL   | YES  | BTREE      |         |               |
| landing_pages_client_v3 |          1 | profile_id   |            2 | dates       | A         |          17 |     NULL | NULL   |      | BTREE      |         |               |
| landing_pages_client_v3 |          1 | profile_id_2 |            1 | profile_id  | A         |          17 |     NULL | NULL   | YES  | BTREE      |         |               |
| landing_pages_client_v3 |          1 | profile_id_2 |            2 | lp_id       | A         |     6069984 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Adders
  • 665
  • 8
  • 29

3 Answers3

3

In a Derived Table, get the maximum value of all_impressions for every landing_page. Join back to the main table to get the row corresponding to maximum all_impressions value.

We will eventually consider that row only if it belongs to Current Month. For sargability, we will not use functions on the dates column. Instead, we will determine the first day of the current month and next month. We will consider those dates which fall within this range. You can check details of the datetime functions here: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html

For performance, you may also need to add the following composite index: (landing_page, all_impressions, dates). (I am not sure about which order these columns should be in. Maybe some benchmarking/trial is needed.

SELECT
  t.* 
FROM 
  your_table AS t 
JOIN 
(
  SELECT
    landing_page, 
    MAX(all_impressions) AS max_all_impressions
  FROM your_table 
  GROUP BY landing_page
) AS dt 
  ON dt.landing_page = t.landing_page AND 
     dt.max_all_impressions = t.all_impressions 
WHERE 
 t.dates >= ((LAST_DAY(CURDATE()) + INTERVAL 1 DAY) - INTERVAL 1 MONTH) AND
 t.dates < (LAST_DAY(CURDATE()) + INTERVAL 1 DAY) 
Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
0

You can try like this way to select the landing_page url and maximum value of the all_impressions column. To do that you've to use WHERE clause to check that your dates column value is the same month and year as the CURRENT_DATE number. SEE Date and Time Functions

SELECT landing_page,MAX(all_impressions) 
 FROM your_table_name_goes_here
WHERE MONTH(dates) = MONTH(CURRENT_DATE())
 AND YEAR(dates) = YEAR(CURRENT_DATE())

OR

SELECT landing_page
  FROM your_table_name_goes_here
WHERE MONTH(dates) = MONTH(CURRENT_DATE())
  AND YEAR(dates) = YEAR(CURRENT_DATE())
ORDER BY all_impressions DESC LIMIT 1
A l w a y s S u n n y
  • 36,497
  • 8
  • 60
  • 103
0

In mysql. you can do like this.

SELECT landing_page,MAX(all_impressions) AS max_count
  FROM your_table_name_goes_here
    WHERE MONTH(dates) = MONTH(NOW()) AND YEAR(dates) = YEAR(NOW())
     GROUP BY landing_page ORDER BY max_count DESC LIMIT 1
  • thanks - this doesn't work as it returns the current month even if it's not the highest – Adders Nov 20 '18 at 21:46
  • set your required time period in where clause "MONTH(dates) = MONTH(NOW()) AND YEAR(dates) = YEAR(NOW())". here I use NOW() which set current Month. In this query first we get all the records of given time period, then group by landing_page on MAX aggregate AS max_count. then order by descending on max_count. then apply limit 1 which give only first record which is max – Syed Ausaf Hussain Nov 21 '18 at 11:03