0

I have a table called loop_msg with msg_id and content. I have a second table called loop_msg_status with channel and msg_id. This is used to post messages as round-robin in different channels, so I need to keep track of which msg_id has been posted last in each channel.

SELECT 
  a.msg_id, 
  b.content, 
  b.rank, 
  b.rank + 1, 
  c.rank, 
  c.content as next_content 
FROM 
  loop_msg_status as a 
  LEFT JOIN (
    SELECT 
      *, 
      RANK() OVER (
        ORDER BY 
          msg_id ASC
      ) as rank 
    FROM 
      loop_msg
  ) b ON a.msg_id = b.msg_id 
  LEFT JOIN (
    SELECT 
      *, 
      RANK() OVER (
        ORDER BY 
          msg_id ASC
      ) as rank 
    FROM 
      loop_msg
  ) c ON b.rank + 1 = c.rank

With this query I'm able to get the current msg_id and its content from every channel. I'm also getting that msg_id's rank from table loop_msg. I also get its rank+1 and get rank+1's content, if that makes sense. And it works. However, if rank is the highest one, then rank+1 doesn't exist and I get a NULL next_content. I would like in that case to SELECT the lowest rank from loop_msg which is 1 and get its content as next_content instead. Should I add a IF() and if so, where? Or is there a better way to do this?

SELECT version();
> 10.5.13-MariaDB

Full SQL example:

CREATE TABLE `loop_msg` (
  `msg_id` int(11) NOT NULL,
  `content` varchar(2000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `loop_msg` (`msg_id`, `content`) VALUES
(2, 'Content 2'),
(3, 'Content 3'),
(4, 'Content 4'),
(6, 'Content 6'),
(7, 'Content 7'),
(8, 'Content 8');

CREATE TABLE `loop_msg_status` (
  `channel` bigint(20) NOT NULL,
  `msg_id` int(11) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `loop_msg_status` (`channel`, `msg_id`) VALUES
(316757642527768577, 4),
(384071823261696010, 6),
(939746456632438804, 8);


ALTER TABLE `loop_msg`
  ADD PRIMARY KEY (`msg_id`);

ALTER TABLE `loop_msg_status`
  ADD PRIMARY KEY (`channel`);


ALTER TABLE `loop_msg`
  MODIFY `msg_id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=9;
COMMIT;

Expected result

channel            | nextContent
--------------------------------
316757642527768577 | Content 6
384071823261696010 | Content 7
939746456632438804 | Content 2
dan
  • 3,439
  • 17
  • 54
  • 82
  • 1
    CASE-WHEN-THEN clause on the column projection "next_content" should serve the requirement. Additionally , the lowest rank needs to be materialized somehwere so you can refer when the CASE statement evaluates to NULL. For usage of case statement ref: https://www.mysqltutorial.org/mysql-case-function/ – rajorshi Feb 13 '22 at 05:10
  • Please specify exactly which version of MariaDB you are using: `SELECT version();` and if you're still having trouble, add the exact SQL to the question. I've tested with various versions of MariaDB 10.3 to 10.6. – Jon Armstrong Feb 13 '22 at 18:52
  • @JonArmstrong 10.5.13-MariaDB – dan Feb 13 '22 at 19:00
  • That should be fine. The answer works well with that version of MariaDB. – Jon Armstrong Feb 13 '22 at 19:15

1 Answers1

1

Final SQL with the schema (and expected result) given in the question:

The final fiddle

WITH cte AS (
        SELECT loop_msg.msg_id
             , channel
             , COALESCE(
                           LEAD(content)        OVER w1
                         , FIRST_VALUE(content) OVER w1
                       ) AS content_next
          FROM      loop_msg
          LEFT JOIN loop_msg_status
            ON loop_msg.msg_id = loop_msg_status.msg_id
          WINDOW w1 AS (ORDER BY loop_msg.msg_id)
     )
SELECT channel, content_next
  FROM cte
 WHERE channel IS NOT NULL
;

The result:

+--------------------+--------------+
| channel            | content_next |
+--------------------+--------------+
| 316757642527768577 | Content 6    |
| 384071823261696010 | Content 7    |
| 939746456632438804 | Content 2    |
+--------------------+--------------+

To also see the current and next msg_id, here's the adjusted SQL:

The adjusted fiddle

WITH cte AS (
        SELECT loop_msg.msg_id
             , channel
             , ROW_NUMBER() OVER w1 as rankx
             , COALESCE(
                           LEAD(loop_msg.msg_id)        OVER w1
                         , FIRST_VALUE(loop_msg.msg_id) OVER w1
                       ) AS msgid_next
             , COALESCE(
                           LEAD(content)        OVER w1
                         , FIRST_VALUE(content) OVER w1
                       ) AS content_next
          FROM      loop_msg
          LEFT JOIN loop_msg_status
            ON loop_msg.msg_id = loop_msg_status.msg_id
          WINDOW w1 AS (ORDER BY loop_msg.msg_id)
     )
SELECT channel, content_next, msg_id, msgid_next
  FROM cte
 WHERE channel IS NOT NULL
;

The result:

+--------------------+--------------+--------+------------+
| channel            | content_next | msg_id | msgid_next |
+--------------------+--------------+--------+------------+
| 316757642527768577 | Content 6    |      4 |          6 |
| 384071823261696010 | Content 7    |      6 |          7 |
| 939746456632438804 | Content 2    |      8 |          2 |
+--------------------+--------------+--------+------------+

More detail:

You could try something like this. In this case, we can use ROW_NUMBER instead of RANK, since msg_id is unique and no two messages will have the same msg_id which would have the same RANK. Feel free to replace with RANK, if you wish.

The fiddle

Later, we can partition by channel to do this for each channel separately. Your question wasn't entirely clear about how you wanted to use channel.

SELECT *
     , ROW_NUMBER() OVER (ORDER BY msg_id ASC) as rankx
     , COALESCE(
                   LEAD(msg_id)        OVER (ORDER BY msg_id ASC)
                 , FIRST_VALUE(msg_id) OVER (ORDER BY msg_id ASC)
               ) AS msgid_next
     , COALESCE(
                   LEAD(content)        OVER (ORDER BY msg_id ASC)
                 , FIRST_VALUE(content) OVER (ORDER BY msg_id ASC)
               ) AS content_next
  FROM loop_msg
;

The result:

+--------+---------+-------+------------+--------------+
| msg_id | content | rankx | msgid_next | content_next |
+--------+---------+-------+------------+--------------+
|      1 | c1      |     1 |          2 | c2           |
|      2 | c2      |     2 |          3 | c3           |
|      3 | c3      |     3 |          4 | c4           |
|      4 | c4      |     4 |          5 | c5           |
|      5 | c5      |     5 |          6 | c6           |
|      6 | c6      |     6 |          7 | c7           |
|      7 | c7      |     7 |          1 | c1           |
+--------+---------+-------+------------+--------------+

The setup:

CREATE TABLE loop_msg (
     msg_id  int  auto_increment primary key
   , content varchar(20)
);

INSERT INTO loop_msg (content) VALUES
   ('c1'), ('c2'), ('c3'), ('c4'), ('c5'), ('c6'), ('c7')
;

Test cast #2, processing per channel:

CREATE TABLE loop_msg (
      msg_id  int  auto_increment primary key
    , chan    varchar(20)
    , content varchar(20)
);

INSERT INTO loop_msg (content, chan) VALUES
   ('c1', 'chan1')
 , ('c2', 'chan1')
 , ('c3', 'chan1')
 , ('c4', 'chan1')
 , ('c5', 'chan1')
 , ('c6', 'chan1')
 , ('c7', 'chan1')
 , ('d2', 'chan2')
 , ('d3', 'chan2')
 , ('d4', 'chan2')
 , ('d5', 'chan2')
 , ('d6', 'chan2')
 , ('d7', 'chan2')
 , ('d8', 'chan2')
;

SELECT *
     , ROW_NUMBER() OVER (PARTITION BY chan ORDER BY msg_id ASC) as rankx
     , COALESCE(
                   LEAD(msg_id)        OVER (PARTITION BY chan ORDER BY msg_id)
                 , FIRST_VALUE(msg_id) OVER (PARTITION BY chan ORDER BY msg_id)
               ) AS msgid_next
     , COALESCE(
                   LEAD(content)        OVER (PARTITION BY chan ORDER BY msg_id)
                 , FIRST_VALUE(content) OVER (PARTITION BY chan ORDER BY msg_id)
               ) AS content_next
  FROM loop_msg
;

The result:

+--------+-------+---------+-------+------------+--------------+
| msg_id | chan  | content | rankx | msgid_next | content_next |
+--------+-------+---------+-------+------------+--------------+
|      1 | chan1 | c1      |     1 |          2 | c2           |
|      2 | chan1 | c2      |     2 |          3 | c3           |
|      3 | chan1 | c3      |     3 |          4 | c4           |
|      4 | chan1 | c4      |     4 |          5 | c5           |
|      5 | chan1 | c5      |     5 |          6 | c6           |
|      6 | chan1 | c6      |     6 |          7 | c7           |
|      7 | chan1 | c7      |     7 |          1 | c1           |
|      8 | chan2 | d2      |     1 |          9 | d3           |
|      9 | chan2 | d3      |     2 |         10 | d4           |
|     10 | chan2 | d4      |     3 |         11 | d5           |
|     11 | chan2 | d5      |     4 |         12 | d6           |
|     12 | chan2 | d6      |     5 |         13 | d7           |
|     13 | chan2 | d7      |     6 |         14 | d8           |
|     14 | chan2 | d8      |     7 |          8 | d2           |
+--------+-------+---------+-------+------------+--------------+

Finally:

We can also define a window clause to avoid rewriting the specification each time:

SELECT *
     , ROW_NUMBER() OVER w1 as rankx
     , COALESCE(
                   LEAD(msg_id)        OVER w1
                 , FIRST_VALUE(msg_id) OVER w1
               ) AS msgid_next
     , COALESCE(
                   LEAD(content)        OVER w1
                 , FIRST_VALUE(content) OVER w1
               ) AS content_next
  FROM loop_msg
  WINDOW w1 AS (PARTITION BY chan ORDER BY msg_id)
;

Result:

+--------+-------+---------+-------+------------+--------------+
| msg_id | chan  | content | rankx | msgid_next | content_next |
+--------+-------+---------+-------+------------+--------------+
|      1 | chan1 | c1      |     1 |          2 | c2           |
|      2 | chan1 | c2      |     2 |          3 | c3           |
|      3 | chan1 | c3      |     3 |          4 | c4           |
|      4 | chan1 | c4      |     4 |          5 | c5           |
|      5 | chan1 | c5      |     5 |          6 | c6           |
|      6 | chan1 | c6      |     6 |          7 | c7           |
|      7 | chan1 | c7      |     7 |          1 | c1           |
|      8 | chan2 | d2      |     1 |          9 | d3           |
|      9 | chan2 | d3      |     2 |         10 | d4           |
|     10 | chan2 | d4      |     3 |         11 | d5           |
|     11 | chan2 | d5      |     4 |         12 | d6           |
|     12 | chan2 | d6      |     5 |         13 | d7           |
|     13 | chan2 | d7      |     6 |         14 | d8           |
|     14 | chan2 | d8      |     7 |          8 | d2           |
+--------+-------+---------+-------+------------+--------------+
Jon Armstrong
  • 4,654
  • 2
  • 12
  • 14
  • MariaDB doesn't seem to recognize `COALESCE()`. I have `mariadb Ver 15.1 Distrib 10.5.13-MariaDB`. – dan Feb 13 '22 at 18:45
  • @dan MariaDB handles COALESCE just fine. – Jon Armstrong Feb 13 '22 at 18:48
  • @dan Check the fiddle. MariaDB 10.5 and 10.6 handle this query just fine: https://dbfiddle.uk/?rdbms=mariadb_10.5&fiddle=e29e52e45b5fb3019b51b810bf1dd188 If you're still having trouble, ask a new question or add that detail to this question. – Jon Armstrong Feb 13 '22 at 18:49
  • Weird. I have the following error: `Unrecognized keyword. (near "COALESCE" at position 92)`. I will do some further testings. – dan Feb 13 '22 at 18:55
  • @dan Post the result of: `SELECT version();` ... It's likely you have a mistake prior to the `COALESCE` keyword. I'll need to see the exact SQL to help resolve that. – Jon Armstrong Feb 13 '22 at 18:56
  • Thanks for your help, I will try to work it out and will report back later. – dan Feb 13 '22 at 18:58
  • The issue just seems to be about the lack of a `JOIN` since I have 2 tables here, one with the `channel`s and the latest `msg_id` posted, and the other one with each `msg_id`s and its `content`. I'll try to figure this out. – dan Feb 13 '22 at 18:59
  • @dan I see. Since your original SQL didn't really use the loop_msg_status table for anything other than msg_id, I chose to remove that from the example, since it didn't add value to the answer. The simple SQL is clearer. This can certainly be used in the context of JOINs too. – Jon Armstrong Feb 13 '22 at 19:07
  • It can't work with 1 table though, I have 30 messages in roudrobin in 20 different channels. Messages are sent by that bot every half hour if at least 10 other messages were posted by non-bot users during that half hour so every channel is at a different place/pace, so I really need a table with all my messages that I can add/edit and another one for every channel I want the message to be posted in and where we currently are in the round robin loop. – dan Feb 13 '22 at 20:25
  • @dan The concept / logic is the same, no matter how you generate the rows, from one table or by several joins, as long as you understand the relationships and aggregation logic. – Jon Armstrong Feb 13 '22 at 21:47
  • @dan The result of the `FROM` clause, no matter how many JOINs, is a set of rows, just as though they were obtained from one table. That's the way VIEWs, derived tables and CTE terms are seen by the rest of the query logic. I can show you the same test case with more than one table, if you wish. – Jon Armstrong Feb 13 '22 at 21:48
  • @dan Please add the CREATE TABLE statements to your question and enough INSERT statements for testing, plus the expected result. – Jon Armstrong Feb 13 '22 at 21:52
  • I added the full SQL statements and the expected results. – dan Feb 14 '22 at 00:10
  • @dan I've adjusted the answer. The first SQL in the answer has the final solution using your latest detail and expected result. – Jon Armstrong Feb 14 '22 at 03:06
  • I can confirm this is working. I'll toy a little bit more with it to properly understand. I accepted your answer. – dan Feb 14 '22 at 04:43
  • What makes `content` available in w1 while `msg_id` is not? Trying to figure out by myself how to also get a `msg_id_next` (to override it after posting the new message) but when doing `COALESCE(LEAD(msg_id) OVER w1, FIRST_VALUE(msg_id) OVER w1 ) AS msg_id_next` it seems like `msg_id` is just not available. – dan Feb 14 '22 at 05:10
  • @dan You should have added all the detail you required to your expected result. I removed some of my original detail because you didn't want to see it in the result. – Jon Armstrong Feb 14 '22 at 05:16
  • I'm trying to learn at the same time, not just to be spoonfed. There are other colums in the tables as well and I had no issue selecting them and adding my other WHERE clauses, it's really just getting the next `msg_id` that's still bothering me so far. – dan Feb 14 '22 at 05:34
  • I see I was close, `loop_msg.msg_id` instead of `msg_id` in the new `COALESCE()`. – dan Feb 14 '22 at 05:36
  • I noticed it stops working if 2 `channel`s are processing the same `msg_id`. In that case the `msgid_next` is wrong for the first channel (outputting the same `msg_id` as the current one). I can add more INSERTS into my OP if needed. Not sure the `ROW_NUMBER() OVER w1 as rankx` is needed either (I tested with and without). – dan Feb 14 '22 at 05:46
  • @dan The logic depends on your model having a usable ordering of messages / rows. Without that, there's no solution. You may need to ask a new question. Your current question infers msg_id is the only source of order, with no way to break ties with a secondary ordering term. If you can provide that secondary ordering term, which guarantees a unique way to order rows, the solution is trivial. – Jon Armstrong Feb 14 '22 at 05:58
  • @dan Can you use channel as the secondary ordering term? That would allow ordering the joined rows with something like `ORDER BY msg_id, channel` – Jon Armstrong Feb 14 '22 at 06:14
  • I tried playing with your Fiddle and adding `channel` into the `ORDER BY` clause as well but the result is still wrong as we can see (`content_next` is wrong in the basic example as well there): https://dbfiddle.uk/?rdbms=mariadb_10.5&fiddle=c80e7cd691ce3b0b2a2c6b7bcebcf25f – dan Feb 14 '22 at 23:30
  • @dan To add channel logic, you need to first explain the requirement, show some data and then show the exact expected result, given that data. The only expected result you've shown is without channel logic. I have no idea how you want to use channel. You might want to ask a new question, since the original question was answered completely and correctly, based on the requirement and expected result. – Jon Armstrong Feb 14 '22 at 23:48
  • I posted a proper followup with better formatting as well https://stackoverflow.com/questions/71120204/issues-selecting-rows-with-next-rows-ids – dan Feb 15 '22 at 01:38