I want to concatenate the surrounding rows(in the following examples only the surrounding 2 rows) after ranking to a new column(group by
seems to not work), and here is the data I have:
Schema (MySQL v8.0)
CREATE TABLE log_table (
`user_id` VARCHAR(5),
`date_time` DATETIME,
`event_name` VARCHAR(10),
`trivial` int
);
INSERT INTO log_table
(`user_id`, `date_time`, `event_name`, `trivial`)
VALUES
('001', '2020-12-10 10:00:02', 'c', 3),
('001', '2020-12-10 10:00:01', 'b', 9),
('001', '2020-12-10 10:00:40', 'e', 2),
('001', '2020-12-10 10:00:20', 'd', 6),
('001', '2020-12-10 10:00:00', 'a', 1),
('002', '2020-12-10 10:00:10', 'C', 9),
('002', '2020-12-10 10:00:50', 'D', 0),
('002', '2020-12-10 10:00:02', 'A', 2),
('002', '2020-12-10 10:00:09', 'B', 4);
To illustrate what I want to do. I can do summing over numerical values using the sum
clause as follows:
Query #1
SELECT *,
SUM(trivial)
over(
PARTITION BY user_id
ORDER BY user_id, date_time ROWS BETWEEN 2 preceding AND 2 following)
AS
trivial_new
FROM log_table;
user_id | date_time | event_name | trivial | trivial_new |
---|---|---|---|---|
001 | 2020-12-10 10:00:00 | a | 1 | 13 |
001 | 2020-12-10 10:00:01 | b | 9 | 19 |
001 | 2020-12-10 10:00:02 | c | 3 | 21 |
001 | 2020-12-10 10:00:20 | d | 6 | 20 |
001 | 2020-12-10 10:00:40 | e | 2 | 11 |
002 | 2020-12-10 10:00:02 | A | 2 | 15 |
002 | 2020-12-10 10:00:09 | B | 4 | 15 |
002 | 2020-12-10 10:00:10 | C | 9 | 15 |
002 | 2020-12-10 10:00:50 | D | 0 | 13 |
For the string field event_name
, I tried this snippet:
Query #2
SELECT *,
Concat(event_name)
over(
PARTITION BY user_id
ORDER BY user_id, date_time ROWS BETWEEN 2 preceding AND 2 following)
AS
event_name_new
FROM log_table
And here is my expected results:
user_id | date_time | event_name | trivial | event_name_new |
---|---|---|---|---|
001 | 2020-12-10 10:00:00 | a | 1 | abc |
001 | 2020-12-10 10:00:01 | b | 9 | abcd |
001 | 2020-12-10 10:00:02 | c | 3 | abcde |
001 | 2020-12-10 10:00:20 | d | 6 | bcde |
001 | 2020-12-10 10:00:40 | e | 2 | cde |
002 | 2020-12-10 10:00:02 | A | 2 | ABC |
002 | 2020-12-10 10:00:09 | B | 4 | ABCD |
002 | 2020-12-10 10:00:10 | C | 9 | ABCD |
002 | 2020-12-10 10:00:50 | D | 0 | BCD |
But the Query #2 cannot get me here, and I have googled but all I can find is about group by
(refer to this and this and this).
I know I can work around the problem by using LAG
and LEAD
(for the following rows) but I need to concatenate the new columns and when I need to concatenate many rows I need to do lots of manual work like concatenate them by separators like ,
and etc.
Can I do that in one step without using LAG
and LEAD
?