4

I have a simple table and I need to identified groups of four rows (the groups aren't consecutives), but each rows of each row has a +1 in the value. For example:

----------------------
| language     | id  |
----------------------
| C            |  16 |
| C++          |  17 |
| Java         |  18 |
| Python       |  19 |
| HTML         |  65 |
| JavaScript   |  66 |
| PHP          |  67 |
| Perl         |  68 |
----------------------

I want to add a column that indicates the group or set, how is possible to get this output using MySQL?:

----------------------------
| language     | id  | set |
----------------------------
| C            |  16 |  1  |
| C++          |  17 |  1  |
| Java         |  18 |  1  |
| Python       |  19 |  1  |
| HTML         |  65 |  2  |
| JavaScript   |  66 |  2  |
| PHP          |  67 |  2  |
| Perl         |  68 |  2  |
----------------------------

Note that in this examples is only 2 sets (it could be 1 or more sets) and they didn't start in 16 (such values are not knowledged, but the restriction is that each id value of each row has this form n, n+1, n+2 and n+3).

I've been investigating about Gaps & Islands problem but didn't figure how to solve it by using their solutions. Also I search on stackoverflow but the closest question that I found was How to find gaps in sequential numbering in mysql?

Thanks

Community
  • 1
  • 1
PlainOldProgrammer
  • 2,725
  • 5
  • 22
  • 30
  • Do you need to perform this query on an ongoing basis, or is this a one-time query that you are going to use to populate `set` values? – Mike Brant Aug 04 '15 at 16:44
  • This query should be used multiple times with dynamic data (the values of the groups of 4 rows can change but such groups aren't cosnecutively, i.e. there aren't 2 groups that cover from 1 to 8 id values). – PlainOldProgrammer Aug 04 '15 at 16:47

3 Answers3

2
SELECT language,id,g
FROM (
  SELECT language,id,
    CASE WHEN id=@lastid+1 THEN @n ELSE @n:=@n+1 END AS g,
    @lastid := id As b
  FROM
    t, (SELECT @n:=0) r
  ORDER BY
    id
) s

EDIT

In case you want just 4 per group add a row number variable:

SELECT language,id,g,rn
FROM (
  SELECT language,id,
    CASE WHEN id=@lastid+1 THEN @n ELSE @n:=@n+1 END AS g,
   @rn := IF(@lastid+1 = id, @rn + 1, 1) AS rn,
    @lastid := id As dt
  FROM
    t, (SELECT @n:=0) r
  ORDER BY
    id
) s
Where rn <=4

FIDDLE

Mihai
  • 26,325
  • 7
  • 66
  • 81
  • How does this query identify groups of *4* consecutive records? – Giorgos Betsos Aug 04 '15 at 16:49
  • @GiorgosBetsos The same it identifies 3 or 2 or 10 consecutive records.It checks it the the next record has the id +1 of the previous one. – Mihai Aug 04 '15 at 16:52
  • I think the OP wants *only* groups of *4* consecutive records returned and everything else filtered out. – Giorgos Betsos Aug 04 '15 at 16:58
  • @GiorgosBetsos I think that is just the sample data,but it that is the case its easy to fix with a row number per group. – Mihai Aug 04 '15 at 16:59
1
select language, 
       @n:=if(@m+1=id, @n, @n+1) `set`, 
       (@m:=id) id 
   from t1, 
        (select @n:=0) n, 
        (select @m:=0) m

Demo on sqlfiddle

splash58
  • 26,043
  • 3
  • 22
  • 34
0

You can use the following query:

SELECT l.*, s.rn
FROM languages AS l
INNER JOIN (
  SELECT minID, @rn2:=@rn2+1 AS rn
  FROM (
    SELECT MIN(id) AS minID
    FROM (
      SELECT id,
             id - IF (true, @rn1:=@rn1+1, 0) AS grp
      FROM languages
      CROSS JOIN (SELECT @rn1:=0) AS var1 
      ORDER BY id) t    
    GROUP BY grp
    HAVING COUNT(grp) = 4 ) u
  CROSS JOIN (SELECT @rn2:=0) AS var2  
) s ON l.id BETWEEN minID AND minID + 3

The above query identifies islands of exactly 4 consecutive records and returns there records only. It is easily modifiable to account for a different number of consecutive records.

Please also note the usage of IF conditional: it guarantees that @rn1 is first initialized and then used in order to calculate grp field.

Demo here

Giorgos Betsos
  • 71,379
  • 9
  • 63
  • 98