1

Let say i have table

id|val
1 |1
2 |1
3 |2
4 |5
5 |2
6 |2
7 |2

How to get table like this:

2 | 1 |2
3 | 2 |1
4 | 5 |1
7 | 2 |3

I.e table in which third column is number of similar values in second column.

Sure i can do that using php or perl code, but i remmember it was posible by using sql variables only.

arheops
  • 15,544
  • 1
  • 21
  • 27
  • 4
    I have no idea what logic is used to get the second table. – Gordon Linoff Mar 04 '16 at 17:36
  • @GordonLinoff, First col state the last position where the value taken from. Second col state the number which will calculated, and the third col state the number of occurrence. – Murad Hasan Mar 04 '16 at 17:40
  • Look. Go via second column, counting number of values. Let say you counting at id=2. At id=2 you have in second column 1. If you get on next step (id=3) value not 2, you put one row(1,2) where 2 is count. And now count 2, get one 2,so put (2,1). Next one is (5,1), after that you have 3 of 2, so (2,3). Is that more clear now? – arheops Mar 04 '16 at 17:41
  • @arheops, do you need a query or other solution will be okey? – Murad Hasan Mar 04 '16 at 17:47
  • I need query. Other solution(like procedure or external language) i can do myself. But i am pretty sure query possible(i did it myself on university labs). – arheops Mar 04 '16 at 17:48
  • @arheops, your question is like: [selecting-groups-of-consecutive-records-with-a-common-attribute](http://stackoverflow.com/questions/21134737/selecting-groups-of-consecutive-records-with-a-common-attribute) – Murad Hasan Mar 04 '16 at 17:55
  • Yes, thank you, seams similar. Will try understand that magic. – arheops Mar 04 '16 at 17:58

2 Answers2

2

Oh, I think I figured it out. You care about sequences of values that are adjacent. The first column is the maximum id, the second is the value, and the third is the length.

Yes, you can do this with variables:

select max(id), val, count(*)
from (select t.*,
             (@grp := if(@v = val, @grp,
                         if(@v := val, @grp + 1, @grp + 1)
                        )
             ) as grp
      from yourtable t cross join
           (select @v := -1, @grp := -1) params
      order by id
     ) t
group by grp, val
order by max(id);
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Have you tested this one? It give for me same table+ third field which always 1. – arheops Mar 04 '16 at 17:46
  • I hope you don't mind, but I edited your solution to work correctly and provided a fiddle demo. This is the best solution imo. – sgeddes Mar 04 '16 at 17:54
  • Yes, this one seams ok. Is it have complexity O(n^2) or O(n)? – arheops Mar 04 '16 at 18:06
  • Based on testing it is O(n) and take 0.072sec for 10000 records on core 2 2700(default mysql config) – arheops Mar 04 '16 at 18:30
  • @sgeddes . . . In general I don't mind edits that correct an answer. In this case, the edit was not correct. The variable assignment and reference needs to be in the same expression, because MySQL does not guarantee the order of evaluation of expressions in a `select` statement. – Gordon Linoff Mar 04 '16 at 21:34
  • I see what you're doing now but I have never set the value of a `user-defined variable` in an `if` statement (just noticed your `:`). I have used the syntax I provided for years and have never had an issue with it, but perhaps there are cases where it's incorrect and not guaranteed. Thanks for the clarification (and for fixing your original post). – sgeddes Mar 04 '16 at 23:09
  • @sgeddes . . . Here is an example: http://stackoverflow.com/questions/16715504/mysql-define-a-variable-within-select-and-use-it-within-the-same-select. It is not only an issue with aggregation; I've seen similar problems with programmatic interfaces to MySQL. – Gordon Linoff Mar 05 '16 at 00:46
  • Thank you so much for the clarification. I guess I generally have used user defined variables as row numbers, not in groupings. I see what you're referring to now. Love learning something new! – sgeddes Mar 05 '16 at 01:48
  • @sgeddes . . . I'm not thrilled with the mechanism for handling variables in this situation. I've settled on this `if()` formulation, so when I see it, I know exactly what it is doing. – Gordon Linoff Mar 05 '16 at 21:52
0

A purist might have an issue with this - but I'm a purist, and I don't ...

SELECT MAX(id) max_id
     , val -- or MAX(val) if you like
     , COUNT(*) total
  FROM 
     ( SELECT id
            , val
            , CASE WHEN @prev <> val THEN @i:=@i+1 ELSE @i:=@i END i
            , @prev := val prev 
         FROM yourtable
            , (SELECT @prev:=null, @i:=1) vars 
        ORDER 
           BY id
     ) a
 GROUP 
    BY i;

http://www.sqlfiddle.com/#!9/32d395/16

Oh, it looks quite a lot like the other answer

Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • Yes, it is same except you have no explict cross join(but still have it). – arheops Mar 04 '16 at 18:17
  • It is not a question of being a purist. You are using variables and assigning them in *different* expressions. Unfortunately, MySQL does not guarantee the order of evaluation of expressions, and under some circumstances does *not* process them in lexical order (this is not just a theoretical concern, although the documentation does have a warning about it). So, one assigns variables and then uses them in another expression at his/her own risk. – Gordon Linoff Mar 04 '16 at 21:29
  • Yes. In this case, the inconsistency cannot arise. – Strawberry Mar 04 '16 at 21:45