0

I have this table columns structure:

id - n1 - n2 - n3

And here it is with some dummy data:

id - n1 - n2 - n3
1 - 3 - 2 - 1
2 - 6 - 5 - 7
3 - 2 - 3 - 1
4 - 1 - 6 - 5
5 - 5 - 6 - 7
6 - 3 - 5 - 6

And the idea is to Select and count each unique distinct group of n1, n2 and n3 in sequence.

So, for example, we could get this result:

total - n1s - n2s - n3s
2 - 1 - 2 - 3
2 - 5 - 6 - 7
1 - 1 - 5 - 6
1 - 3 - 5 - 6

Can you help me set the state to achieve that??

I am trying to attempt that without multiple selects and PHP array sorting...

Thanks.

  • Your sample data and expected output are not matching. There is no row corresponding to sequence of 1,2,3 but your expected output has it – Madhur Bhaiya Oct 09 '18 at 15:00
  • While the question is rather obscure, this kind of problem can be symptomatic of poor design – Strawberry Oct 09 '18 at 15:03
  • @MadhurBhaiya There is, id 1 and id 3 both have 1, 2, 3. That's my point. I want to be able to count based on the sorted values from each numeric column... Got it? Thanks. – user4561667 Oct 09 '18 at 22:06
  • @Strawberry I don't have a column saving all the numbers in sequence. Is that what you are suggesting as poor design? That's why I want to be able to count all rows that match all numbers when they are in sequence... Can you help? Thanks. – user4561667 Oct 09 '18 at 22:10
  • @user4561667 got the point. You should really do the Strawberry's [suggestion](https://stackoverflow.com/a/52724624/2469308) to normalize the table. But still I would like to give it a try. It seems that you have more than just 3 columns of n's. How many columns are expected. Are we sure that within them (in a row), values would be unique or duplicates expected ? – Madhur Bhaiya Oct 10 '18 at 04:18
  • Okay, to be more clear my table has all lottery numbers. I want to count all unique numbers draws regardless of the sequence. I store the picked up sequence order. In a 6 number lottery I may have this result: "12, 22, 27, 34, 45, 51" so I store them in picking order n1=22, n2=45, n3=12, n4=51, n5=27 and n6=34. So I want to select all entries where "12, 22, 27, 34, 45, 51" appears and in that case, select that row even if the sequence of numbers is not the same. Makes sense? I could add a field just to include them in sequence numbers but that means I need to update all my past data. Thanks. – user4561667 Oct 10 '18 at 11:24
  • @user4561667 check the posted answer. Let me know if it works! – Madhur Bhaiya Oct 10 '18 at 12:28
  • @MadhurBhaiya Thanks, I will look into having the bespoke function. But if I want to avoid that. To create a relational table with all the foreign keys for my unique in sequence number entries do you have an idea on how I could update all previous data using SQL Command? Or should I do it via PHP or something similar? Thank you. – user4561667 Oct 10 '18 at 14:03
  • @user4561667 It surely can be done completely using SQL. Look into `Insert .. Select` syntax – Madhur Bhaiya Oct 10 '18 at 14:04

3 Answers3

1

Consider the following - a normalised dataset...

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(id INT NOT NULL
,n INT NOT NULL
,val INT NOT NULL
,PRIMARY KEY(id,n)
);

INSERT INTO my_table VALUES
(1, 1, 3),
(1, 2, 2),
(1, 3, 1),
(2, 1, 6),
(2, 2, 5),
(2, 3, 7),
(3, 1, 2),
(3, 2, 3),
(3, 3, 1),
(4, 1, 1),
(4, 2, 6),
(4, 3, 5),
(5, 1, 5),
(5, 2, 6),
(5, 3, 7),
(6, 1, 3),
(6, 2, 5),
(6, 3, 6);

Here's a quick (to write) and dirty solution. Faster / more elegant solutions are available...

SELECT vals
     , COUNT(*) total
  FROM 
     ( SELECT id
            , GROUP_CONCAT(val ORDER BY val) vals 
         FROM my_table 
        GROUP 
           BY id
     ) x 
 GROUP 
    BY vals;
+-------+-------+
| vals  | total |
+-------+-------+
| 1,2,3 |     2 |
| 1,5,6 |     1 |
| 3,5,6 |     1 |
| 5,6,7 |     2 |
+-------+-------+
Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • I don't think you understood my question... Your table structure does not match mine... I want to count rows that match the same numbers in columns when they are sorted. Thanks. – user4561667 Oct 09 '18 at 22:08
  • I don't think you understand my answer. You should amend your (non-normalised) structure to match my (normalised) one – Strawberry Oct 09 '18 at 22:10
  • Sorry, I don't think I get it. Can you point me to the right location to learn about normalised vs non-normalised structures? All I want is to collect rows and count when the same numbers appear in multiple columns, regarding of the column. – user4561667 Oct 09 '18 at 22:15
  • Are you suggesting a relational table? Using foreign keys? It would be easier just to create a sorted column and include all numbers in sequence there and then select by that sorted column... Or am I missing something? That's what I am trying to avoid, so I want to select using the columns I have but count by unique distinct when in sequence. Thanks. – user4561667 Oct 09 '18 at 22:18
  • I'm suggesting the structure I've described. If the same value, representing essentially the same kind of thing, can appear arbitrarily in different columns, then this is highly symptomatic of poor design. Similarly, any time you find yourself with enumerated columns above, say, 2), alarm bells should start ringing. – Strawberry Oct 10 '18 at 03:12
  • I am considering creating a relational table... Or add a column with the sorted in sequence numbers concatenated... – user4561667 Oct 10 '18 at 19:51
1

We just need expressions to "sort" the values in columns n1, n2 and n3. If we have that, then we can do a simple GROUP BY and COUNT.

SELECT COUNT(1) AS total
     , IF(t.n1<=t.n2,IF(t.n1<=t.n3,t.n1,t.n3),IF(t.n2<=t.n3,t.n2,t.n3)) AS n1s
     , IF(t.n1<=t.n2,IF(t.n2<=t.n3,t.n2,IF(t.n1<=t.n3,t.n3,t.n1)),IF(t.n1<=t.n3,t.n1,IF(t.n2<=t.n3,t.n3,t.n2 ))) AS n2s
     , IF(t.n1<=t.n2,IF(t.n2<=t.n3,t.n3,t.n2),IF(t.n1<=t.n3,t.n3,t.n1)) AS n3s
  FROM this_table_column_structure t
 GROUP BY n1s,n2s,n3s
 ORDER BY total DESC, n1s, n2s, n3s

will return

total   n1s   n2s   n3s
-----  ----  ----  ----
    2     1     2     3
    2     5     6     7
    1     1     5     6
    1     3     5     6
spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • Hi, can you explain the IFs? do I need to do long IFs for every numeric column I have in my table? Thanks. – user4561667 Oct 09 '18 at 22:08
  • With three columns, n1, n2 and n3, we can use conditional tests to "sort" the values. If `n1 < n2 and n1 < n3`, then n1 is the lowest. If `n2 < n1 and n2 < n3`, then n2 is the lowest. With three columns, there's a a total of eight possible combinations. The expressions in the SQL are just examples, one possible way to get the values sorted, returning the lowest value as col `n1s`, the second lowest as `n2s`, the highest as `n3s`. – spencer7593 Oct 10 '18 at 14:24
  • MySQL `IF()` function is documented here https://dev.mysql.com/doc/refman/5.7/en/control-flow-functions.html#function_if a more portable ANSI standards compliant approach would be to use `CASE` expressions in place of the MySQL `IF()` functions. – spencer7593 Oct 10 '18 at 14:27
  • This answer was based on there being *three* columns to be sorted, as shown in the question, With four, five or more columns, then obviously sorting by using expressions doing comparisons becomes unwieldy. In that case, I would consider storing additional columns in the table that identifies the ordered "sequence" of the values, to make the extract simpler. – spencer7593 Oct 11 '18 at 16:24
1

As a first approach (if time permits), you should really consider normalizing your table, as suggested in @Strawberry's answer

However, a second approach allowing any number of columns (although inefficient due to String operations and Bubble Sorting) is possible, utilizing User Defined Functions.

We basically need to create a function, which can sort the values inside a comma separated string. I found a working function, which can do the sorting. Reproducing code from here:

-- sort comma separated substrings with unoptimized bubble sort
DROP FUNCTION IF EXISTS sortString;
DELIMITER |
CREATE FUNCTION sortString(inString TEXT) RETURNS TEXT
BEGIN
  DECLARE delim CHAR(1) DEFAULT ','; -- delimiter 
  DECLARE strings INT DEFAULT 0;     -- number of substrings
  DECLARE forward INT DEFAULT 1;     -- index for traverse forward thru substrings
  DECLARE backward INT;   -- index for traverse backward thru substrings, position in calc. substrings
  DECLARE remain TEXT;               -- work area for calc. no of substrings
-- swap areas TEXT for string compare, INT for numeric compare
  DECLARE swap1 TEXT;                 -- left substring to swap
  DECLARE swap2 TEXT;                 -- right substring to swap
  SET remain = inString;
  SET backward = LOCATE(delim, remain);
  WHILE backward != 0 DO
    SET strings = strings + 1;
    SET backward = LOCATE(delim, remain);
    SET remain = SUBSTRING(remain, backward+1);
  END WHILE;
  IF strings < 2 THEN RETURN inString; END IF;
  REPEAT
    SET backward = strings;
    REPEAT
      SET swap1 = SUBSTRING_INDEX(SUBSTRING_INDEX(inString,delim,backward-1),delim,-1);
      SET swap2 = SUBSTRING_INDEX(SUBSTRING_INDEX(inString,delim,backward),delim,-1);
      IF  swap1 > swap2 THEN
        SET inString = TRIM(BOTH delim FROM CONCAT_WS(delim
        ,SUBSTRING_INDEX(inString,delim,backward-2)
        ,swap2,swap1
        ,SUBSTRING_INDEX(inString,delim,(backward-strings))));
      END IF;
      SET backward = backward - 1;
    UNTIL backward < 2 END REPEAT;
    SET forward = forward +1;
  UNTIL forward + 1 > strings
  END REPEAT;
RETURN inString;
END |
DELIMITER ;

You will need to run this code on your MySQL server, so that this function is available within a query, just like native built-in MySQL functions. Now, the querying part becomes simple. All you need to do is Concat_ws() all the number columns using comma. And, then apply sortString() function on the concatenated string. Eventually, use the "ordered" string in Group By clause, to get the desired result.

Try:

SELECT sortString(CONCAT_WS(',', n1, n2, n3)) AS n_sequence -- add more columns here
       COUNT(id) AS total 
FROM your_table 
GROUP BY n_sequence 
ORDER BY total DESC 

Now I suggest that you can use your application code to change comma separated n_sequence back to tabular column display.

Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57