1

Is there a way to GROUP BY a part of a string....

I wanted to create a SQLFIDDLE but they seem to have to serverproblems, so I have to make it visible here....

This would be the data...

CREATE TABLE tblArticle
    (
     id int auto_increment primary key, 
     art_id varchar(20), 
     art_name varchar(30),
     art_color varchar(30),
     art_type varchar(5)
    );

INSERT INTO tblArticle
(art_id, art_name, art_color, art_type)
VALUES
('12345-1','Textile', 'Black','MAT'),
('12345-2','Textile', 'Red','MAT'),
('12345-3','Textile', 'Green','MAT'),
('12345-4','Textile', 'Blue','MAT'),
('54321-1','Textile', 'Black','MAT'),
('54321-2','Textile', 'Red','MAT'),
('54321-3','Textile', 'Green','MAT'),
('54321-4','Textile', 'Blue','MAT');

So I get some like:

| id        | art_id   | art_name | art_color | art_type |
----------------------------------------------------------
| 1         | 12345-1  | Textile  | Black     | MAT      |
| 2         | 12345-2  | Textile  | Red       | MAT      |
| 3         | 12345-3  | Textile  | Green     | MAT      |
| 4         | 12345-4  | Textile  | Blue      | MAT      |
| 5         | 54321-1  | Textile  | Black     | MAT      |
| 6         | 54321-2  | Textile  | Red       | MAT      |
| 7         | 54321-3  | Textile  | Green     | MAT      |
| 8         | 54321-4  | Textile  | Blue      | MAT      |
| 9         | 9876543  | Textile  | White     | MAT      |
----------------------------------------------------------

My select looks like

Select art_id, art_name FROM tblArticle WHERE art_type = 'MAT' GROUP BY art_name

What I need is the art_id (doesn't matters if its with -1 or -2 and so on) and the art_name to do further querys.

As you can see I have 2 different art_id groups... and I want to group by them.

So I get two groups... 12345 and 54321. But I don't even know how to start ^^

Expacted result:

12345,   Textile
54321,   Textile
9876543, Textile

I tried to add art_id to my group by but the has the same effect like don't use group by ^^

What could I do to achieve this ?

SOLVED like:

SELECT DISTINCT @IF( @SCAN( art_id, '-' ) +1, 
                     @LEFT( art_id, @SCAN( art_id, '-')),
                     art_id) AS art_id, art_name
FROM                 tblArticle
WHERE                art_type LIKE '%MAT%';

In this case the DISTINCT has the same effect like a GROUP BY.

+1 is used to get 0 if scan could not find anything. Actaully it returns -1 if there was no found. But IF needs 0 (false) or 1+ (true). And there will never be a - in fist place in my case.

Could not use GROUP BY because it only accepts integer or column.

Dwza
  • 6,494
  • 6
  • 41
  • 73
  • 1
    Have you tried grouping by Left(art_id,5)? Or will the art_id sometimes be more than 5 characters before the "-" ? – Christian Barron Mar 06 '15 at 11:00
  • @ChristianBarron this sounds nice in first place but the left part is not 5 signs long all the time.. sometimes its 8 or 10 or 6 or what ever :) actually sometimes it has a `-` and sometimes noch... so group by with no `-` is fine... but if the art_id has a `-` than it has to "cut it off" – Dwza Mar 06 '15 at 11:02
  • So you want the last value of each group ? – Abhik Chakraborty Mar 06 '15 at 11:03
  • 1
    Any query you'll use it will run very slow because it cannot use indexes and it has to do string operations. You better split the column `art_id` into two pieces or add another column that contains only the part of `art_id` you need for this grouping and put an index on it. – axiac Mar 06 '15 at 11:22

4 Answers4

2

You could try this, It will work out where the dash is and group by properly

Select case when Instr(art_id, '-') = 0 
              then art_id 
              else Left(art_id, Instr(art_id, '-') - 1) end, art_name 
FROM tblArticle 
WHERE art_type = 'MAT' 
GROUP BY case when Instr(art_id, '-') = 0 
              then art_id 
              else Left(art_id, Instr(art_id, '-') - 1) end
,art_name
Christian Barron
  • 2,695
  • 1
  • 14
  • 22
1

In order to group, you must specify which bit of the string to group by. In this case it's the first 5 characters so you would use LEFT(art_id,5). This must be matched in the SELECT statement so you will need to modify it to read the same. I have aliased the column to be art_id otherwise it will be unknown:

SELECT       LEFT(art_id,5) AS art_id, art_name 
FROM         tblArticle 
WHERE        art_type = 'MAT' 
GROUP BY     LEFT(art_id,5), art_name

The only thing that could be a problem is if you begin to have IDs bigger than 5 characters. In this case you will need to use @FIND in order to look for the dash and take the left of that. This will crash if there's no dash, as the @FIND function returns -1 if no match is found, so we have to use an @IF statement to compensate for this.

In which case I would write:

SELECT       @IF(art_id LIKE '%-%'
                 ,@LEFT(art_id, @FIND('-', art_id, 0) - 1)
                 ,art_id
                 ) AS art_id, art_name 
FROM         tblArticle 
WHERE        art_type = 'MAT' 
GROUP BY     @IF(art_id LIKE '%-%'
                 ,@LEFT(art_id, @FIND('-', art_id, 0) - 1)
                 ,art_id
                 ), art_name

The @ symbols are necessary (at least I think they are, try without if it doesn't work), I've not used SQLBase before so I'm using the following official documentation as a guide:

GUPTA SQLBase - SQL Language Reference

twoleggedhorse
  • 4,938
  • 4
  • 23
  • 38
  • this won't work... please see my comments and also the updated "Exprected result" since the art_id is variable in length and sometimes has no `-` – Dwza Mar 06 '15 at 11:07
  • Thanks a lot for this informations but I gues my SQL is to old for this :( 1up for your job – Dwza Mar 06 '15 at 16:47
  • @Dwza What version of MySQL do you have? You can run the following command within mySQL: SHOW VARIABLES LIKE 'version'; – twoleggedhorse Mar 06 '15 at 17:13
  • @Dwza What RDBMS are you using? I'll be able to get this working if I know what you are using and which version.... mysql, oracle, sql server, db2 etc – twoleggedhorse Mar 07 '15 at 07:22
  • Actually it's kind of difficult to say.. I'm working with an old programming language what is called Centura and there is also a build in sql. Sample statement for substring would look like: `select @substring(myColumn,0,5) from myTable;` used Programm to query the db is sqltalk. I have to write the company an have to ask them for a correct answere to this question. What I know, it's not mysql :) I'll Holla back as soon as I know... – Dwza Mar 07 '15 at 08:04
  • @Dwza Try the code now, the function INSTR is actually FIND in SQLBase. Think you need to use the @ symbol – twoleggedhorse Mar 09 '15 at 00:30
  • I will try tomorrow. I also tried to use case when but mostly got an error like `string was not properly ended` or something like that. Like I sayed, I will try tomorrow and hope I can solve this problem. I'm very thankful for this help... – Dwza Mar 09 '15 at 00:44
  • @Dwza I also looked up the CASE statement and I think you should try using IF instead, answer updated. – twoleggedhorse Mar 09 '15 at 00:48
  • 1
    this actually works very good... only the group by won't work... i get the error that `@IF is an invalid column name` .... And yes the @ is neede for some functions. only sum, avg and stuff like this wokrs without... i found [this here](http://support.guptatechnologies.com/Docs/SQLBase/sqlbase901sqllang.pdf) where you can see it... anyways... there is still the group by problem... :) – Dwza Mar 09 '15 at 10:17
  • in this doc I found this `You cannot perform an operation with the CURRENT OF clause on a result set that you formed with a GROUP BY clause. HAVING search condition The HAVING clause allows a search condition for a group of rows resulting from a GROUP BY or grouping columns. If a grouping column is an expression that is not an aggregate function (such as SAL*10), it cannot be used in the HAVING clause.` – Dwza Mar 09 '15 at 10:21
0

If you do not care about the other column meaning doing group by will select random values from other column in this case the art_name and it could be a random one from one of art_id you can use as

select 
substring_index(art_id,'-',1) as atr_id_val , 
art_name 
from tblArticle 
where art_type = 'MAT'
group by atr_id_val ;

If you need to select all the columns and need to make sure all the columns are from last entry of number-{1 or 2} you can use as

select 
t1.id,
substring_index(t1.art_id,'-',1) as art_id, 
t1.art_name , 
t1.art_color , 
t1.art_type from tblArticle t1 
left join tblArticle t2 on substring_index(t2.art_id,'-',1) = substring_index(t1.art_id,'-',1) and t1.id < t2.id and t1.art_type = 'MAT'  where t2.id is null ;
Abhik Chakraborty
  • 44,654
  • 6
  • 52
  • 63
  • what do you mean with "do not care about the other column values" ? and as you can see in my sample, not all art_id's got an minus in it – Dwza Mar 06 '15 at 11:13
  • I meant for `12345-1` there are 4 rows and while getting data using group by it will set `12345-1` as `12345` but the `art_name ` could be from any of the 4 rows. I have updated the answer check it. – Abhik Chakraborty Mar 06 '15 at 11:19
0

The art_id trimming needs to be included in select also I believe.

Select left( art_id, iif(CHARINDEX('-',art_id) > 0,CHARINDEX('-',art_id)-1, len(art_id) )) art_id, art_name 
    FROM tblArticle 
    WHERE art_type = 'MAT' 
    GROUP BY left ( art_id, iif(CHARINDEX('-',art_id) > 0,CHARINDEX('-',art_id)-1, len(art_id) ))
    ,art_name
Jose Tuttu
  • 418
  • 5
  • 15