-1

I have a table with a product name column, in which a product number exists. Product name is not of unique style to extract product number from that. Following are some examples of expected product number from product name:

product name and product number

Cranfield 29-7                   -->   29-7 
LCFU  23-3  ( MAYE S BUSHY UNIT) -->   23-3
CORE TEST         D              -->   D
BFU 33-8  (ROBERT H BOLING)   1  -->   1
Gwinville 14-12     3            -->   3  

So, my product name is of notunique style, with no fixed delimiters if you observe. I need a function which takes product name and returns product number. Remember, both the product name and product number of the varchar datatype. Product number can be alphabetic, numeric or alphanumeric based on product names.

If any body tells logic also, I would be grateful.

Andre
  • 26,751
  • 7
  • 36
  • 80
Velugoti
  • 59
  • 1
  • 6
  • 1
    You can extract numbers like `29-7`, `23-3`, etc. but how you can extract something like `D`? – Siyavash Hamdi May 19 '16 at 10:37
  • J. D. Robertson C-1 ---> product no should be C-1 – Velugoti May 19 '16 at 10:41
  • I hope you are doing this in order to fix your data model and have separate columns for product number and name in the future. – Thorsten Kettner May 19 '16 at 10:44
  • BFU 33-8 (ROBERT H BOLING) 1 : observing the other examples, I thought the prod number is 33-8? And is it correct, that the product number can be at the end of the string or within the string? So you have no fix format and no fix position of the product numbers? In this case I'd say it is NOT possible to automatically extract the product numbers. – Tyron78 May 19 '16 at 10:45
  • And what is your question? What string functions are available in SQL Server to help you here? Or what would be a good algorithm? – Thorsten Kettner May 19 '16 at 10:48
  • if the product number in the product name column is separated with a tab then u can try to use the regex of something like `'%[^\t]%'` and extract the number. – Srinivas V. May 19 '16 at 10:48
  • We only see examples. Is it always: "name with zero or more blanks" - blank - "number without blanks and parentheses" - optional blank - optional text in parentheses - optional further blanks? – Thorsten Kettner May 19 '16 at 10:48
  • @Rene,These are the expected results given by my QA team. – Velugoti May 19 '16 at 10:49
  • @ThorstenKettner,Actually my table already having productname,product no columns.Current logic is not parsing above examples properly.returing Empty or NUll .So Now I need extract prod no from product name only when when LEN(LTRIM(RTRIM(PRODNO)=0 – Velugoti May 19 '16 at 10:53
  • @ThorstenKettner ,I need good algorithm to extract product no ,keeping diff formats in view.I feel it should be different case statements and string functions to manipulate the data – Velugoti May 19 '16 at 10:57
  • OK, but you will need atleast some form of pattern... like "if product type / group / vendor A, then format = xx-x; if B then format = x; if C then alphanumeric"... – Tyron78 May 19 '16 at 11:03

2 Answers2

2

You cannot build the query without an algorithm.

And you can only build the algorithm when you know the string format.

It seems however, you don't know the format, but only examples. A working algorithm for the examples would be:

  1. remove space from right if any
  2. remove text in parentheses from the right if any
  3. remove space from right if any
  4. extract string from right till you hit a blank

But while this works for your examples, it is in no way guaranteed to work with any string in your database.

Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
2

This was an approach to get all parts neatly separated. It's on you to find some logic how to interpret this.

Credits to this for the approach to get rid of multiple blanks.

EDIT: Added some more replaces to allow characters like &, > and <

DECLARE @tbl TABLE(ID INT,Test VARCHAR(MAX));
INSERT INTO @tbl VALUES
 (1,'Cranfield 29-7')
,(2,'LCFU  23-3       ( MAYE S BUSHY UNIT)')
,(3,'BFU 33-8  (ROBERT H BOLING)         1')
,(4,'Gwinville 14-12     3')
,(5,'Procter&Gamble 14-12     3')
,(6,'This->works 14-12     3')
;

WITH Splitted(ID,AsXml) AS
(
    SELECT ID
          ,CAST('<x>'+ REPLACE(REPLACE(REPLACE(
           REPLACE(
              REPLACE(
                 REPLACE(
                    LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(Test,'&','&amp;'),'<','&lt;'),'>','&gt;'))), 
                 '  ',' |'),                 
              '| ',''),                        
           '|',''),'(','<paran><x>'),')','</x></paran>'),' ','</x><x>') + '</x>' AS XML)
    FROM @tbl
)
SELECT p.*
FROM
(
    SELECT ID
          ,'part_' + CAST(ROW_NUMBER() OVER(PARTITION BY ID ORDER BY (SELECT NULL)) AS VARCHAR(10)) AS colName
          ,A.B.value('.','nvarchar(max)') AS Part
    FROM Splitted
    CROSS APPLY AsXml.nodes('/x') AS A(B)
) AS tbl
PIVOT
(
    MIN(Part) FOR colName IN(part_1,part_2,part_3,part_4,part_5,part_6)
) AS p

The result

+----+----------------+--------+----------------+--------+--------+--------+
| ID | part_1         | part_2 | part_3         | part_4 | part_5 | part_6 |
+----+----------------+--------+----------------+--------+--------+--------+
| 1  | Cranfield      | 29-7   | NULL           | NULL   | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
| 2  | LCFU           | 23-3   | MAYESBUSHYUNIT | NULL   | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
| 3  | BFU            | 33-8   | ROBERTHBOLING  | 1      | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
| 4  | Gwinville      | 14-12  | 3              | NULL   | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
| 5  | Procter&Gamble | 14-12  | 3              | NULL   | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
| 6  | This->works    | 14-12  | 3              | NULL   | NULL   | NULL   |
+----+----------------+--------+----------------+--------+--------+--------+
Community
  • 1
  • 1
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • By the way from 2016 split is built-in [demo](https://data.stackexchange.com/stackoverflow/query/488118) – Lukasz Szozda May 19 '16 at 11:18
  • @lad2025 I know, but it's tagged with SQL Server 2012 – Shnugo May 19 '16 at 11:19
  • @lad2025 and by the way: This XML-split has one advantage. As I replace the paranthesis separately, the content *within* is returned *as one*. The built-in Split would surely split the internal blanks too :-) – Shnugo May 19 '16 at 11:22