2

I searched and read a lot of answers on here, but can't find one that will answer my problem, (or help me to find the answer on my own).

We have a table which contains a varchar display field, who's data is entered by the customer. When we display the results, our customer wants the results to be ordered "correctly".

A sample of what the data could like is as follows:

"AAA 2 1 AAA"
"AAA 10 1 AAA"
"AAA 10 2 BAA"
"AAA 101 1 AAA"
"BAA 101 2 BBB"
"BAA 101 10 BBB"
"BAA 2 2 AAA"

Sorting by this column ASC returns:

1: "AAA 10 1 AAA"
2: "AAA 10 2 BAA"
3: "AAA 101 1 AAA"
4: "AAA 2 1 AAA"
5: "BAA 101 10 BBB"
6: "BAA 101 2 BBB"
7: "BAA 2 2 AAA"

The customer would like row 4 to actually be the first row (as 2 comes before 10), and similarly row 7 to be between rows 4 and 5, as shown below:

1: "AAA 2 1 AAA"
2: "AAA 10 1 AAA"
3: "AAA 10 2 BAA"
4: "AAA 101 1 AAA"
5: "BAA 2 2 AAA"
6: "BAA 101 10 BBB"
7: "BAA 101 2 BBB"

Now, the real TRICKY bit is, there is no hard and fast rule to what the data will look like in this column; it is entirely down to the customer as to what they put in here (the data shown above is just arbitrary to demonstrate the problem).

Any Help?

EDIT: learning that this is referred to as "natural sorting" has improved my search results massively I'm going to give the accepted answer to this question a bash and will update accordingly: Natural (human alpha-numeric) sort in Microsoft SQL 2005

Community
  • 1
  • 1
Sk93
  • 3,676
  • 3
  • 37
  • 67
  • 1
    There appears to be some consistency in what is typed in. This suggests that specific attribute values are being entered, correct? If so, they should really be split in to separate fields for the suer to write to, rather than a single text string. – gvee Feb 03 '15 at 13:28
  • But to solve the current conundrum we need some rules as to how the values should be interpreted e.g. "Will always start with 3 characters, then a space, then the next part, then a space". Armed with this information you can hit the data with a hammer to make it fit the requirement. – gvee Feb 03 '15 at 13:29
  • Afraid not - we cannot rely on any consistancy, which is why this has got my totally stumped! – Sk93 Feb 03 '15 at 13:29
  • All we have, for certain, is it is a varchar(100) field of data. It COULD contain numbers, those numbers COULD be seperated from the alpha characters... or not. The problems we get into when we let customers enter what THEY want eh? :/ – Sk93 Feb 03 '15 at 13:31
  • 1
    If the customer wants the data sorted a certain way, then the customer needs to provide the rules for how the data should be sorted. The rest is just string parsing. SQL doesn't come with magic pixie dust, I'm afraid. – Tab Alleman Feb 03 '15 at 13:37
  • They have come up with a way the data should be sorted.. "naturally sorted" I believe is the phrase. – Sk93 Feb 03 '15 at 13:45

2 Answers2

1

First create this function

Create FUNCTION dbo.SplitAndJoin
(
  @delimited nvarchar(max),
  @delimiter nvarchar(100)
) RETURNS Nvarchar(Max) 
AS
BEGIN

declare @res nvarchar(max)

declare @t TABLE
(
-- Id column can be commented out, not required for sql splitting string
  id int identity(1,1), -- I use this column for numbering splitted parts
  val nvarchar(max)
)

  declare @xml xml
  set @xml = N'<root><r>' + replace(@delimited,@delimiter,'</r><r>') + '</r></root>'

  insert into @t(val)
  select
    r.value('.','varchar(max)') as item
  from @xml.nodes('//root/r') as records(r)

  SELECT @res = STUFF((SELECT ' ' + case when isnumeric(val) = 1 then RIGHT('00000000'+CAST(val AS VARCHAR(8)),8) else val end
              FROM @t
              FOR XML PATH('')), 1, 1, '') 

  RETURN @Res
END
GO

This function gets an space delimited string and split it to words then join them together again by space but if the word is number it adds 8 leading zeros

then you use this query

Select * from Test
order by dbo.SplitAndJoin(col1,' ')

Live result on SQL Fiddle

enter image description here

Reza
  • 18,865
  • 13
  • 88
  • 163
0
  • Without consistency, you only have brute force
  • Without rules, your brute force is limited

I've made some assumptions with this code: if it starts with 3 alpha characters, then a space, then a number (up to 3 digits), let's treat it differently.

There's nothing special about this - it is just string manipulation being brute forced in to giving you "something". Hopefully it illustrates how painful this is without having consistency and rules!

DECLARE @t table (
   a varchar(50)
);

INSERT INTO @t (a)
  VALUES ('AAA 2 1 AAA')
       , ('AAA 10 1 AAA')
       , ('AAA 10 2 BAA')
       , ('AAA 101 1 AAA')
       , ('BAA 101 2 BBB')
       , ('BAA 101 10 BBB')
       , ('BAA 2 2 AAA')
       , ('Completely different')
;

; WITH step1 AS (
  SELECT a
       , CASE WHEN a LIKE '[A-Z][A-Z][A-Z] [0-9]%' THEN 1 ELSE 0 END As fits_pattern
       , CharIndex(' ', a) As first_space
  FROM   @t
)
, step2 AS (
  SELECT *
       , CharIndex(' ', a, first_space + 1) As second_space
       , CASE WHEN fits_pattern = 1 THEN Left(a, 3) ELSE 'ZZZ' END As first_part
       , CASE WHEN fits_pattern = 1 THEN SubString(a, first_space + 1, 1000) ELSE 'ZZZ' END As rest_of_it
  FROM   step1
)
, step3 AS (
  SELECT *
       , CASE WHEN fits_pattern = 1 THEN SubString(rest_of_it, 1, second_space - first_space - 1) ELSE 'ZZZ' END As second_part
  FROM   step2
)
SELECT *
     , Right('000' + second_part, 3) As second_part_formatted
FROM   step3
ORDER
    BY first_part
     , second_part_formatted
     , a
;

Relevant, sorted results:

a                    
---------------------
AAA 2 1 AAA          
AAA 10 1 AAA         
AAA 10 2 BAA         
AAA 101 1 AAA        
BAA 2 2 AAA          
BAA 101 10 BBB       
BAA 101 2 BBB        
Completely different 

This code can be vastly improved/shortened. I've just left it verbose in order to give you some clarity over the steps taken.

gvee
  • 16,732
  • 35
  • 50
  • As mentioned in the comment.. there is literally NO pattern. The data could be 1: "I want this to be number 2", 2: "I want this to be number 10", 3: "and this should be number 1 in the list"... the order should then be 3,1,2. it's exceptionally easy for the human mind to work this order out, and it seems naturally correct.. but there is no pattern, else I wouldn't have been stuck in the first place! :/ – Sk93 Feb 03 '15 at 13:48
  • @Sk93 is the pattern therefore: "find the first number that appears in the string, and sort by this"? – gvee Feb 03 '15 at 13:50
  • No. Again - there is NO pattern. they could easily add another row of "actually, instead of line 3 as previously suggested, I want this to be number 1".. and then this would come before the other three, as it's before them all both alphabetically and numerically. – Sk93 Feb 03 '15 at 13:52
  • @Sk93 then I'm afraid that you're out of luck. Computers are not smart. Neither are some users ;) – gvee Feb 03 '15 at 13:54
  • and if they then added "actually, instead of line 5 as previously suggested, I want this to be number 1", it wouldn't actually BE number one, as the previous entry above should come first, as 5 is higher than 3... – Sk93 Feb 03 '15 at 13:54
  • but you can do it in C# it appears.. http://stackoverflow.com/questions/248603/natural-sort-order-in-c-sharp – Sk93 Feb 03 '15 at 13:55
  • @Sk93 then your solution lies within your presentation layer (e.g. C#). Pass the data off to that to perform the sorting. – gvee Feb 03 '15 at 13:56
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/70141/discussion-between-gvee-and-sk93). – gvee Feb 03 '15 at 13:58