3

I am sure that this must be quite a common problem so I would guess that Microsoft have already solved the problem. My Googling skills are just not up to scratch. I have a field that I want to order by, it is a varchar field, for example

  • Q
  • Num 10
  • Num 1
  • A
  • Num 9
  • Num 2
  • F

Now I would expect the result to be

  • A
  • F
  • Num 1
  • Num 2
  • Num 9
  • Num 10
  • Q

But it is not. It is as follows (Notice that Num 10 comes after Num 1 and not Num 9 as expected)

  • A
  • F
  • Num 1
  • Num 10
  • Num 2
  • Num 9
  • Q

Now I know the reason for this so you don't need to explain :) But I can't remember how to solve it or if there is a nice flag or command that I can use to get it right.

EDIT:

The examples above are just an example. The column could contain any value. Any combination of letters and digits. Is there a way to sort this humanly alphabetically instead of ASCII value alphabetically?

EDIT 2: Thanks for the answers so far. I am talking about ANY arbitary data. If it were in a fixed position or preceded by something then it would easy and I wouldn't be asking. I am asking for a general solution to this problem with ANY arbitary data. Not patterns, no rules, no nothing.

uriDium
  • 13,110
  • 20
  • 78
  • 138
  • 1
    Are the numeric values always preceded by the text "Num"? – Josh Anderson Jan 12 '10 at 20:34
  • Re Edit2: A general solution is possible in some cases, but there is no strict definition of *natural sort* that I am aware of. How do you expect it to perform? Supplying more complex use cases would help greatly. I have written a CLR function that seems to work well for **my** needs, see my edit below. – D'Arcy Rittich Jan 14 '10 at 12:35
  • @OrbMan. I was really hoping for something like SELECT * FROM x ORDER BY (NaturalIndex(col)) or something similar. I thought that maybe this problem had been solved by SQL Server and I just wasn't aware of the function. I will implement by solution in C# using links posted by Yada. Thanks anyway. – uriDium Jan 14 '10 at 13:21

4 Answers4

1

If the field always has the number at the end with possibly one word before it, and a space before it, you could use CHARINDEX/SUBSTRING to solve this.

Here is an example:

select *
from (
    select 'Q' x
    union
    select 'Num 10'
    union
    select 'Num 1'
    union
    select 'A'
    union
    select 'Num 9'
    union
    select 'Num 2'
    union
    select 'F'
) a
order by
    case
        when CHARINDEX(' ', x) <> 0 then LEFT(x, CHARINDEX(' ', x) - 1)
        else x
    end,
    cast(case
        when CHARINDEX(' ', x) <> 0 then substring(x, CHARINDEX(' ', x) + 1, LEN(x) - CHARINDEX(' ', x) )
        else ''
    end as int)

The output from this is:

A
F
Num 1
Num 2
Num 9
Num 10
Q

Edit:

Since your data is not consistent enough to use a hard-coded approach, the solution calls for more drastic measures. I have experimented with T-SQL based functions that will give a form of natural sort, but found them to be far too slow to be usable. Instead, I wrote a CLR based function and it performs very well. The function returns a scalar value that you can sort on. You'll find the code and installation instructions at over here.

Community
  • 1
  • 1
D'Arcy Rittich
  • 167,292
  • 40
  • 290
  • 283
1

This is an age old problem of Ascii Sort Order vs. Natural Sort Order

See Sorting for Humans : Natural Sort Order for further details.

Cœur
  • 37,241
  • 25
  • 195
  • 267
Yada
  • 30,349
  • 24
  • 103
  • 144
  • I will be translating the following C++ http://www.stereopsis.com/strcmp4humans.html into C# and will process the result set in C#. I was hoping for a simple way to do it in SQL Server but I don't think that there is any. – uriDium Jan 14 '10 at 13:19
1

You added

The column could contain any value. Any combination of letters and digits

So, where do you want "foo1bar" and "foo10bar" for example? Or "foo10bar11" and "foo10bar1"? Or "Foo Two" and "Foo Three"?

There is no sensible solution without sensible data. You have random data. Define "human readable".

gbn
  • 422,506
  • 82
  • 585
  • 676
1

"I am asking for a general solution to this problem with ANY arbitary data. Not patterns, no rules, no nothing."

The problem is, programming is all about finding patterns, deriving rules from the patterns and applying solutions based on those rules. So without those prerequisites your question is pretty tough.

Essentially what you have to do is tokenize your sort string into chunks of pure letters and chunks of pure digits, and apply a different sort order to each category. That is doable providing you have some kind of pattern e.g.

   AAA999AA
   A9AAAAA
   A999A

but it would require a bespoke solution for each pattern. A general solution for any arbitrary arrangement of data is a big ask.

APC
  • 144,005
  • 19
  • 170
  • 281