2

I have a string column in my database and the fields are a mixture of numbers and dashes. Example 24-2548-25.

I would like to sort by the center set of numbers as integers because of padding issues, such as 24-25-25 and 24-1111-25 don't sort numerically but alphabetical.

I think I need something like what is below but the syntax isn't even close to correct.

SELECT 
    tblDrawings.* 
FROM 
    tblDrawings 
WHERE 
    (((tblDrawings.AreaNo) = "21")) 
ORDER BY 
    SET @DashInt = PATINDEX('%-%', tblDrawings.DrawingNo);
    SET @SecondDashInt = CHARINDEX('-',tblDrawings.DrawingNo,@DashInt + 1)
    IF @DashInt > 0{CAST(SUBSTRING(tblDrawings.DrawingNo , IF @SecondDashInt > 0 {@DashInt + 1 ,@SecondDashInt} ELSE {LEN(tblDrawings.DrawingNo)} AS INT))} ELSE {DrawingNo};
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Mobile_Bob
  • 23
  • 2
  • 1
    Yakshemash. Can you give us a more information - how many rows in this table - fixed/increasing, 1000's/Millions ? Also, do you only want to sort by the number between two dashes ? – Erran Morad Feb 05 '14 at 05:37
  • @Borat 100s, it will be increasing but slowly and will have a soft limit because of the use case. I would say it won't get past two thousand. The data unfortunately will not always have two dashes, nor is it forced to only be numbers. There might be times where a letter is entered at the end of the number set. I was trying to build some of that variability into my code above with the If statements. – Mobile_Bob Feb 05 '14 at 23:42
  • If possible I would like it to sort ###-$$$$-%%%% (Not the constant but most probable format). Where # is sorted first then $ and lastly %. I guess i'm asking for a lot in an nonuniform data set... My other option is to dump the entire result into my code then sort through it fixing the issues as I walk through the rows. Put it back into a temp table then select against the new table. Doesn't seem like the best option. (I'm coding in ASP using VBS, don't laugh it's an enhancement job). – Mobile_Bob Feb 06 '14 at 00:11

1 Answers1

0

Provided that the drawing number always consists of 3 parts, and assuming that you just want to sort by the middle component numerically ascending, you can (ab)use the PARSENAME function as mentioned here to expedite splitting up the components of the numbers (The dashes need to be replaced by dots to trick the function into thinking it is a fully qualified name):

SELECT 
    tblDrawings.* 
FROM 
    tblDrawings 
WHERE     
    (((tblDrawings.AreaNo) = "21"))     
ORDER BY CAST(PARSENAME(REPLACE(DrawingNo, '-', '.'), 2) AS BIGINT) ASC;

Fiddle here

(Parsename only works with up to 4 parts, and ordering is from right to left)

Edit

You are probably going to have to analyze all of the data and work out the exact ordering required before your query can be finalized. Some ideas, FWIW:

  • The parsename hack won't work if you have a variable number of components - suggest you look at pulling a component splitting UDF across to your db like this one
  • From a sanity and DRY point of view, you probably don't want to do all of this in one step - a CTE will make life easier as you pass the output of one transformation of data to the next.
  • You can try a padding technique like below to try and address mixed numeric / non-numeric ordering. To accommodate the alpha suffixes, I'm guessing however that the padding needs to be both left and right of the original value, with the right padding reserved for non-numeric suffixes padding to the right
  • You can do conditional ordering using a ORDER BY CASE WHEN ... THEN ... ELSE ... END, although note that all branches must return the same type.

Example of the padding idea 'objective':

xxx1x
xx11x
x111x
x111A
x112x

WITH cte as
(
  SELECT 
    *, 
    PARSENAME(REPLACE(DrawingNo, '-', '.'), 3) as [3FromRight],
    PARSENAME(REPLACE(DrawingNo, '-', '.'), 2) as [2FromRight], 
    PARSENAME(REPLACE(DrawingNo, '-', '.'), 1) as [1FromRight]
  FROM tblDrawings
)
SELECT * 
  FROM cte
  ORDER BY
    REPLICATE('x',12-LEN([2FromRight])) + [2FromRight] + REPLICATE('x', 1)
    ASC;

The number of replicated x's in "12-" and "1" would need to be adjusted to count the number of numeric and non-numeric chars and then adjust the padding before and afterwards. So using another query nesting / cte would probably needed to count the number of chars in the split components.

Community
  • 1
  • 1
StuartLC
  • 104,537
  • 17
  • 209
  • 285
  • I like this idea and will play with it, but unfortunately there are some without two dashes. Also there is no guarantee that it will be limited to numbers only. I was hoping to build some flexibility into it with IF statements like I was using in my code above. Not even sure if I can use IF statements in ORDER BY like I was trying to do. – Mobile_Bob Feb 05 '14 at 23:44
  • @Mobile_Bob I guess I inferred from your examples in the OP that the number of dashes was constant and the second component was numeric. I've updated with some ideas, but it probably will be quite an iterative process to get this working 100%. Good luck! – StuartLC Feb 06 '14 at 05:03