1

Is there a way to make similar values equivalent in an order by?

Say the data is:

name  | number
John. | 9
John  | 1
John. | 2
Smith | 4
John  | 3

I'd want to order by name and then number, so that the output looks like this, but order by name, number will put all John entries ahead of John. entries.

name  | number
John  | 1
John. | 2
John  | 3
John. | 9
Smith | 4
st2 tas
  • 91
  • 1
  • 12

3 Answers3

0

You need some more advanced treatment on the name field then. This topic will help you strip non-alphabetic characters from your string before ordering: How to strip all non-alphabetic characters from string in SQL Server?

But the fact that you need such a complex function makes me question the very building process of your Database: if "John" and "John." are the same person, they should have the same name. So if the "." is important, that means you need another field to store the information it represents.

0

There's Fuzzy String Matching and beyond that, Pattern Matching.

Mikael Gueck
  • 5,511
  • 1
  • 27
  • 25
0

Use a regex replace function to strip out all special characters in your data, replacing with a space. Then wrap that in a TRIM function to remove the spaces

  TRIM(CASE 
   WHEN name LIKE '%.%'
   OR name LIKE '%_%'
   OR name ~ '%\d%' --This is for a number
    THEN
      REGEXP_REPLACE(name, '(\_|\.|\d)', ' ' ) END) AS name_processed

The bit in brackets means replace an underscore or (|) a period or a digit with whatever is after the comma, which here is a space

Now you can order by name_processed and number as well

 ORDER BY name_processed, number DESC

But you can always keep the original name in a SELECT afterwards if you wrote a subquery first through WITH. Let me know if you want to do this. Basically the syntx would be:

WITH processed_names AS (
SELECT
 name,
 TRIM(CASE 
   WHEN name LIKE '%.%'
   OR name LIKE '%_%'
   OR name ~ '%\d%' --This is for a number
   THEN
      REGEXP_REPLACE(name, '(\_|\.|\d)', ' ' ) END) AS name_processed,
number
 FROM names
  ORDER BY 2,3 DESC)

SELECT
name,
number
 FROM processed_names;