I have a varchar
column in a MySQL database that stores some text which can contain any character and number.
The schema uses utf8
as the character set and utf8_unicode_ci
as the collation. InnoDB
is used as the database engine as I need to use transactions.
What I would like to do is to be able to ORDER BY
that column in a natural way. For example, here's some sample data:
12234 some random text
my text 23
mytext3
123456abcd
text23
text1
text111
text33
After inserting the test data into a test table, I run a query and ORDER BY
the varchar
column ASC
:
SELECT * FROM `test` WHERE 1 ORDER BY data ASC;
The problem is that the "size" of the numbers aren't taken into account:
12234 some random text
123456abcd
my text 23
mytext3
text1
text111 <-------
text23
text33
I then found this question here which provided an (almost) solution:
SELECT * FROM `test` WHERE 1 ORDER BY data * 1 DESC;
123456abcd <------ These 2 should
12234 some random text <------ be swapped
my text 23
mytext3
text23
text1
text33
text111
Another downside of the above is that the above could not use any indexes because there is an opreation on each row.
I am also not sure what the effects would be like once we introduce non-latin characters.
Is there a performant way to get MySQL to perform natural language sorting?