There has been a similar request logged on ST's issue tracker: https://github.com/SublimeTextIssues/Core/issues/1324
One of the ST developers replied:
Sorting in Python 3 uses Unicode code points as the basis for sorting. Sublime Text doesn't know what language your encoding represents, so it doesn't use locale-based sorting rules.
This seems like it is probably best solved by a package dedicated to providing locale-based collation rules.
Using PackageResourceViewer (https://packagecontrol.io/packages/PackageResourceViewer), we can see that the case_sensitive_sort method in Packages/Default/sort.py uses Python's built-in list.sort method. Typing the following into ST's Python console (View menu -> Show Console), we get the same result as you have shown:
>>> a = ['bår', 'bär', 'bör']
>>> a.sort()
>>> a
['bär', 'bår', 'bör']
So the answer is that there is no setting to configure the sorting behavior, nor is there likely to be one in future. However, according to https://docs.python.org/3/howto/sorting.html#odd-and-ends, a locale-aware sort is possible by using locale.strxfrm as a key function.
Let's try. On Windows, I had to use
>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, 'sve')
'Swedish_Sweden.1252'
to get Python to use a Swedish locale - as per https://stackoverflow.com/a/956084/4473405
>>> a.sort(key=locale.strxfrm)
>>> a
['bår', 'bär', 'bör']
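Locale names differ between platforms, so the 'sve' name used above is Windows-specific. As a rough sketch, a helper could try several names in turn and fall back to plain code point sorting when none is available. The helper name set_collation is hypothetical, and 'sv_SE.UTF-8' is only an assumed name for Linux or macOS systems (run `locale -a` in a terminal to see what is actually installed):

```python
import locale

def set_collation(candidates):
    """Try each locale name in turn; return the accepted one, or None.

    Hypothetical helper, not part of Sublime Text or the locale module.
    """
    for name in candidates:
        try:
            return locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            continue  # this name is not available on this system
    return None

# 'sve' is the Windows name used above; 'sv_SE.UTF-8' is an assumed POSIX name.
chosen = set_collation(['sv_SE.UTF-8', 'sve'])
words = ['bär', 'bör', 'bår']
if chosen is not None:
    words.sort(key=locale.strxfrm)  # locale-aware order
else:
    words.sort()  # no Swedish locale found: code point order
print(chosen, words)
```

Which order gets printed depends on whether one of the candidate locales exists on the machine, so treat the output as a quick diagnostic rather than a guarantee.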
Using this knowledge, you might choose to change the case_sensitive_sort method so that ST's built-in sort functionality (Edit menu -> Sort Lines (Case Sensitive)) uses the locale-aware sort key. Note that saving the sort.py file opened from PackageResourceViewer will create an override, so if future builds of ST include changes to sort.py, you won't see them until you delete the override (which you can do by finding the file via the Preferences menu -> Browse Packages -> Default). You can reapply your changes afterwards, if appropriate, using the exact same steps.
You can also change the case_insensitive_sort method from
txt.sort(key=lambda x: x.lower())
to
txt.sort(key=lambda x: locale.strxfrm(x.lower()))
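Put together, the two modified functions might look roughly like the sketch below. It assumes each function receives a list of strings and returns it after sorting in place; that shape is an assumption made for illustration, so compare it with the actual contents of your sort.py before copying anything.

```python
import locale

def case_sensitive_sort(txt):
    # Sort by the current LC_COLLATE rules instead of raw code points.
    txt.sort(key=locale.strxfrm)
    return txt

def case_insensitive_sort(txt):
    # Lowercase each line first, then apply the locale's collation rules.
    txt.sort(key=lambda x: locale.strxfrm(x.lower()))
    return txt
```

Both functions rely on whatever collation locale is active when they run, so they only change the result once the locale has been set somewhere else.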
Note that if your correct locale isn't picked up automatically (it probably defaults to C), then setting the locale inside the case_sensitive_sort method isn't recommended, even if you restore it to its previous value immediately afterwards, so use at your own risk.
As the Python locale documentation warns: It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
You could instead add the following to the end of the sort.py file:
def plugin_loaded():
    import locale
    locale.setlocale(locale.LC_COLLATE, '')
which will, when the plugin is loaded, allow Python to infer the locale from your LANG env var, as per https://docs.python.org/3/library/locale.html#locale.setlocale. The advantage is that you only set it once, and hopefully it won't introduce any problems with other plugin code executing at the same time.
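One caveat: setlocale raises locale.Error when the locale named by the environment is not installed. A slightly more defensive variant of the hook, offered only as a sketch, catches that error and leaves the collation locale as it was (typically C); the message text is made up for illustration:

```python
import locale

def plugin_loaded():
    try:
        # '' asks Python to use the user's default locale from the environment.
        locale.setlocale(locale.LC_COLLATE, '')
    except locale.Error as exc:
        # Keep the previous collation locale and report the problem in the console.
        print('sort.py: could not set collation locale:', exc)
```

If the message shows up in ST's console, the sort commands still work, but they keep using whatever collation locale was already in effect.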
Happy sorting!