Is there an invisible (Unicode) character that affects the alphabetical sorting of list entries?

The general question of invisible characters has been addressed here before, and here is a list of invisible characters, most of which seem to be blanks of some kind.

I am looking for an invisible character that can be placed, e.g., at the beginning of or inside a filename, and that causes the file to sort alphabetically to the top, ahead of other filenames.

I should add that the character I'm looking for should be sorted before a normal SPACE character as well, for example when used in a file name that is sorted in places like Sharepoint, Teams and OneDrive in the browser.

David.P
  • What is "alphabetical sorting"? There is no single true sort order, so it depends on which algorithm you use (and possibly on the language you are dealing with: alphabetical sorting depends strictly on the language, even among languages written in Latin script). – Giacomo Catenazzi Nov 03 '21 at 17:15

1 Answer

U+0020 SPACE (" ") is invisible and sorts before all visible characters in most orderings. It's been used to order things to the start of lists for decades.
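
A minimal sketch of that behavior, assuming plain code-point ordering (which is what Python's `sorted` uses for strings). The file names are made up, and a locale-aware or application-specific sort may order them differently:

```python
# Hypothetical file names; the second one starts with U+0020 SPACE.
names = ["budget.xlsx", " readme.txt", "Agenda.docx", "1-notes.txt"]

# sorted() compares strings by Unicode code point, so the leading
# space (U+0020) sorts ahead of digits, uppercase and lowercase letters.
ordered = sorted(names)
print(ordered)
```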

Rob Napier
  • Thanks. When thinking about this, I actually should add that the character I'm looking for should sort before a normal SPACE as well. – David.P Nov 03 '21 at 17:49
  • U+001F (US) generally orders before U+0020. Also U+001E (RS), U+001D (GS), down to U+0000 (NUL). Most of them are also zero-width. You might want to avoid CR, LF, and BS, since they often move the cursor, and some algorithms may have trouble with NUL (even though it's a legal character). With ASCII ordering, characters are sorted by their numeric value. – Rob Napier Nov 03 '21 at 18:01
  • Note that if you want Windows Explorer in particular, you'll need to test against Windows Explorer. There is no promise that all systems will sort in the way you're picturing, since it does not line up with any language or culture. You just want a hack that makes a particular piece of software behave in a certain way. I *expect* that low-value ASCII code points will sort lower than other code points in Windows Explorer, but at that point, you're outside of any spec. But if you want "the rules" for Unicode, they're here: http://www.unicode.org/reports/tr10/#Collation_And_Code_Chart_Order – Rob Napier Nov 03 '21 at 19:47
  • (There's just no promise whatsoever that Windows Explorer carefully follows the collation order of the Unicode Collation Algorithm. But it *probably* does for ASCII.) – Rob Napier Nov 03 '21 at 19:50
  • Thanks everyone. I should have added that this is actually about a file list in Sharepoint, not Windows Explorer. I tried Rob's U+001F, U+0020, U+001E, U+001D and U+0000, but they all seem to come out as visible squares and don't seem to change the sort order. – David.P Nov 03 '21 at 20:33
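
For reference, the ASCII ordering described in the comments above can be checked with a short sketch. It assumes plain numeric (code-point) comparison, which is Python's default for strings, and uses made-up file names. It says nothing about how Sharepoint or Windows Explorer actually collate; that would have to be tested in those products.

```python
# Candidate prefix characters, keyed by their ASCII abbreviations.
prefixes = {"US": "\x1f", "RS": "\x1e", "GS": "\x1d", "SPACE": " "}

# Hypothetical file names, one per prefix, plus an unprefixed one.
names = [p + "report.txt" for p in prefixes.values()] + ["report.txt"]

# Under code-point ordering, lower numeric values come first,
# so GS (0x1D) < RS (0x1E) < US (0x1F) < SPACE (0x20) < "r" (0x72).
ordered = sorted(names)
for name in ordered:
    print(f"U+{ord(name[0]):04X}", repr(name))
```

Printing with `repr` makes the otherwise invisible control characters show up as escapes such as `\x1f`, which helps when comparing against what a file listing displays.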