MDN states that Array.prototype.sort() sorts by UTF-16 code unit order by default.

"The default sort order is ascending [...] comparing their sequences of UTF-16 code units values."

I need to sort by UTF-8 byte order (equivalently, Unicode code point order or UTF-32 order, since all three agree). Is there a good way to do this in the browser?

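UTF-16 maddeningly has some code points out of order: characters above U+FFFF are encoded as surrogate pairs (code units 0xD800 to 0xDFFF), so they sort before U+E000 to U+FFFF even though their code points are higher. That source suggests a "fix-up" rotation of the out-of-order code points, but I've not seen this widely acknowledged or implemented in the JavaScript world.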
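A small illustration of the ordering problem, assuming well-formed strings. The two sample characters are chosen only for demonstration and are not from the MDN page:

```javascript
// Two sample characters (chosen for illustration only):
//   U+FF5E  FULLWIDTH TILDE -> one UTF-16 code unit: 0xFF5E
//   U+1F600 GRINNING FACE   -> surrogate pair: 0xD83D 0xDE00
const samples = ["\uFF5E", "\u{1F600}"];

// The default sort compares UTF-16 code units, so 0xD83D < 0xFF5E
// places the emoji before the fullwidth tilde.
const utf16Order = [...samples].sort();

// Listing the code points shows that the higher code point came first,
// which is the opposite of UTF-8 / UTF-32 order.
const codePoints = utf16Order.map((s) => s.codePointAt(0).toString(16));

console.log(codePoints); // [ '1f600', 'ff5e' ]
```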

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort

Zamicol
  • Does this answer your question? [How to overload comparator to sort with UTF-8 and different locales](https://stackoverflow.com/questions/41242116/how-to-overload-comparator-to-sort-with-utf-8-and-different-locales) – Heretic Monkey May 07 '21 at 21:12
  • While on the topic of the issues with Javascript and UTF-16: JSON requires UTF-8 encoding. Early JSON RFCs (like 4627, 7159) permitted UTF-8, UTF-16, or UTF-32. The latest JSON RFC 8259, published in 2017, [requires only UTF-8](https://datatracker.ietf.org/doc/html/rfc8259#section-8). – Zamicol Nov 10 '22 at 02:34
  • You can do this with `Buffer.from`, `Array.prototype.sort`, and `Buffer.compare`, if you’re using Node.js. – Константин Ван May 10 '23 at 22:35
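The Node.js approach from the last comment might look roughly like the sketch below. The helper name `sortByUtf8Bytes` is hypothetical, and this does not work in a browser without a `Buffer` polyfill. Note also that `Buffer.from` replaces lone surrogates with U+FFFD, so the sketch assumes well-formed strings.

```javascript
// Node.js only: Buffer is a Node global, not a browser API.
// sortByUtf8Bytes is a hypothetical helper name used for illustration.
function sortByUtf8Bytes(strings) {
  return strings
    // Encode each string once, so the comparator does not re-encode on every call.
    .map((text) => ({ text, bytes: Buffer.from(text, "utf8") }))
    // Buffer.compare orders two buffers byte by byte and returns -1, 0, or 1.
    .sort((a, b) => Buffer.compare(a.bytes, b.bytes))
    // Recover the original strings in their new order.
    .map((entry) => entry.text);
}

const sortedByBytes = sortByUtf8Bytes(["\u{1F600}", "\uFF5E", "a"]);
console.log(sortedByBytes.map((s) => s.codePointAt(0).toString(16)));
```

The input array is left unmodified because `map` returns a new array before sorting.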

1 Answer

To do that, you would have to write your own sorting method, like here.

JS has no built-in UTF-8 sorting method, and I couldn't find any libraries for it.
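As a minimal sketch of what such a hand-written method could look like in the browser, the comparator below orders strings by Unicode code point, which matches UTF-8 byte order for well-formed strings. The name `compareByCodePoint` is hypothetical, and this is not necessarily the approach of the implementation linked above. Lone surrogates are compared by their raw value, which is an assumption rather than defined UTF-8 behaviour.

```javascript
// Hypothetical comparator: orders two strings by Unicode code point
// instead of by UTF-16 code unit.
function compareByCodePoint(a, b) {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit) {
    // codePointAt combines a surrogate pair into a single code point.
    const x = a.codePointAt(i);
    const y = b.codePointAt(i);
    if (x !== y) {
      return x < y ? -1 : 1;
    }
    // Equal code points occupy the same number of code units in both strings:
    // two for anything above U+FFFF, otherwise one.
    i += x > 0xffff ? 2 : 1;
  }
  // One string is a prefix of the other: the shorter one sorts first.
  return a.length - b.length;
}

const sortedByCodePoint = ["\u{1F600}", "\uFF5E", "ab", "a"].sort(compareByCodePoint);
console.log(sortedByCodePoint.map((s) => s.codePointAt(0).toString(16)));
```

This works with plain `Array.prototype.sort` and needs no encoding step, at the cost of decoding code points on every comparison.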

Doumor