Similar to the question "How to sort words with accents?", I am trying to sort French words in a file from the shell, on macOS Monterey with LANG=en_US.UTF-8; LC_ALL and LC_COLLATE are not set.
$ echo $'Bénéficiaires\néboueur\nComptabilité' > sample.txt
$ LC_ALL=C sort -fd sample.txt
Bénéficiaires
éboueur
Comptabilité
So sort treats "é" as if it were not there at all (an empty character). Is there any way to fix this?
If I try sorting without LC_ALL=C, I get:
$ sort -fd sample.txt
sort: string comparison failed: Illegal byte sequence
sort: Set LC_ALL='C' to work around the problem.
sort: The strings compared were ‘\303BOUEUR’ and ‘COMPTABILIT\303’.
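For context, here is a quick check of how "é" is stored, assuming sample.txt (and the script below) is UTF-8 encoded. In UTF-8, "é" is the two bytes 303 251 in octal, which would explain the \303 that shows up in the error message. The helper name utf8_octal is just something I made up for this illustration.

```shell
# utf8_octal is a hypothetical helper: it dumps the bytes of its
# argument in octal, so a multi-byte character shows up as several values.
# Assumption: this script itself is saved as UTF-8.
utf8_octal() {
  printf '%s' "$1" | od -An -b
}

# "é" is encoded as two bytes in UTF-8: octal 303 251 (hex c3 a9).
# The exact spacing of the output depends on the od implementation.
utf8_octal 'é'
```

A plain ASCII letter such as "e" gives a single value (145) with the same helper, so any tool that works byte by byte sees "é" as two non-ASCII bytes rather than one letter.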