> echo "fdp.txtUNE/ser/redaeR/daerorca/bil/rsu\nf.dpu" | sort -s 
fdp.txtUNE/ser/redaeR/daerorca/bil/rsu 
f.dpu 

Since "." is not a field separating character by default, the first 3 characters appear to say:

  • f <= f (that's fine)

  • d <= . (in ASCII, "." < "d", but I'm OK with sort deciding letters come before punctuation)

  • p <= d (this is problematic)

Even worse, if I remove one letter from the second string, the results are reversed:

> echo "fdp.txtUNE/ser/redaeR/daerorca/bil/rsu\nf.dp" | sort -s 
f.dp 
fdp.txtUNE/ser/redaeR/daerorca/bil/rsu 

What hideousness is going on here and how do I stop it? I thought "-s" would suffice, but apparently not.

From what I can tell, 'sort' thinks "f.dpu" > "fdp.t" because "u" > "t". However, that comparison should never be made, since characters before it already differ.

As a note, I get the same results without the "-s".

EDIT: setting the environment variable LC_ALL to "C" fixes this, but it still bugs me that leaving LC_ALL (the locale) blank yields inconsistent results (different is OK, inconsistent is bad).
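
For reference, a minimal sketch of that workaround, assuming a POSIX shell and a sort that honours LC_ALL (the variable name `sorted` is only for illustration). printf is used instead of echo so the "\n" handling does not depend on shell settings:

```shell
# Force plain byte-order comparison for this one command only.
# In the C locale "." (0x2E) compares lower than "d" (0x64),
# so f.dpu is expected to come first.
sorted=$(printf '%s\n' 'fdp.txtUNE/ser/redaeR/daerorca/bil/rsu' 'f.dpu' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

Setting LC_ALL on the command itself, rather than exporting it, leaves the rest of the session's locale untouched.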

  • Well, I'm echoing directly from the terminal, so I think not. –  Dec 11 '13 at 17:22
  • Duplicate of [sort not sorting as expected](http://stackoverflow.com/questions/5909404/sort-not-sorting-as-expected-space-and-locale). From the GNU FAQ: "[case is folded and punctuation is ignored](http://www.gnu.org/software/coreutils/faq/)"; spaces and punctuation are used only as tie-breakers. "fdptxtU" comes before "fdpu". – Raymond Chen Dec 11 '13 at 17:28
  • @RaymondChen You're right. I had my LANG environment variable set to "en_US.iso88591", which apparently has the same effect as the en_US.UTF-8 mentioned in the FAQ. I thought "-s" would stop the "tiebreaker" effect, but apparently not. Please write up your answer and I will approve it. It's frustrating that sort doesn't have an "--ignore-locale" option (although setting the environment variable LC_ALL to C works), and that files sorted on one machine are unsorted on another. –  Dec 12 '13 at 12:52
  • OK, it turns out "sort -d" (dictionary order) might be consistent across machines. –  Dec 12 '13 at 13:05
  • It is not consistent between a US-English machine and a Swedish machine. If you want consistency, you have to make sure everybody is using the same locale. Feel free to write up your own answer and accept it. – Raymond Chen Dec 12 '13 at 14:58
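
The "punctuation is ignored, case is folded" behaviour described in the comments can be roughly imitated in the shell. This is only an approximation of the first comparison pass, not what the locale's collation tables actually do: the helper `first_pass_key` and the variable `order` are made up for illustration, and the real tie-breaking pass is not modelled.

```shell
# Build an approximate "first pass" key: drop punctuation, fold case.
first_pass_key() {
  printf '%s' "$1" | tr -d '[:punct:]' | tr '[:upper:]' '[:lower:]'
}

# Prefix each string with its key, sort on bytes, then drop the key again.
order=$(
  for s in 'fdp.txtUNE/ser/redaeR/daerorca/bil/rsu' 'f.dpu' 'f.dp'; do
    printf '%s\t%s\n' "$(first_pass_key "$s")" "$s"
  done | LC_ALL=C sort | cut -f2
)
printf '%s\n' "$order"
```

With these keys, "fdp" sorts before "fdptxtune…", which sorts before "fdpu", reproducing both orderings seen in the question: f.dp lands before the long string, and f.dpu lands after it.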

1 Answer


First, I had to turn on expanded echo:

$ shopt -s xpg_echo
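
As an aside, a sketch of a way to avoid depending on xpg_echo at all: printf with a '%s\n' format emits each argument on its own line regardless of how echo treats backslashes. The variable name `lines` is just for illustration.

```shell
# Each argument is printed on its own line; no echo escape handling involved.
lines=$(printf '%s\n' 'fdp.txtUNE/ser/redaeR/daerorca/bil/rsu' 'f.dpu' | wc -l | tr -d ' ')
printf '%s\n' "$lines"
```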

Now, I'll run what you gave:

BASH 3.2:$ echo "fdp.txtUNE/ser/redaeR/daerorca/bil/rsu\nf.dpu" | sort -s
f.dpu
fdp.txtUNE/ser/redaeR/daerorca/bil/rsu

This is the correct sorted order: "." sorts before "d", so f.dpu should come first.

I'm not getting your results. It could be because I'm on Mac OS X and not Linux. However, the -s option on both says "stabilize sort by disabling last-resort comparison". Are there any other shell options you have set that might be causing these issues?

Here are my shopt settings:

$ shopt
cdable_vars     off
cdspell         off
checkhash       off
checkwinsize    on
cmdhist         on
compat31        off
dotglob         off
execfail        off
expand_aliases  on
extdebug        off
extglob         on
extquote        on
failglob        off
force_fignore   on
gnu_errfmt      off
histappend      off
histreedit      off
histverify      off
hostcomplete    on
huponexit       off
interactive_comments    on
lithist         on
login_shell     off
mailwarn        off
no_empty_cmd_completion off
nocaseglob      off
nocasematch     off
nullglob        off
progcomp        on
promptvars      on
restricted_shell    off
shift_verbose   off
sourcepath      on
xpg_echo        on

The three differences from the defaults that I have are extglob, lithist, and xpg_echo, the last of which I just set in order to get this to work.

Can you think of anything else that could be going on?

David W.