66

I see this code from a book:

var a = "one";
var b = "four";
a>b; // will return true

but it doesn't mention why "one" is bigger than "four". I tried c = "a" and it is smaller than a and b. I want to know how JavaScript compares these strings.

user2428118
patriot7
  • but when I compare "one" and "a", "one" is still the bigger one – patriot7 Aug 17 '11 at 04:34
  • 2
    `"one" > "a"` **because `"o" > "a"`.** Do you understand why `"o" > "a"`? – Matt Ball Aug 17 '11 at 04:48
  • 3
    No, I'm trying to search something. Would you please give me some tips? – patriot7 Aug 17 '11 at 04:51
  • 6
    The further along the alphabet, the higher the value. These are all true: `"z" > "y"; "y" > "c"; "c" > "b"; "b" > "a";` – Luke Nov 17 '14 at 23:01
  • 1
    Just a side comment: stringified numbers ("1", "2", "3") are compared the same way, which leads to surprising ordering, since "5" is greater than "10". So when you sort numbers, verify that they are actual numbers and not strings. – Harps Apr 10 '20 at 06:22
  • 1
    @Harps: are you alluding to "natural sorting", such as [Natural sort of alphanumerical strings in JavaScript](https://stackoverflow.com/questions/2802341/natural-sort-of-alphanumerical-strings-in-javascript), which "correctly" sorts the string "version10" after the string "version5" ? – David Cary May 30 '23 at 02:14
  • @DavidCary I was suggesting converting the array to numbers (if it does not mix numbers with strings). Using localeCompare is clearly smarter than converting and avoids having to handle conversion errors (though the resulting order may be less intuitive). – Harps May 31 '23 at 14:31
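
To illustrate the point made in the comments above about numeric strings, here is a minimal sketch. The sample arrays are made up for illustration; `localeCompare` with the `numeric` option is one possible way to get "natural" ordering, not the only one.

```javascript
// Strings of digits are compared code unit by code unit, not by numeric value.
console.log("5" > "10"); // true, because "5" (53) > "1" (49)

// The default sort compares elements as strings.
const asStrings = ["10", "5", "1"];
const defaultSorted = [...asStrings].sort();

// Option 1: convert to numbers first, then compare numerically.
const numericSorted = asStrings.map(Number).sort((a, b) => a - b);

// Option 2: keep the strings, but ask localeCompare for numeric collation.
const versions = ["version10", "version5"];
const naturalSorted = [...versions].sort((a, b) =>
  a.localeCompare(b, undefined, { numeric: true })
);

console.log(defaultSorted); // ["1", "10", "5"]
console.log(numericSorted); // [1, 5, 10]
console.log(naturalSorted); // ["version5", "version10"]
```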

5 Answers

59

Because, as in many programming languages, strings are compared lexicographically.

You can think of this as a fancier version of alphabetical ordering, the difference being that alphabetic ordering only covers the 26 characters a through z.
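
As a rough illustration of how this ordering extends beyond the letters a through z, the sketch below sorts a few sample characters with the `<` and `>` operators. The sample values are arbitrary; the point is only that digits, uppercase letters, and lowercase letters each occupy their own range.

```javascript
// Arbitrary sample characters: space, digit, uppercase, underscore, lowercase.
const samples = ["b", "a", "B", "A", "1", "_", " "];

// Sort using the relational operators themselves.
const ordered = [...samples].sort((x, y) => (x < y ? -1 : x > y ? 1 : 0));

console.log(ordered);        // [" ", "1", "A", "B", "_", "a", "b"]
console.log("Z" < "a");      // true: every uppercase letter sorts before every lowercase one
console.log("one" > "four"); // true: "o" comes after "f"
```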


This answer is in response to a question, but the logic is exactly the same. Another good one: String Compare "Logic".

Community
Matt Ball
  • 4
    When in doubt, dig out the spec. In this case, see section 11.8.5 of the [ECMAScript Spec.](http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf). Of course, for easier to parse answers, the SO archives work, too. :-) – mmigdol Aug 17 '11 at 04:49
  • 5
    The best way to explain the 'values' of characters is to point the user to an ASCII table that lists the decimal value of each character. – ps2goat Nov 19 '14 at 15:15
  • @ps2goat good point. Feel free to suggest an edit to this answer to add that `:)` – Matt Ball Nov 19 '14 at 17:44
14

"one" starts with 'o', "four" starts with 'f', 'o' is later in the alphabet than 'f' so "one" is greater than "four". See this page for some nice examples of JavaScript string comparisons (with explanations!).

Paul
3

JavaScript uses lexicographical order for the > operator. 'f' precedes 'o', so the comparison "one" > "four" returns true.

martin
2

In the 11th edition of the ECMAScript Language Specification, the "Abstract Relational Comparison" clause defines how to compute x < y. When the expression is reversed (i.e. x > y), we compute the result of y < x instead.

So to solve "one" > "four" we must solve "four" < "one" instead.

The same clause says this:

The comparison of Strings uses a simple lexicographic ordering on sequences of code unit values.

And this if both operands are strings:

  1. If Type(px) is String and Type(py) is String, then
    • If IsStringPrefix(py, px) is true, return false.
    • If IsStringPrefix(px, py) is true, return true.
    • Let k be the smallest nonnegative integer such that the code unit at index k within px is different from the code unit at index k within py. (There must be such a k, for neither String is a prefix of the other.)
    • Let m be the integer that is the numeric value of the code unit at index k within px.
    • Let n be the integer that is the numeric value of the code unit at index k within py.
    • If m < n, return true. Otherwise, return false.

(We can safely ignore the first two points for this example)

So let's see the code units for "four":

[..."four"].map(c => c.charCodeAt(0));
//=> [102, 111, 117, 114]

And for "one":

[..."one"].map(c => c.charCodeAt(0));
//=> [111, 110, 101]

So now we must find the smallest k (starting at 0) where m[k] and n[k] differ, writing m for the code units of "four" and n for those of "one":

|   | 0   | 1   | 2   | 3   |
|---|-----|-----|-----|-----|
| m | 102 | 111 | 117 | 114 |
| n | 111 | 110 | 101 |     |

We can see that already at k = 0, m[0] and n[0] are different.

Since m[0] < n[0] is true then "four" < "one" is true and thus "one" > "four" is true.
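
The quoted steps can be sketched as a small function. This is only an illustration for two string operands; `stringIsLessThan` is a hypothetical name, not something defined by the specification or the language, and `startsWith` stands in for IsStringPrefix.

```javascript
// Hypothetical helper mirroring the quoted steps for two string operands.
function stringIsLessThan(px, py) {
  // IsStringPrefix(py, px): py is a prefix of px (this also covers equal strings).
  if (px.startsWith(py)) return false;
  // IsStringPrefix(px, py): px is a prefix of py.
  if (py.startsWith(px)) return true;

  // Find the smallest k where the code units differ.
  // Such a k must exist, since neither string is a prefix of the other.
  let k = 0;
  while (px.charCodeAt(k) === py.charCodeAt(k)) k++;

  const m = px.charCodeAt(k); // code unit at index k within px
  const n = py.charCodeAt(k); // code unit at index k within py
  return m < n;
}

console.log(stringIsLessThan("four", "one")); // true, so "one" > "four" is true
console.log(stringIsLessThan("one", "four")); // false
console.log(stringIsLessThan("aaa", "aaab")); // true, "aaa" is a prefix of "aaab"
```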


What does "☂︎" < "☀︎" return?

[..."☂︎"].map(c => c.charCodeAt(0))
//=> [9730, 65038]
[..."☀︎"].map(c => c.charCodeAt(0))
//=> [9728, 65038]

|   | 0    | 1     |
|---|------|-------|
| m | 9730 | 65038 |
| n | 9728 | 65038 |

Since 9730 < 9728 is false then "☂︎" < "☀︎" is false which is nice because rain is not better than sun (obviously ;).
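
One caveat worth sketching: spreading a string iterates by code point, so for characters outside the Basic Multilingual Plane `[...s].map(c => c.charCodeAt(0))` only shows the first code unit of each pair. The helper below (`codeUnits` is a hypothetical name) indexes the string directly instead; the two sample characters are chosen only for illustration.

```javascript
// Hypothetical helper: list every UTF-16 code unit of a string by index.
const codeUnits = (s) =>
  Array.from({ length: s.length }, (_, i) => s.charCodeAt(i));

const face = "\u{1F600}"; // one code point, stored as two code units
const tilde = "\uFF5E";   // fullwidth tilde, a single code unit

console.log(codeUnits(face));  // [55357, 56832]
console.log(codeUnits(tilde)); // [65374]

// The comparison looks at code units, so the first unit 55357 is compared with 65374,
// even though the code point 0x1F600 is numerically larger than 0xFF5E.
console.log(face < tilde); // true
```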

customcommander
2

When you use a relational operator like <= with strings in JavaScript, you're comparing their underlying Unicode code units,¹ one at a time from the beginning, stopping the first time you find any difference. "one" > "four" is true because "o" (code unit 111) is greater than "f" (code unit 102). Since a difference is found in the first character, the rest of the characters are ignored. If you had "fb" > "fa", the two "f"s would be compared, found to be the same, and then the next letter of each string ("b" and "a") would be compared. If the strings are different lengths and the longer one starts with the shorter one, the shorter one is "less than" the longer one ("aaa" < "aaab" is true).

This used to be covered by the Abstract Relational Comparison operation in the specification, but now it's the IsLessThan operation.


¹ The fact that the relational operators use code units is one good reason not to use them with strings, since the code unit order in many cases doesn't map well to people's expectations based on their language ("é" < "z" is false, which probably makes little sense to French speakers); instead, use localeCompare, perhaps with some optional settings to compare appropriately for the language the strings contain ("é".localeCompare("z", "fr") < 0 is true, because é comes before z in a proper lexicographical order in the "fr" locale).
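
A minimal sketch of the difference described in the footnote, assuming an environment with `Intl` support (current browsers and Node.js). The word list is made up for illustration, and `Intl.Collator` is used here simply as a reusable alternative to calling `localeCompare` for each pair.

```javascript
const words = ["z", "é", "a"];

// Default sort compares UTF-16 code units: "a" (97) < "z" (122) < "é" (233).
const byCodeUnit = [...words].sort();

// A collator compares according to the rules of the given locale.
const collator = new Intl.Collator("fr");
const byLocale = [...words].sort(collator.compare);

console.log(byCodeUnit);                        // ["a", "z", "é"]
console.log(byLocale);                          // ["a", "é", "z"]
console.log("é" < "z");                         // false
console.log("é".localeCompare("z", "fr") < 0);  // true
```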

T.J. Crowder