Why is String.indexOf functioning like this?

Question

I'm trying to match some text based on a query that the user inputs. After running into some issues, I came across this rather odd behaviour of String.indexOf that I simply cannot understand:

If I try to match a query without diacritics against a string with diacritics, it works (not sure why):

"brezzel cu brânză".indexOf("bra")

11

But matching the same query with one more letter after it doesn't work:

"brezzel cu brânză".indexOf("bran")

-1

(tested both in Chrome & Firefox, same behaviour)

Is this a documented behaviour that I'm unaware of or what exactly is happening here?

`a` is not equal to `â`.. `brân` is in the string but `bran` is not in the string — The Bomb Squad, Nov 08 '20 at 22:48
In case these chars look the same to you (your display is STRANGE), run some code in a JS console: `console.log("a"=="â")` — The Bomb Squad, Nov 08 '20 at 22:51
`Array.from("brânză")` reveals what exactly is in the string. — Sebastian Simon, Nov 08 '20 at 22:52
That "a" character in your source string is comprised of the normal latin "a" plus the "combining circumflex accent" character, Unicode code point 770 (decimal). — Pointy, Nov 08 '20 at 22:54
Related: [Javascript - normalize accented greek characters](https://stackoverflow.com/q/23346506/4642212) and [Seemingly identical strings fail comparison](https://stackoverflow.com/q/16799810/4642212). — Sebastian Simon, Nov 08 '20 at 23:00

ibrahim tanyalcin · Accepted Answer · 2020-11-08T22:59:40.767

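If I remember correctly, JS strings are stored as UTF-16, so each code unit takes 2 bytes. The â in your string is not a single character: it is a plain a followed by a separate combining accent, so it takes two code units (4 bytes). The first code unit is a plain a, which is why the first case works. Use the escape function (deprecated, but handy for a quick look) to see: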

escape("brezzel cu brânză")
"brezzel%20cu%20bra%u0302nza%u0306"

Note that %20 is a space, followed by bra, and then %u0302 (the combining circumflex accent), which together with the preceding a is displayed as â.
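Since escape is deprecated, here is a rough alternative for looking at the individual code units. The string is written with explicit \u escapes so it is guaranteed to be in the decomposed form shown above, and the helper name codeUnits is made up for this sketch:

```javascript
// Decomposed form written with explicit escapes, so the encoding is unambiguous.
const text = "brezzel cu bra\u0302nza\u0306";

// Hypothetical helper: list each UTF-16 code unit of a string as U+XXXX.
function codeUnits(str) {
  return Array.from({ length: str.length }, (_, i) =>
    "U+" + str.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0")
  );
}

// The four code units starting at index 11: b, r, a, combining circumflex.
console.log(codeUnits(text.slice(11, 15)));
// [ 'U+0062', 'U+0072', 'U+0061', 'U+0302' ]

console.log(text.indexOf("bra"));  // 11: b, r, a are present as plain code units
console.log(text.indexOf("bran")); // -1: U+0302 sits between "a" and "n"
```

indexOf compares code units one by one and does no Unicode normalization, which is consistent with the results in the question.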

You can probably work out the rest: "bran" is not found because the code unit after the a is U+0302, not n. Test it if you want to:

'a' + String.fromCharCode(0x0302) // "â": plain a followed by the combining circumflex U+0302
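If the goal is matching that ignores diacritics, one possible approach (a sketch, not the only way) is to normalize both strings with String.prototype.normalize and drop the combining marks before searching. The helper name stripDiacritics is made up here, and the range U+0300 to U+036F only covers the basic Combining Diacritical Marks block, which I am assuming is enough for text like this:

```javascript
// Hypothetical helper: decompose (NFD), then drop combining marks U+0300..U+036F.
function stripDiacritics(str) {
  return str.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

const decomposed = "brezzel cu bra\u0302nza\u0306"; // a + U+0302, a + U+0306
const precomposed = "brezzel cu br\u00e2nz\u0103";  // single code points for â and ă

// Same text on screen, different code units underneath.
console.log(decomposed === precomposed);                  // false
console.log(decomposed.normalize("NFC") === precomposed); // true

// After stripping, the plain query matches either form.
console.log(stripDiacritics(decomposed).indexOf("bran"));  // 11
console.log(stripDiacritics(precomposed).indexOf("bran")); // 11
```

Keep in mind that the returned index refers to the stripped string, so it can differ from the position in the original once combining marks have been removed earlier in the text. The user's query should go through the same helper before searching.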
