Many answers on SO link to the JLS section on Identifiers, but I have a question about what's written there. It says:
The "Java letters" include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII underscore (_, or \u005f) and dollar sign ($, or \u0024). The $ character should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems. The "Java digits" include the ASCII digits 0-9 (\u0030-\u0039).
But it goes on to say:
Letters and digits may be drawn from the entire Unicode character set, which supports most writing scripts in use in the world today, including the large sets for Chinese, Japanese, and Korean. This allows programmers to use identifiers in their programs that are written in their native languages.
I don't understand how these can both be true. The first passage seems to dictate exactly which characters are allowed, whereas the second seems to say that the set of allowed characters is much broader.
I agree that the use of "include" rather than "include but are not limited to" means the two passages don't strictly contradict each other. But the text first refers specifically to "Java letters"/"Java digits" and then relaxes this to just "letters"/"digits". My main point is the lack of clarity, and I wanted confirmation of what I assumed it meant.
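To make my assumption concrete: as far as I can tell, the same JLS section defines a "Java letter" as any character for which `Character.isJavaIdentifierStart(int)` returns true, and a "Java letter-or-digit" as any character for which `Character.isJavaIdentifierPart(int)` returns true. If that reading is right, the ASCII ranges are just examples and the real rule is Unicode-wide. Below is a minimal sketch of how I checked it. The class name `IdentifierCheck` and the helper `looksLikeIdentifier` are my own, and the helper does not reject keywords or literals such as `true` and `null`.

```java
class IdentifierCheck {
    // Hypothetical helper: true if s starts with a "Java letter" and
    // continues with "Java letter-or-digit" characters only.
    // Keywords and the literals true/false/null are NOT rejected here.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so supplementary characters are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // Unicode escapes are used so the file compiles under any source encoding.
        System.out.println(looksLikeIdentifier("count"));        // true
        System.out.println(looksLikeIdentifier("_tmp"));         // true
        System.out.println(looksLikeIdentifier("$x"));           // true
        System.out.println(looksLikeIdentifier("\u540d\u524d")); // true: two CJK ideographs
        System.out.println(looksLikeIdentifier("\u03c0"));       // true: Greek small letter pi
        System.out.println(looksLikeIdentifier("x\u0663"));      // true: Arabic-Indic digit three after a letter
        System.out.println(looksLikeIdentifier("\u0663x"));      // false: a digit cannot come first
        System.out.println(looksLikeIdentifier("2fast"));        // false
        System.out.println(looksLikeIdentifier("a-b"));          // false
    }
}
```

If this matches the intended meaning, then "Java letters" and "Java digits" are simply the Unicode letters and digits accepted by those two methods, with `_` and `$` added to the letters for historical reasons.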