According to RFC 3986, "Uniform Resource Identifier (URI): Generic Syntax":
fragment = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
           / "*" / "+" / "," / ";" / "="
Unpacking all that, and ignoring percent-encoding, I find the following set of characters:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~!$&'()*+,;=:@/?
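As a sanity check, the same set can be rebuilt mechanically from the grammar rules quoted above. This is only a sketch: the constant names are mine, not the RFC's, and pct-encoded is left out on purpose, as in the list above.

```python
import string

# Terminal sets copied from the ABNF rules quoted above.
# ALPHA and DIGIT are taken to be ASCII letters and digits.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")

# pchar without the pct-encoded alternative (deliberately ignored here).
PCHAR = UNRESERVED | SUB_DELIMS | set(":@")

# fragment = *( pchar / "/" / "?" )
FRAGMENT_CHARS = PCHAR | set("/?")

# The hand-expanded list from the answer, for comparison.
LISTED = ("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
          "0123456789-._~!$&'()*+,;=:@/?")

print(set(LISTED) == FRAGMENT_CHARS)
```

Both routes give the same 81 characters, so the list above is a faithful expansion of the grammar.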
Although the RFC does not mandate a particular encoding and deals in characters only (not bytes), according to Section 2.3 ALPHA
means ASCII only, i.e. the 26 letters of the Latin alphabet in upper and lower case. Any non-ASCII letters must therefore be percent-encoded.
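For illustration, here is a minimal sketch of checking and producing fragments under these rules. The helper names are hypothetical, and UTF-8 is my assumption for turning non-ASCII characters into bytes, since the RFC does not fix an encoding. It relies on Python's `urllib.parse.quote`, which leaves ASCII letters, digits and `-._~` alone (the `~` only since Python 3.7).

```python
import re
from urllib.parse import quote

# Characters allowed literally in a fragment besides the unreserved ones:
# sub-delims, ":", "@", "/" and "?".
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"

# fragment = *( pchar / "/" / "?" ), this time including pct-encoded triplets.
FRAGMENT_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*"
)

def encode_fragment(text: str) -> str:
    """Percent-encode every character not allowed literally.

    UTF-8 is an assumption here, not something the RFC requires.
    """
    return quote(text, safe=FRAGMENT_SAFE, encoding="utf-8")

def is_valid_fragment(fragment: str) -> bool:
    """Return True if the string matches the fragment grammar."""
    return FRAGMENT_RE.fullmatch(fragment) is not None

print(is_valid_fragment("café"))   # non-ASCII letter, not allowed as-is
print(encode_fragment("café"))     # percent-encoded form
```

Note that `encode_fragment` treats its input as raw text, so a literal `%` is itself escaped to `%25`. Do not run it on a fragment that is already percent-encoded.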