
I am using the following JavaScript regex:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*([-_.])).+$

This accepts only text that contains at least one uppercase letter, one lowercase letter, one digit, and one special symbol from the set `.`, `-`, `_`.

Now assume I supply User-123 as the user ID, which conforms to the above regex, and then use the on-screen keyboard to type a character from the Finnish language, which results in User-123Ã.

Because the regex is still satisfied, my JavaScript code accepts the text, but I want it to accept only English alphanumeric input (plus the symbols listed above) and nothing else.

How should I enhance this RegEx to do so?

  • Check the `.+$` - it allows all letters. You can use `[\w.-]+$` instead to restrict to the characters you require in the lookaheads. – Wiktor Stribiżew Apr 12 '16 at 09:50
  • Maybe it will be clearer to have a regex of the sort: `[^A-Za-z\d_.-]` and throw an error if that matches. – npinti Apr 12 '16 at 09:53
  • @WiktorStribizew, please start writing answers in the answers section. This is not the first time I wasted 2+ minutes reading a question that you have already answered in the comments. – ndnenkov Apr 12 '16 at 09:57
  • ndn - please post what you think is correct - I do not know what the real answer is. – Wiktor Stribiżew Apr 12 '16 at 10:00
  • I do not think just posting "Use `my_cool_code_or_regex`" is a valid answer. That is why I did not post my comment. Writing an answer takes time, and I do not always have lots of it. – Wiktor Stribiżew Apr 12 '16 at 10:12
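
The negated-class check suggested in the comments can be sketched roughly as follows. This is only an illustration: the helper name `findForbiddenChar` is hypothetical, and the sketch assumes the allowed set is exactly ASCII letters, digits, `_`, `.` and `-`.

```javascript
// Matches the first character that is NOT an ASCII letter, digit, "_", "." or "-"
const forbiddenChar = /[^A-Za-z\d_.-]/;

// Hypothetical helper: returns the offending character, or null if the input is clean
function findForbiddenChar(input) {
  const match = input.match(forbiddenChar);
  return match ? match[0] : null;
}

console.log(findForbiddenChar("User-123"));       // null
console.log(findForbiddenChar("User-123\u00C3")); // "Ã"
```

Returning the offending character, rather than a plain boolean, makes it easier to show the user a specific error message. This check only rejects disallowed characters; the "at least one of each" rules would still need the lookaheads from the question.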

2 Answers


The string "User-123Ã" contains the Unicode character "Ã", which is not a plain English letter, so the question is how JavaScript code can identify it.

[Code]  [Glyph]  [Decimal]  [HTML]    Description                        [#]
U+00C3  Ã        &#195;     &Atilde;  Latin Capital letter A with tilde  0131

See also this related question: "How to find whether a particular string has unicode characters".
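
As a rough sketch of one way to identify such characters, the snippet below tests for anything outside the ASCII range. The names `nonAscii` and `hasNonAscii` are made up for this illustration, and it assumes that "English only" means ASCII only.

```javascript
// Matches any single character outside the ASCII range U+0000..U+007F
const nonAscii = /[^\x00-\x7F]/;

// Hypothetical helper: true if the text contains at least one non-ASCII character
function hasNonAscii(text) {
  return nonAscii.test(text);
}

console.log(hasNonAscii("User-123"));        // false
console.log(hasNonAscii("User-123\u00C3"));  // true

// The code point of "Ã" in hexadecimal, i.e. U+00C3
console.log("\u00C3".codePointAt(0).toString(16)); // "c3"
```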

Community
Prabhat Sinha

I am not sure this will solve the issue, but in most cases, when you want to restrict the input itself to certain characters, your consuming pattern should match only the characters you allow. The lookaheads merely require or forbid certain characters at certain positions or a certain number of times; what the consuming part matches is what decides which characters are accepted overall.

`.+$` allows any characters, not just the ones you intend. Replace it with `[\w.-]+$` (in JavaScript, `\w` = `[a-zA-Z0-9_]`) to restrict the match to the characters you require in the lookaheads.
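
Put together, the change might look like the sketch below. The lookaheads are copied from the question with the redundant capture group dropped, and only the consuming part differs; `userIdPattern` and `isValidUserId` are hypothetical names used for illustration.

```javascript
// Same lookaheads as in the question; the consuming part is [\w.-]+ instead of .+
const userIdPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[-_.])[\w.-]+$/;

// Hypothetical helper wrapping the test
function isValidUserId(input) {
  return userIdPattern.test(input);
}

console.log(isValidUserId("User-123"));       // true
console.log(isValidUserId("User-123\u00C3")); // false: "Ã" is not in [\w.-]
console.log(isValidUserId("user-123"));       // false: no uppercase letter
```

Because `\w` in JavaScript covers only ASCII letters, digits and the underscore, accented letters fail the consuming part even though every lookahead is still satisfied.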

Wiktor Stribiżew