-2

I need help detecting spoofed usernames with a regex.

For example, the correct username is:

rolf

and here are the spoofed versions that I could think of:

roooolf
r123olf
123rolf123
rolf5623
123rolf
rollllf
rrrrrrolf
rolffff

So basically I have this regex (which I know is not sufficient, because I tried it on regex101):

.+(?![rolf]).+

I'm using this as a baseline because it doesn't match the correct username, which is:

rolf

However, it doesn't catch all of the "spoofed" versions of the username either.

Any ideas on how I can improve my regex?

Thanks in advance!

Wiktor Stribiżew
  • 4
    We need a _clear_ description of which strings should be matched and which shouldn't. "spoofed" is not a clear description. – Aran-Fey Apr 10 '18 at 14:32
  • Sounds like [Regular expression to match a line that doesn't contain a word?](https://stackoverflow.com/questions/406230/regular-expression-to-match-a-line-that-doesnt-contain-a-word). – Wiktor Stribiżew Apr 10 '18 at 14:33
  • You could do [`r.*?o.*?l.*?f`](https://regex101.com/r/uLrHvD/1). It matches all your spoofs. – sshine Apr 10 '18 at 14:36
  • @SimonShine - it doesn't match those with leading and trailing numbers , ... but I presume your `.*` should be limited to numbers only `\d*`, adding them in between, as well as in front and at the end. – Ωmega Apr 10 '18 at 14:40
  • 1
    @Ωmega: It doesn't include the entire spoofed name in the match, but it does match spoofs with leading and trailing numbers (unless your regex engine has implicit `^` and `$` anchors). And yes, since "a spoof" isn't well-defined, maybe that's fine, too. – sshine Apr 10 '18 at 14:50
  • Hi, sorry for not being clear. The only string that shouldn't be matched is the correct one, rolf; all the "spoofed" ones should be matched, e.g. rrrrrolf, roooolf, ro123lf213. So basically any username that is not exactly "rolf" is considered wrong/spoofed. Is this possible? I tried a negative lookahead like (?!rolf), but I think that is the wrong approach. Sorry for the confusion; I'm new to asking questions on Stack Overflow and I'll ask clearer questions next time. – Rolf Nufable Apr 10 '18 at 15:01
  • Do you mean [`^(?!rolf$).+$`](https://regex101.com/r/rKB0SI/1)? – The fourth bird Apr 10 '18 at 15:10
  • 1
    @SimonShine - Yeah, the question is very unclear. We don't know what exactly "spoofed" means in this case, as well as we don't know if the line contains only username or it may contain additional text before and after. This question should be closed as unclear, I assume. – Ωmega Apr 10 '18 at 15:31
  • Hi, sorry for the confusion again. Anything other than the correct username from the question, "rolf", is considered spoofed, whether the line has additional text before, after, or even in between the letters of the username; hence the examples: rooolf, rollllf, roflffl, ro123olfnio, ro12lf. I just don't want to catch the correct username, which is rolf in this example, as mentioned in my previous comment. I think (?!rolf) (a negative lookahead) is not a proper approach for this. Or can this problem not be solved with a regex? – Rolf Nufable Apr 10 '18 at 15:43

3 Answers

1

To match anything that is not exactly rolf, you can use a negative lookahead (?! to assert that what follows from the beginning of the string is not 'rolf' followed directly by the end of the string.

^(?!rolf$).+$

The pattern breaks down as follows:

  • ^ Assert position at the beginning of the string
  • (?! Negative lookahead that asserts that what follows is not
    • rolf$ Match rolf literally, followed by the end of the string
  • ) Close negative lookahead
  • .+ Match any character one or more times
  • $ Assert position at the end of the string

Your example regex uses .+ which, as @Ωmega fairly points out, also matches spaces.

Instead of .+ you could specify which characters you accept, for example \w+ to match one or more word characters, or be more specific with a character class.
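As a rough illustration, here is how the pattern and its \w+ variant could be checked in Python against the usernames from the question. The helper name `is_spoof` and the choice of Python are assumptions made for this sketch, not part of the original answer.

```python
import re

# Pattern from the answer: any non-empty string except exactly "rolf".
NOT_EXACTLY_ROLF = re.compile(r"^(?!rolf$).+$")

# Stricter variant: same idea, but only word characters are accepted.
NOT_EXACTLY_ROLF_WORD = re.compile(r"^(?!rolf$)\w+$")


def is_spoof(username: str, pattern: re.Pattern = NOT_EXACTLY_ROLF) -> bool:
    """Hypothetical helper: True if the pattern flags the username."""
    return pattern.search(username) is not None


spoofs = ["roooolf", "r123olf", "123rolf123", "rolf5623",
          "123rolf", "rollllf", "rrrrrrolf", "rolffff"]

for name in spoofs + ["rolf"]:
    print(name, is_spoof(name))
```

With .+ a string such as "ro lf" is flagged as well, whereas the \w+ variant does not match it at all, so whether that counts as a spoof or as invalid input depends on which characters you allow.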

The fourth bird
  • 1
    @Ωmega Fair point! I have added some explanation to my answer. `.+` is what the OP used but for a username that should not be valid. – The fourth bird Apr 10 '18 at 15:41
  • 1
    We still don't know if the line contains only the username, or may contain some additional text and therefore we don't know if to use `^` and `$` or just `\b` word boundaries. I assume your and mine answers together answer this question regardless of that. – Ωmega Apr 10 '18 at 15:44
  • 1
    Hi, thanks for the answer. Sorry for the late reply as well as I'm on shift for work.. I'm testing your regex for all possible combinations that I think is a "spoof" I'll get back to you once done :) Thanks again! – Rolf Nufable Apr 10 '18 at 15:48
  • @Downvoter can you add a comment why so I can update my answer? – The fourth bird May 01 '18 at 15:00
1

You may try this too:

(?m)^(?![^\n]*?rolf[^\n]*$).*$
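A small Python sketch of how this pattern behaves, assuming one username per line; the helper name `matches_line` is hypothetical. Note that the lookahead rejects any line that contains "rolf" anywhere, so the pattern matches only lines without that substring.

```python
import re

# Pattern from this answer: match whole lines that do not contain "rolf".
LINES_WITHOUT_ROLF = re.compile(r"(?m)^(?![^\n]*?rolf[^\n]*$).*$")


def matches_line(name: str) -> bool:
    """Hypothetical helper: True if the pattern matches the given line."""
    return LINES_WITHOUT_ROLF.search(name) is not None


names = ["rolf", "roooolf", "r123olf", "123rolf123", "rolf5623", "rrrrrrolf"]
matched = [n for n in names if matches_line(n)]
print(matched)
```

Under this reading, variants that still contain the literal substring "rolf" (such as 123rolf123 or rolf5623) are not matched, so the pattern fits only if "spoofed" means "does not contain rolf".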


Thm Lee
0

You can use this regex pattern:

\b(?!rolf\b)\S+\b

\b Word boundary - Matches a word boundary position between a word character and non-word character or position (start / end of string).

(?! Negative lookahead - Specifies a group that cannot match after the main expression (if it matches, the result is discarded).

\S Not whitespace - Matches any character that is not a whitespace character (spaces, tabs, line breaks).

+ Quantifier - Match 1 or more of the preceding token.
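A minimal Python sketch of the pattern applied to text that contains several whitespace-separated usernames; the function name `find_spoofs` and the sample text are assumptions for illustration.

```python
import re

# Pattern from this answer: a whitespace-delimited token that is not
# exactly the word "rolf".
SPOOF_TOKEN = re.compile(r"\b(?!rolf\b)\S+\b")


def find_spoofs(text: str) -> list[str]:
    """Hypothetical helper: return every token the pattern flags."""
    return SPOOF_TOKEN.findall(text)


print(find_spoofs("rolf roooolf 123rolf rolf5623"))
```

Because the pattern relies on word boundaries rather than ^ and $, it can pick spoofed names out of a longer line while skipping a standalone rolf.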



Ωmega
  • `\S+` may be changed to a `[...]+` character class, where inside the `[` `]` you list all allowed username characters, e.g. `[a-z0-9]` for lowercase ASCII letters and digits. – Ωmega Apr 10 '18 at 15:41