I've been working on a regex censor for quite some time and can't seem to find a decent way of censoring website links (and attempts to circumvent the censor).
Here's what I have so far, ignoring escape sequences:
([a-zA-Z0-9_-]+[\\W[_]]*)+(\\.|[\\W]?|dot|\\(\\.\\)|[\\(]?dot[\\)]?)+([\\w]{2,6})((\\.|[\\W]?|dot|\\(\\.\\)|[\\(]?dot[\\)]?)([\\w]{1,4}))*
I'm not sure what is causing the problem, but it censors the words "com" and "come" and pretty much anything that is about 3+ letters long.
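One likely culprit is the `[\\W]?` alternative in the separator group: since it can match nothing at all, the separator is effectively optional, so any run of letters can be split into a "name" part and a "TLD" part. Below is a minimal Python sketch of that effect. It uses a reduced pattern rather than the full regex above, the names `LOOSE` and `STRICT` are made up for illustration, and Python itself is an assumption, since the censor's language isn't stated here.

```python
import re

# Reduced form of the original: the separator group contains `\W?`,
# which can match the empty string, so no separator is really required.
LOOSE = re.compile(r"[a-zA-Z0-9_-]+(\.|\W?|dot)+\w{2,6}")

# Same pattern, but every separator alternative must consume a character.
STRICT = re.compile(r"[a-zA-Z0-9_-]+(\.|\W|dot)+\w{2,6}")

for word in ("com", "come", "Google.com"):
    print(word, bool(LOOSE.fullmatch(word)), bool(STRICT.fullmatch(word)))
```

Requiring a real separator only removes this one cause. With any non-word character still allowed as a separator, ordinary text such as "come here" would match as well.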
Problem: I want to know how to censor website links, including malformed ones written to circumvent the censor. Examples:
Google.com
goo gle .com
g o o g l e . c o m
go o gl e % com
go og le (.) c om
One more thing: is there a way to add links to a whitelist for this? Thank you.
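One possible direction, sketched here in Python as an assumption rather than a finished solution: normalize the message first (lowercase it, turn obfuscated dots into ".", drop whitespace), then look for a host followed by a known TLD and skip anything on a whitelist. The names `TLDS`, `WHITELIST`, `normalize`, and `find_blocked_hosts` are hypothetical, and the TLD list is deliberately tiny.

```python
import re

# Hypothetical lists for illustration; a real filter needs a fuller TLD set.
TLDS = ("com", "net", "org")
WHITELIST = {"example.com"}

# Obfuscated dots such as "(.)", "(dot)", or " dot " become a plain ".".
DOT_WORDS = re.compile(r"\(\s*(?:\.|dot)\s*\)|\s+dot\s+")
# After normalization: a host label, a dot, and one of the known TLDs.
HOST = re.compile(r"([a-z0-9-]+)\.(" + "|".join(TLDS) + ")")

def normalize(message):
    text = message.lower()
    text = DOT_WORDS.sub(".", text)
    text = re.sub(r"\s+", "", text)            # drop all whitespace
    text = re.sub(r"[^a-z0-9.-]+", ".", text)  # other symbols act as a dot
    return re.sub(r"\.{2,}", ".", text)        # collapse repeated dots

def find_blocked_hosts(message):
    """Return detected hosts that are not whitelisted."""
    hosts = [m.group(1) + "." + m.group(2)
             for m in HOST.finditer(normalize(message))]
    return [h for h in hosts if h not in WHITELIST]

for sample in ("Google.com", "goo gle .com", "g o o g l e . c o m",
               "go o gl e % com", "go og le (.) c om", "come"):
    print(repr(sample), find_blocked_hosts(sample))
```

Because all whitespace is removed before matching, this trades false negatives for false positives: a sentence like "nice store. Come on" would also be flagged, so the TLD list and whitelist would need tuning against real messages.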