I need a regex pattern that can detect whether a given text is in English or not, with the following requirements:
- Allowing spaces
- Allowing numbers and words
- Allowing multiple lines and tabs
- Allowing all special characters !@#$%^&*()_-+={}|/<>~`':";[]
- Allowing URLs, emails
- If the given text contains any character other than English ones, it should be considered non-English text. This applies to Arabic letters/words like "ا ب ت ... etc.", to French characters like "é, â ... etc.", and to every other language as well
In brief, I need to know whether the given text, in any format, is in English or not. I tried a lot of patterns but couldn't get one to work, and I don't want to rely on a language detector, as the application will be used offline.
Samples of text that should not be accepted:
Hello! ... é
مرحبا بك
للتحميل اضغط هنا ... http://www.google.com
So, if the text contains even one non-English letter, it should be considered non-English text.
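To make the requirement concrete, here is a rough sketch of the kind of check I am after, written in Python. It assumes that "English" simply means printable ASCII (U+0020 to U+007E) plus tab, line feed and carriage return; that definition and the names `ASCII_ONLY` and `is_english` are my own, for illustration only. Note that under this assumption an empty string also counts as English.

```python
import re

# Assumption: "English" text = printable ASCII (space through "~")
# plus tab, line feed and carriage return. Everything else (Arabic,
# accented Latin letters such as "é", etc.) makes the text non-English.
ASCII_ONLY = re.compile(r"[\t\n\r\x20-\x7E]*")


def is_english(text: str) -> bool:
    """Return True only if every character falls in the allowed ASCII set."""
    return ASCII_ONLY.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "Hello, world! 123\n\tVisit http://www.google.com or mail me@example.com",
        "!@#$%^&*()_-+={}|/<>~`':\";[]",
        "Hello! ... é",
        "مرحبا بك",
        "للتحميل اضغط هنا ... http://www.google.com",
    ]
    for sample in samples:
        print(repr(sample), is_english(sample))
```

The first two samples should be accepted and the last three rejected, since each of those contains at least one character outside the allowed range.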
But I need help getting these Unicode patterns. – Ahmed Negm Jun 04 '17 at 09:38