For example, I have the string "anet.leverling@gmail.com", and a regex pattern for this string would be "[a-t]{4}.[a-z]{9}@[a-z]{5}.[a-z]{3}". I want to know how to write Java code that generates that regex from the given string.
-
Did you try anything? How would _you_ as a human describe the rules to create those regular expressions to someone else? If you can't come up with rules (not caring about syntax) it'll be hard to code something like this. You'd also need more examples to explain what you're trying to do. Since 1000s of expressions would match that string, how should the one you're after be found? – Thomas Dec 21 '22 at 07:08
-
Can I generate a generalized regex for a string, regardless of whether it also matches other strings? I just want a regex that would be a perfect match for that string. – GulamMd Dec 21 '22 at 07:14
-
`.*` matches that string. So does `anet\.leverling@gmail\.com`, and so do an astronomical number of other regexes. Which is "correct"? – tgdavies Dec 21 '22 at 07:44
-
This may help you https://stackoverflow.com/questions/201323/how-can-i-validate-an-email-address-using-a-regular-expression – tgdavies Dec 21 '22 at 07:50
1 Answer
In the general sense, you are asking for the impossible.
There is literally an infinite number of distinct regexes that would match that (single) example string. The specific regex you have provided is no more correct or incorrect than many others. (Indeed, all of them ... if you don't provide criteria for ranking the regexes for correctness!)
If you only provide a single example string, there is nothing to say whether the generated regex will work with "similar" strings, or exclude strings that are not "similar". And you haven't said what "similar" would mean.
Actually ... there is a fatuous solution to your requirements.
public String generatePattern(String input) {
    return ".*";
}
The ".*" regex will match any string, and therefore satisfies your requirement for a regex that will match your input string.
And if you want a regex that is a perfect match for only the input and nothing else, that is also fatuously simple.
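public String generatePattern(String input) {
    // "\\Q" and "\\E" are the Java string literals for the regex quoting markers \Q and \E.
    // Caveat: this breaks if the input itself contains "\E".
    return "\\Q" + input + "\\E";
}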
or just call Pattern.quote(input) to get a String containing an equivalent literal regex (it also copes with an input that contains "\E").
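As a minimal sketch of the exact-match approach, the following wraps Pattern.quote in a helper and compares it with using the raw input as a regex. The class name QuoteDemo and the helper exactPattern are made up for this illustration; only Pattern.quote and String.matches come from the JDK.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: returns a regex that matches exactly the given text.
    static String exactPattern(String input) {
        return Pattern.quote(input);
    }

    public static void main(String[] args) {
        String input = "anet.leverling@gmail.com";
        String regex = exactPattern(input);

        System.out.println(input.matches(regex));                      // true
        System.out.println("anetXleverling@gmail.com".matches(regex)); // false
        // Used unquoted, each '.' in the input is a wildcard, so this prints true.
        System.out.println("anetXleverling@gmail.com".matches(input));
    }
}
```

The last line shows why quoting matters: without it, the input string is itself a looser regex than you probably intended.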
To help you understand why your vague "according to the string" requirement is problematic, consider the following alternative regexes for that input.
[a-t]{4}.[a-z]{9}@[a-z]{5}.[a-z]{3}      // Your version
[a-t]{4}\.[a-z]{9}@[a-z]{5}\.[a-z]{3}    // We want '.' specifically
                                         // rather than any single char
[a-z]{4}.[a-z]{9}@[a-z]{5}.[a-z]{3}      // Why exclude uvwxyz from
                                         // the first 'name'?
[a-t]{4}.[e-v]{9}@[a-z]{5}.[a-z]{3}      // Why allow 'abcd' or 'wxyz'
                                         // in the second 'name'?
[a-zA-Z]{4}\.[a-zA-Z]{9}@[a-zA-Z]{5}.[a-zA-Z]{3}   // Uppercase letters are
                                                   // OK in email addresses
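To make the difference concrete, here is a small sketch that counts how many of the five regexes above accept a given string. The class name AlternativesDemo, the method countMatches, and the extra test strings are invented for this illustration; the regexes are copied verbatim from the list.

```java
import java.util.List;

class AlternativesDemo {
    // The five candidate regexes from the list above, as Java string literals.
    static final List<String> CANDIDATES = List.of(
        "[a-t]{4}.[a-z]{9}@[a-z]{5}.[a-z]{3}",
        "[a-t]{4}\\.[a-z]{9}@[a-z]{5}\\.[a-z]{3}",
        "[a-z]{4}.[a-z]{9}@[a-z]{5}.[a-z]{3}",
        "[a-t]{4}.[e-v]{9}@[a-z]{5}.[a-z]{3}",
        "[a-zA-Z]{4}\\.[a-zA-Z]{9}@[a-zA-Z]{5}.[a-zA-Z]{3}"
    );

    // Counts how many candidates match the whole of the given text.
    static long countMatches(String text) {
        return CANDIDATES.stream().filter(text::matches).count();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("anet.leverling@gmail.com")); // 5
        System.out.println(countMatches("zzzz.leverling@gmail.com")); // 2
        System.out.println(countMatches("anetXleverling@gmail.com")); // 3
        System.out.println(countMatches("Anet.Leverling@Gmail.Com")); // 1
    }
}
```

All five accept your example, yet they disagree on every other string tried here, so the example alone cannot tell you which one you meant.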
Basically, if you want to say that your version is correct and the others are incorrect, then your requirements need to include a whole set of rules that allows the developer to distinguish the correct from the incorrect.
But if you do have those rules, then it is relatively straightforward to determine what pattern an example represents and generate the corresponding regex.
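As an illustration only, here is a sketch of such a generator under one assumed rule set; these rules are my invention, not something implied by your question. Each run of ASCII lowercase letters becomes [a-z]{n}, each run of uppercase letters becomes [A-Z]{n}, each run of digits becomes [0-9]{n}, and every other character is matched literally, escaped where needed. The class name RegexSketch and its helpers are hypothetical, and the escaping assumes the default regex flags (no COMMENTS mode).

```java
class RegexSketch {
    // Characters that are escaped with a backslash when matched literally.
    private static final String META = "\\.[]{}()<>*+-=!?^$|";

    // Assumed rule: map a character to a coarse class, or null for "literal".
    private static String classOf(char c) {
        if (c >= 'a' && c <= 'z') return "[a-z]";
        if (c >= 'A' && c <= 'Z') return "[A-Z]";
        if (c >= '0' && c <= '9') return "[0-9]";
        return null;
    }

    static String generatePattern(String input) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            String cls = classOf(c);
            if (cls == null) {
                // Literal character: escape it if it is a regex metacharacter.
                if (META.indexOf(c) >= 0) regex.append('\\');
                regex.append(c);
                i++;
                continue;
            }
            // Consume the whole run of characters in the same class.
            int start = i;
            while (i < input.length() && cls.equals(classOf(input.charAt(i)))) {
                i++;
            }
            regex.append(cls).append('{').append(i - start).append('}');
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        // Prints [a-z]{4}\.[a-z]{9}@[a-z]{5}\.[a-z]{3}
        System.out.println(generatePattern("anet.leverling@gmail.com"));
    }
}
```

Note that this produces [a-z]{4} rather than your [a-t]{4}. Narrowing each class to the smallest and largest letter actually seen would be a different rule, and choosing between such rules is exactly the decision the requirements have to make.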

-
Actually, you are correct, but let's say I have a column that stores only email addresses. I want my code to generate a regex that applies to all the values in the email column. Can I write code to do this in Java? – GulamMd Dec 21 '22 at 07:32
-
No, I want only this type of regex, "[a-t]{4}\.[a-z]{9}@[a-z]{5}\.[a-z]{3}", for the email type. – GulamMd Dec 21 '22 at 07:46
-
Well then, the answer is as I stated above. You need to specify clear rules for deciding what should and should not match. Once you have those rules ... yes you can write a regex generator. But frankly, this seems like using a sledgehammer to crack a walnut. It is simpler to just write the regex by hand. – Stephen C Dec 21 '22 at 07:48