What you're asking appears to be whether there is some algorithm or existing library that takes an input string (like "12$abc@#EF345") and a set of matches (like ["12", "abc", "EF", "345"]) and produces an "adequate" regex that would produce the matches, given the input string.
However, what does 'adequate' mean in this context? For your example, a simple answer would be: "12|abc|EF|345". However, it appears you expect something more like the generalised "\d+|[a-zA-Z]+".
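As a rough sketch of that "simple answer", the Python below builds an alternation from the literal matches and checks it against the input string. The helper name `literal_alternation` is hypothetical, introduced here only for illustration; it is one possible way to do this, not an established library function.

```python
import re


def literal_alternation(matches):
    """Build a regex that matches exactly the given literal strings.

    Hypothetical helper for illustration only.
    """
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(matches))
    # Try longer literals first so a short one cannot shadow a longer
    # one that starts with it (e.g. "a" versus "ab").
    ordered = sorted(unique, key=len, reverse=True)
    # Escape each literal so characters like "$" or "." are taken literally.
    return "|".join(re.escape(m) for m in ordered)


text = "12$abc@#EF345"
wanted = ["12", "abc", "EF", "345"]

pattern = literal_alternation(wanted)
found = re.findall(pattern, text)
print(pattern)
print(found)
```

This always "works" for the given input, but it generalises to nothing beyond the literals it was built from, which is why it is unlikely to be what you had in mind.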
Note that your generalisation makes a number of assumptions, for example that words in French, Swedish or Chinese shouldn't be matched. And numbers containing "," or "." are also not included.
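To make those assumptions concrete, here is a short Python check of the generalised pattern. The sample inputs other than your original string are made up for illustration.

```python
import re

generalised = re.compile(r"\d+|[a-zA-Z]+")

# The original example: the pattern finds the expected matches.
original = generalised.findall("12$abc@#EF345")

# A French word with an accented letter ("caf\u00e9" is "cafe" with
# e-acute): the accented letter is outside [a-zA-Z], so the word is cut short.
french = generalised.findall("caf\u00e9")

# Two Chinese characters: nothing in the pattern matches them.
chinese = generalised.findall("\u6b63\u5219")

# A number with a thousands separator and a decimal point is split
# into three separate matches.
number = generalised.findall("1,234.5")

print(original)  # ['12', 'abc', 'EF', '345']
print(french)    # ['caf']
print(chinese)   # []
print(number)    # ['1', '234', '5']
```

Whether any of these outcomes is a bug or the intended behaviour depends entirely on what you meant, which is exactly the information an automatic generator does not have.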
You cannot expect a generalised algorithm to make those kinds of distinctions, as those are essentially problems requiring general AI, understanding the problem domain at an abstract level and coming up with a suitable solution.
Another way of looking at it is: your question is the same as asking if there is some function or library that automates the work of a programmer (specific to the regex language). The answer is: no, not yet anyway, and by the time there is, there won't be people on StackOverflow asking or answering these questions, because we'll all be out of a job.
However, some more optimistic viewpoints can be found here: Is it possible for a computer to "learn" a regular expression by user-provided examples?