Is it possible to find regular expressions given matching examples?

examples = [
     "my name is Alice",
     "my name is Alex",
     "my good name is Bruce"
]

I need to programmatically find a regular expression that matches all of the examples above. For this example set, a suitable Python regex would be `my (.*)name is (.*)` or `my (\w+ )?name is (\w+)`.

EDIT: I need to write a script that automates the extraction of the regex for me.

Shivaprasad Bhat
  • try something like `my\\s?.*?\\s?name is (.*?)` – SMA Oct 23 '16 at 07:54
  • No, the question is how do I write a script that automatically extracts the regular expression when I present it with matching examples. You took a look at the examples and wrote a regex; I want a program to do that – Shivaprasad Bhat Oct 23 '16 at 07:56
  • Do you need to find the regex programmatically, meaning you want a piece of software to give you the matching regex? If so, maybe use `(my name is Alice)|(my name is Alex)|(my good name is Bruce)`. That will be easy to string together with code. – Tammo Heeren Oct 23 '16 at 07:57
  • That will fail if I try to match `my name is John` – Shivaprasad Bhat Oct 23 '16 at 07:58
  • `my name is John` is not in your examples. How should a piece of code know what portion of the sentence is the one that is allowed to vary? – Tammo Heeren Oct 23 '16 at 08:00
  • `.*` will match all of them – hek2mgl Oct 23 '16 at 08:03
  • It can know that because it is presented with multiple examples. I can easily do this when the example set contains `my name is connor kelly`, `my name is alex`, `my name is bruce`. My script would generate the regex `my name is (.*)`, which would match `my name is john` as well. Since, in the example above, `my good name is bruce` has an extra word `good` in between, I somehow need to bring all the examples to the same length (length = number of words here). – Shivaprasad Bhat Oct 23 '16 at 08:05
  • What you have in mind is impossible – hek2mgl Oct 23 '16 at 08:07
  • @ShivaprasadBhat [Here](http://stackoverflow.com/a/3141762/1192111)'s a possible solution to your problem (it's in Ruby, so you'll have to translate it to Python) – Francisco Oct 23 '16 at 08:08
  • @hek2mgl It is not impossible. One answer is provided by Francisco Couzo – Shivaprasad Bhat Oct 23 '16 at 08:20
  • Did you *read*, *understand* and *implement* that? Meaning, did you solve your problem or not? – hek2mgl Oct 23 '16 at 12:54
  • Yes. I implemented it the same day I asked the question. I just uploaded a minimal version of the regex utilities I have been working on to GitHub: https://github.com/shivylp/RegexUtils – Shivaprasad Bhat Dec 04 '16 at 05:56
  • P.S. I did not refer to the link posted by Francisco Couzo, so I don't know the approach used there. I found my own. – Shivaprasad Bhat Dec 04 '16 at 06:13

2 Answers

If it is just that, try:

import re
regex = '(' + ')|('.join(re.escape(e) for e in examples) + ')'  # escape regex metacharacters in each example
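
A quick way to see what such an alternation accepts is to test it against both the given examples and an unseen sentence. This is only a sketch; the names `alternation`, `matches_seen` and `matches_unseen` are illustrative, not part of the answer.

```python
import re

examples = [
    "my name is Alice",
    "my name is Alex",
    "my good name is Bruce",
]

# One capturing group per example, each escaped so it is matched literally.
alternation = "|".join("(" + re.escape(e) + ")" for e in examples)

# Every example given to the builder is matched in full...
matches_seen = all(re.fullmatch(alternation, e) for e in examples)

# ...but a sentence that was not in the list is not matched.
matches_unseen = re.fullmatch(alternation, "my name is John") is not None

print(matches_seen, matches_unseen)
```

The alternation only memorises its inputs, which is the limitation raised in the comments on the question.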
Tammo Heeren

I implemented a solution myself. I made a small Python package out of it and put it in my GitHub repo at http://github.com/shivylp/RegexUtils

If anyone is looking for something similar, feel free to use it.
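
The approach used in the package is not described in this thread, so the following is not its implementation. It is only a rough sketch of one way the idea could work, assuming the examples can be compared word by word: keep the words all examples share (found here with `difflib.SequenceMatcher`) and turn the places where they differ into capture groups. The function names `common_tokens` and `infer_regex` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def common_tokens(a, b):
    """Tokens shared by lists a and b, in order (difflib matching blocks)."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        shared.extend(a[block.a:block.a + block.size])
    return shared


def infer_regex(examples):
    """Build one pattern that full-matches every example (word-level sketch)."""
    token_lists = [e.split() for e in examples]

    # Fold the examples down to the words they all share, in order.
    skeleton = token_lists[0]
    for tokens in token_lists[1:]:
        skeleton = common_tokens(skeleton, tokens)
    if not skeleton:
        return "(.*)"  # nothing in common: only the trivial pattern is left

    # used[i] counts examples with extra words before skeleton[i];
    # used[-1] counts examples with extra words after the last shared word.
    used = [0] * (len(skeleton) + 1)
    for tokens in token_lists:
        pos = 0
        for i, word in enumerate(skeleton):
            found = tokens.index(word, pos)
            used[i] += found > pos
            pos = found + 1
        used[-1] += pos < len(tokens)

    total = len(token_lists)
    parts = []
    for i, word in enumerate(skeleton):
        if i > 0:
            parts.append(r"\s+")
        if used[i] == total:
            parts.append(r"(.+?)\s+")        # every example has words here
        elif used[i]:
            parts.append(r"(?:(.+?)\s+)?")   # only some examples do
        parts.append(re.escape(word))
    if used[-1] == total:
        parts.append(r"\s+(.+)")
    elif used[-1]:
        parts.append(r"(?:\s+(.+))?")
    return "".join(parts)


examples = [
    "my name is Alice",
    "my name is Alex",
    "my good name is Bruce",
]
pattern = infer_regex(examples)
print(pattern)
print(re.fullmatch(pattern, "my name is John") is not None)
```

For the three examples from the question, the shared words are `my`, `name` and `is`, so the optional word `good` and the trailing name become groups, and an unseen sentence such as `my name is John` is matched as well. How far such a pattern should generalise is a design choice; as noted in the comments, `.*` also matches all of the examples.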

Shivaprasad Bhat