
I have been looking for a way to join words that are split by a slash or a hyphen. For instance, the following:

re/max accessory source
re-max accessory
tri/max delivery
tri-max food

I want to join the two parts only when each side of the hyphen or slash is at most 3 letters long.

The output should be:

remax accessory source
remax accessory
trimax delivery
trimax food

I was expecting to use a lookbehind before the slash or hyphen and a lookahead for the second part after it. Something like the following:

(?<=[a-zA-Z]{1-3})\/(?=[a-zA-Z]{1-3})

Any help on this is highly appreciated.

John Barton
  • What would you like to obtain? Extract the 1st column? Is RegEx a requirement? – Pitto Dec 16 '19 at 21:56
  • The notation should be `{1,3}`. This will work with the [PyPi regex module](https://pypi.org/project/regex/) and will match the `/` only – The fourth bird Dec 16 '19 at 21:56
  • Thanks @The fourth bird for the fix in the ranges. I added that fix, but it still allows words longer than 3 letters. For instance, `maximus/laptop company` should not match the pattern, while `abc/xyz university` should – John Barton Dec 16 '19 at 22:03
  • Try using word boundaries `(?<=\b[a-zA-Z]{1,3})[/-](?=[a-zA-Z]{1,3}\b)` – The fourth bird Dec 16 '19 at 22:05
  • Perfect, it worked as expected. Thanks a lot. Please post it as a solution – John Barton Dec 16 '19 at 22:08
  • @JuanPerez The question is marked as duplicate. You confirmed that it worked, I was also creating a Python demo :) https://rextester.com/YPYFPC91855 – The fourth bird Dec 16 '19 at 22:13
  • When I use the regex in `text=re.sub(r'(?<=\b[a-zA-Z]{1,3})[/-](?=[a-zA-Z]{1,3}\b)','',text,0,re.IGNORECASE)`, I get the error `look-behind requires fixed-width pattern`. I read that this is common with alternations, but there is no alternation here. – John Barton Dec 16 '19 at 22:25
  • @JuanPerez You can use the pattern using the [PyPi regex module](https://pypi.org/project/regex/) which supports the quantifier `{1,3}` in the lookbehind. But if you want to remove the `-` or `/` you could also use 2 capturing groups `text=re.sub(r'(\b[a-zA-Z]{1,3})[/-]([a-zA-Z]{1,3}\b)',r'\1\2',"re-max accessory",0,re.IGNORECASE)` See https://ideone.com/8NPmCD – The fourth bird Dec 16 '19 at 22:33
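
The capturing-group suggestion in the last comment can be sketched with the standard `re` module alone, which avoids the `look-behind requires fixed-width pattern` error. This is a minimal sketch: the function name `join_short_parts` and the sample list are illustrative and not part of the thread, and `re.IGNORECASE` is left out because `[a-zA-Z]` already covers both cases.

```python
import re

# One to three letters on each side of a slash or hyphen, bounded by word
# boundaries. The two groups are kept and the separator between them is
# dropped, so no variable-width lookbehind is needed.
PATTERN = re.compile(r'\b([a-zA-Z]{1,3})[/-]([a-zA-Z]{1,3})\b')


def join_short_parts(text):
    """Remove '/' or '-' between two parts of at most three letters each."""
    return PATTERN.sub(r'\1\2', text)


samples = [
    "re/max accessory source",
    "re-max accessory",
    "tri/max delivery",
    "tri-max food",
    "maximus/laptop company",  # parts longer than 3 letters: left unchanged
]

for line in samples:
    print(join_short_parts(line))
```

With the sample lines from the question this prints `remax accessory source`, `remax accessory`, `trimax delivery` and `trimax food`, while `maximus/laptop company` is printed unchanged.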

1 Answer

I am not 100% sure I understood the requirement.

This is my attempt at extracting the 1st word of each line:

(^\w+[/-]\w+)\s.*
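
As a rough illustration of how this extraction pattern could be applied line by line in Python, here is a short sketch. The helper name `first_token` is hypothetical, and the character class is written as `[/-]`, since a `|` inside brackets would match a literal pipe rather than act as alternation. Note that this only extracts the first word; it does not remove the separator.

```python
import re

# First whitespace-delimited token of a line, provided it contains a slash
# or hyphen between two runs of word characters.
FIRST_TOKEN = re.compile(r'^(\w+[/-]\w+)\s.*')


def first_token(line):
    """Return the leading 'xxx/yyy' or 'xxx-yyy' token, or None if absent."""
    match = FIRST_TOKEN.match(line)
    return match.group(1) if match else None


for line in ["re/max accessory source", "tri-max food", "plain words here"]:
    print(first_token(line))
```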
Pitto