I know this question sounds similar to others, but see my notes below on those solutions.

I need a regex to use in Python to search for specific words in any order in a string. What I want to do is put brackets around the series of words.

Here is my example text:

apples, pears, bananas are fruit

pears, apples, and bananas are in the fruit basket

you can use bananas, pears, and apples in a fruit salad

he also likes pears, bananas, and apples for a snack

she likes cheese for a snack but also apples or pears

Ultimately, I want to be able to put brackets around the fruit series, like so:

[apples, pears, bananas] are fruit

[pears, apples, and bananas] are in the fruit basket

you can use [bananas, pears, and apples] in a fruit salad

he also likes [pears, bananas, and apples] for a snack

she likes cheese for a snack but also [apples or pears]

In doing my research, I found the following posts:

Multiple words in any order using regex

This solution did nothing when I tested it in Regex101.

Regex to match string containing two names in any order

This solution is very similar to the first and did not work either.

Regex to match multiple words in any order

This solution came the closest to working with some slight modification:

(APPLES|BANANAS|PEARS) replaced with [\1]

However, this puts brackets around each fruit listed rather than around the series:

[apples], [pears], [bananas] are fruit
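
To make the problem concrete, here is a minimal sketch of what that per-word substitution does in Python. The exact pattern and the `[\1]` replacement are my best reading of the attempt above, so treat both as assumptions. Each fruit is a separate match, so each one gets its own pair of brackets:

```python
import re

# Assumed reconstruction of the per-word attempt:
# a single capture group, replaced by itself wrapped in brackets.
per_word = re.compile(r"(apples|bananas|pears)", re.IGNORECASE)

line = "apples, pears, bananas are fruit"

# Every fruit is its own match, so the brackets go around each word.
bracketed = per_word.sub(r"[\1]", line)
print(bracketed)  # [apples], [pears], [bananas] are fruit
```

To bracket the series instead, the separators (commas, "and", "or") have to be part of the same match as the words.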

I'm obviously missing something, so I would appreciate whatever help someone could give me.

Thanks!

Konrad Rudolph
Heather
  • `(?i)\b(?:APPLES|BANANAS|PEARS)\b(?:\s*(?:,\s*)?(?:(?:or|and)\s+)?(?:APPLES|BANANAS|PEARS))*\b` and replace with `[\g<0>]`, see https://regex101.com/r/RKv9ci/1 – Wiktor Stribiżew Jan 06 '21 at 16:33

1 Answer

You can search with this regex, with the case-insensitive flag enabled:

\b(?:apples|pears|bananas)\b(?:(?:\s+(?:or|and)\s+|\s*,\s*)+(?:apples|pears|bananas)\b)*

and replace each match with [\g<0>], where \g<0> refers to the whole match.

RegEx Demo

Code:

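import re

# Sample input; any of the lines from the question works here
test_str = "pears, apples, and bananas are in the fruit basket"

# \b after each word keeps "apples" from matching inside e.g. "applesauce"
regex = r"\b(?:apples|pears|bananas)\b(?:(?:\s+(?:or|and)\s+|\s*,\s*)+(?:apples|pears|bananas)\b)*"

# \g<0> is the whole match, so the brackets wrap the entire series
subst = r"[\g<0>]"

# Pass flags by keyword; positional count/flags are deprecated since Python 3.13
result = re.sub(regex, subst, test_str, flags=re.IGNORECASE)
print(result)  # [pears, apples, and bananas] are in the fruit basket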
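
If the word list is likely to change, the same idea could be wrapped up roughly like this. This is only a sketch: the names `FRUITS`, `build_series_pattern`, and `bracket_series` are made up for illustration and do not come from the question. It builds the alternation from a list, uses `re.VERBOSE` so each part of the pattern can carry a comment, and puts `\b` after every word so that something like "applesauce" is left alone:

```python
import re

# Hypothetical word list; swap in whatever words should form a series.
FRUITS = ["apples", "pears", "bananas"]


def build_series_pattern(words):
    """Compile a regex matching one or more of `words` joined by commas, 'and', or 'or'."""
    word = "(?:" + "|".join(re.escape(w) for w in words) + ")"
    return re.compile(
        rf"""
        \b{word}\b                            # first word of the series
        (?:
            (?:\s+(?:or|and)\s+ | \s*,\s*)+   # one or more separators
            {word}\b                          # next word of the series
        )*
        """,
        re.IGNORECASE | re.VERBOSE,
    )


def bracket_series(text, pattern):
    """Wrap every matched series in square brackets."""
    # In the replacement string, \g<0> stands for the entire match.
    return pattern.sub(r"[\g<0>]", text)


pattern = build_series_pattern(FRUITS)
lines = [
    "apples, pears, bananas are fruit",
    "pears, apples, and bananas are in the fruit basket",
    "you can use bananas, pears, and apples in a fruit salad",
    "he also likes pears, bananas, and apples for a snack",
    "she likes cheese for a snack but also apples or pears",
]
results = [bracket_series(line, pattern) for line in lines]
for r in results:
    print(r)
```

Run against the five sample lines from the question, this should print the bracketed versions the question asks for, for example `[pears, apples, and bananas] are in the fruit basket`. A single fruit on its own also counts as a series of one and gets bracketed; if that is not wanted, changing the final `*` to `+` would require at least two words.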
anubhava