1

I have spent three hours working with regex trying to figure this out.

I am trying to create a regular expression that finds a group of specific words or numbers in any order, where the group contains no other words or numbers besides the selected ones. Punctuation and spaces are fine.

Example:

The words I am searching for are one,three,four

one,three,four apple is red. should match

four, three, one orange is orange. should match

one,four,three the sky is blue. should match

one,two,four,three should not match, because two is inside the group of words

four,one,eight,green,three should not match either, because eight and green are inside the group of selected words

I did some research and found a way to match a statement that contains a group of words in any order:

Regex: I want this AND that AND that... in any order

The only problem is that if I put a word that is not part of the selected group in between the group, it is still considered a match. I don't want that.
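To make the problem concrete, here is a minimal sketch. It assumes the "any order" approach means one lookahead per required word; the exact pattern is an assumption and is not quoted from the linked question.

```javascript
// Assumed pattern: one lookahead per required word, so order does not matter.
const anyOrder = /^(?=.*\bone\b)(?=.*\bthree\b)(?=.*\bfour\b).*$/;

const anyOrderSamples = [
  "one,three,four apple is red.",       // wanted: match
  "four, three, one orange is orange.", // wanted: match
  "one,two,four,three",                 // wanted: no match
];

// Each lookahead only checks that its word occurs somewhere in the line,
// so the extra word "two" does not prevent a match.
const anyOrderResults = anyOrderSamples.map((s) => anyOrder.test(s));
console.log(anyOrderResults); // [ true, true, true ]
```

The third line matches even though it should not, which is the behaviour described above.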

So I did some more research on how to select specific words and found this:

https://superuser.com/questions/903168/how-should-i-write-a-regex-to-match-a-specific-word

It works when only one of the selected words is required, as in the following:

regex = (?:^|\W)one(?:$|\W)

I entered "one" and it found one.

I entered "three two four one" and it also found one.
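The two checks above can be reproduced with a short sketch. The variable names and the "someone" test string are illustrative assumptions; only the pattern comes from the post.

```javascript
// The single-word pattern from above: "one" must be preceded by the start of
// the string or a non-word character, and followed by the end or a non-word character.
const singleWord = /(?:^|\W)one(?:$|\W)/;

const singleWordResults = {
  alone: singleWord.test("one"),                   // whole string is the word
  atEnd: singleWord.test("three two four one"),    // space before, end of string after
  insideOtherWord: singleWord.test("someone"),     // "e" before "one" is a word character
};
console.log(singleWordResults);
// { alone: true, atEnd: true, insideOtherWord: false }
```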

But when I put the expressions together, I get this:

^(?:^|\W)one(?:$|\W)(?:^|\W)three(?:$|\W)(?:^|\W)four(?:$|\W).*$

But it finds nothing. What am I doing wrong?

For "three two four", it finds nothing.
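One way to see what the combined pattern demands is sketched below, assuming it runs without the multiline flag. Between two words, `(?:$|\W)(?:^|\W)` can then only succeed by consuming two non-word characters, and the words must appear in the fixed order one, three, four. The test strings are illustrative.

```javascript
// The combined pattern from above, unchanged.
const combined = /^(?:^|\W)one(?:$|\W)(?:^|\W)three(?:$|\W)(?:^|\W)four(?:$|\W).*$/;

const combinedResults = {
  // Only one non-word character (",") between the words: no match.
  singleComma: combined.test("one,three,four"),
  // Comma plus space gives the two non-word characters the pattern requires.
  commaAndSpace: combined.test("one, three, four"),
  // "one" is missing and the order is fixed, so this cannot match either.
  otherWords: combined.test("three two four"),
};
console.log(combinedResults);
// { singleComma: false, commaAndSpace: true, otherWords: false }
```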

The website I use is http://regexr.com/

Richard Twitty

2 Answers

0

Based on your comment that the delimiter is always a comma, you can use a negative lookahead assertion to reject lines that contain any further delimiter after the third word, like:

^\w+\s*(.)\s*\w+\s*\1\s*\w+(?:(?!\1).)*$

Here I captured the delimiter with a capture group, but you can also use it directly:

^\w+\s*,\s*\w+\s*,\s*\w+(?:(?!,).)*$
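As a rough check, the comma version can be run against the example lines from the question. This is a sketch; the variable names and the extra "one,two,four" line are illustrative assumptions.

```javascript
// Three words separated by commas, then no further comma until the end of the line.
const threeItems = /^\w+\s*,\s*\w+\s*,\s*\w+(?:(?!,).)*$/;

const threeItemsLines = [
  "one,three,four apple is red.",
  "four, three, one orange is orange.",
  "one,four,three the sky is blue.",
  "one,two,four,three",          // a fourth item follows, so the lookahead rejects it
  "four,one,eight,green,three",  // same reason
];
const threeItemsResults = threeItemsLines.map((line) => threeItems.test(line));
console.log(threeItemsResults); // [ true, true, true, false, false ]

// Note: \w+ accepts any word, so this pattern counts items rather than checking
// which words they are.
const anyThreeWords = threeItems.test("one,two,four");
console.log(anyThreeWords); // true
```

Because `\w+` accepts any word, this form only limits the number of items.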

or, with the words as string literals:

^(?:one|three|four)\s*,\s*(?:one|three|four)\s*,\s*(?:one|three|four).*$

const regex = /^(?:one|three|four)\s*,\s*(?:one|three|four)\s*,\s*(?:one|three|four).*$/gm;
const str = `one,three,four apple is red.       match

four, three, one orange is orange. match

one,four,three the sky is blue.    match

one,two,four,three not match

four,one,eight,green,three no match`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
Shakiba Moshiri
0

A combination of alternation | and negative lookaheads (?!..) with backreferences \n should do it:

(one|three|four)\s*,\s*(?!\1)(one|three|four)\s*,\s*(?!\1|\2)(one|three|four)

You just match one legal value, then in the next sequence you do it again, but this time there's a negative lookahead to preclude what you already matched (which you represent with backreferences).
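A small sketch of that pattern applied to the question's examples follows. The variable names and the "one,one,three" line are illustrative assumptions; the pattern itself is used as written, without anchors.

```javascript
// Each later group has a lookahead that rules out the words already captured.
const distinctWords =
  /(one|three|four)\s*,\s*(?!\1)(one|three|four)\s*,\s*(?!\1|\2)(one|three|four)/;

const distinctLines = [
  "one,three,four apple is red.",
  "four, three, one orange is orange.",
  "one,four,three the sky is blue.",
  "one,two,four,three",          // "two" breaks the run of allowed words
  "four,one,eight,green,three",  // "eight" and "green" break it
  "one,one,three",               // the repeated word is rejected by (?!\1)
];
const distinctResults = distinctLines.map((line) => distinctWords.test(line));
console.log(distinctResults); // [ true, true, true, false, false, false ]
```

Since the pattern is unanchored, it looks for the three words anywhere in the line; adding `^` would be one way to require them at the start.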

https://regex101.com/r/FwzBY9/1/

Scott Weaver