I am trying to write a regexp that will find strings that contain a combination of certain words but only those words and nothing else.
To give an example, let's say I want to find strings that contain any combination of red/green/blue separated by a comma, but I do not want to find anything that contains other colours.
VALID EXAMPLES
"red,green,blue"
"red"
"green,red"
INVALID EXAMPLES
"red,green,yellow"
"blue,pink"
I have gotten around this for now by using regexp_contains(string, 'red|green|blue') combined with a NOT regexp_contains on a list of other colours. However, this only works with a finite list of possibilities (i.e., when I know that the only values that could possibly be in the comma-separated list are a specific subset of colours). Is there a way to say "find me a string that is exactly any combination of these words and nothing else"?
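To make the limitation concrete, here is a rough Python sketch of that workaround. The names ALLOWED, KNOWN_OTHERS and workaround are made up for illustration, and the "other colours" list is just an assumed sample.

```python
import re

# Hypothetical patterns mirroring the workaround described above.
ALLOWED = re.compile(r"red|green|blue")
# The "other colours" list has to be maintained by hand (assumed sample).
KNOWN_OTHERS = re.compile(r"yellow|pink")

def workaround(s: str) -> bool:
    # Accept if at least one allowed colour appears and no known other colour does.
    return bool(ALLOWED.search(s)) and not KNOWN_OTHERS.search(s)

print(workaround("red,green,blue"))    # True
print(workaround("red,green,yellow"))  # False
print(workaround("red,purple"))        # True, because "purple" is not in the exclusion list
```

The last call is the problem case: any colour missing from the exclusion list slips through.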
I am doing this in BigQuery but can easily do this using Python or something else if BigQuery regex does not support what is required.
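For reference, the behaviour I am after could be sketched in Python roughly like this, assuming the whole string has to match "one allowed word, then zero or more comma-plus-allowed-word repetitions". The names ALLOWED_WORDS, PATTERN and is_only_allowed are made up for illustration.

```python
import re

# Hypothetical list of the only words that may appear.
ALLOWED_WORDS = ["red", "green", "blue"]

# One allowed word, followed by zero or more ",word" repetitions.
word = "|".join(map(re.escape, ALLOWED_WORDS))
PATTERN = re.compile(rf"(?:{word})(?:,(?:{word}))*")

def is_only_allowed(s: str) -> bool:
    # fullmatch requires the entire string to match, not just a part of it.
    return PATTERN.fullmatch(s) is not None

for s in ["red,green,blue", "red", "green,red", "red,green,yellow", "blue,pink"]:
    print(s, is_only_allowed(s))
```

This sketch also accepts repeats such as "red,red". I would assume the BigQuery equivalent is the same pattern wrapped in ^ and $ and passed to regexp_contains, e.g. '^(?:red|green|blue)(?:,(?:red|green|blue))*$', but I have not tested that there.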
Inb4, I am fully aware I can just deaggregate the data and filter, this is honestly more just about curiosity in terms of regex.