Nested square brackets

Question

I have a pattern like this:

word word one/two/three word

I want to match the groups that are separated by /. My thinking is as follows:

[\w]+ # match any words
[\w]+/[\w]+ # followed by / and another word
[\w]+[/[\w]+]+ # repeat the latter

But this does not work: as soon as I add a ], it seems to close the outermost [ rather than the innermost one.

How can I work with nested square brackets?

This is not how sets (`[]`) work in regex. You probably want groups (`()`). – , Jun 11 '18 at 15:24
Wait...do you want to match the individual terms in `one/two/three`, or do you want to match the terms themselves? — Tim Biegeleisen, Jun 11 '18 at 15:26
Your incorrect idea of what square brackets mean is a very common FAQ. I suggest people stay away from square brackets entirely at least until they are reasonably familiar with basic regex. — tripleee, Jun 11 '18 at 15:39
You have a TARGET STRING like this `word word one/two/three word`. What do you specifically want to extract from that ? — , Jun 11 '18 at 16:06

Tim Biegeleisen · Answer 1 · 2018-06-11T15:36:24.377

0

Here is one option using re.findall:

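import re
# Named `text` rather than `input`, which would shadow the built-in function.
text = "word word one/two/three's word apple/banana"
# One run of allowed characters, then one or more "/run" repetitions.
r1 = re.findall(r"[A-Za-z0-9'.,:;]+(?:/[A-Za-z0-9'.,:;]+)+", text)
print(r1)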

["one/two/three's", 'apple/banana']
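For the original string from the question, the same idea can be written with `\w` alone. The following is a minimal sketch, assuming the goal is first to find each slash-separated run and then to split it into its terms; the variable names are only illustrative.

```python
import re

text = "word word one/two/three word"

# A non-capturing group (?:...) repeats the "/word" part as a unit;
# square brackets would only define a set of single characters.
pattern = re.compile(r"\w+(?:/\w+)+")

matches = pattern.findall(text)
print(matches)  # ['one/two/three']

# Split each match on "/" to get the individual terms.
terms = [m.split("/") for m in matches]
print(terms)  # [['one', 'two', 'three']]
```

The group is non-capturing on purpose: with a capturing group, `re.findall` would return only the text of the group's last repetition instead of the whole match.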

edited Jun 11 '18 at 15:36

answered Jun 11 '18 at 15:24


What if the string is `one/two/three's`, how do I change the regex to cover more than just `\w`? – user1406177 Jun 11 '18 at 15:33
@user1406177 The easiest way to do this might be to just add that punctuation to a character class, along with alphanumeric characters. – Tim Biegeleisen Jun 11 '18 at 15:36

score 0 · Answer 2 · answered Jun 11 '18 at 15:38

I suggest the following answer, which uses a simple regex very close to yours and to @Tim Biegeleisen's answer, but not exactly the same:

import re
words = "word word one/two/three's word other/test"
result = re.findall(r"[\w']+/[\w'/]+", words)
print(result)  # ["one/two/three's", 'other/test']
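One practical difference is that this pattern puts `/` inside the second character class, so it also accepts trailing or doubled slashes, whereas a group-based pattern requires a term after every slash. The comparison below is a rough sketch on a made-up sample string; the names `strict` and `loose` are only illustrative.

```python
import re

# Group-based variant: every "/" must be followed by at least one term character.
strict = re.compile(r"[\w']+(?:/[\w']+)+")
# Pattern from this answer: "/" is just another allowed character after the first slash.
loose = re.compile(r"[\w']+/[\w'/]+")

sample = "a/b/ c//d e/f"

strict_matches = strict.findall(sample)
loose_matches = loose.findall(sample)

print(strict_matches)  # ['a/b', 'e/f']
print(loose_matches)   # ['a/b/', 'c//d', 'e/f']
```

Which behaviour is preferable depends on whether inputs such as `c//d` should count as a match.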
