I may be asking a repeated question, but I could not find a solution to my problem, so please bear with me.
I need to capture phrases enclosed in quotes using a regex. That is easy, but a problem arises when the quotes are not uniform, as in the following case:
'सीक्रेट सुपरस्टार'
and ‘ डॉन 2 ’
I tried `re.findall(r"['(.*?)' |‘(.*?)’] ", text)`, but this does not work.
I need a single regex that finds phrases enclosed in different types of quotes.
Seema Mudgil
- Remove whitespace and `[` and `]`. – Wiktor Stribiżew Aug 11 '17 at 09:51
- This answer might help you: https://stackoverflow.com/a/9523932/5513005 – Yash Karanke Aug 11 '17 at 09:54
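
The first comment's suggestion can be sketched roughly as follows. This assumes Python 3 and uses the sample text from the question; the names `text`, `pairs`, and `phrases` are only illustrative. Inside `[...]` the quotes, parentheses, and `|` are treated as members of a character class, so the original pattern matches single characters rather than quoted phrases.

```python
import re

text = "'सीक्रेट सुपरस्टार' and ‘ डॉन 2 ’"

# Without the brackets and stray spaces, each alternative keeps its own group.
pairs = re.findall(r"'(.*?)'|‘(.*?)’", text)

# With two groups, findall returns one tuple per match;
# the group that did not take part is an empty string.
phrases = [a or b for a, b in pairs]
print(phrases)
```

Any spaces inside the quotes are kept as part of the captured phrase, so a `.strip()` may be wanted afterwards.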
1 Answer
You may use
(?:(')|(‘))(.*?)(?(1)'|(?(2)’))
See the regex demo.
Details
- `(?:(')|(‘))` - match and capture `'` (put it into Group 1) or match and capture `‘` (and put it into Group 2)
- `(.*?)` - match any 0+ chars other than line break chars, as few as possible (Group 3)
- `(?(1)'` - if Group 1 matched, match `'`
- `|` - else
- `(?(2)’` - if Group 2 matched, match `’`
- `))` - end of conditional construct.
See the Python 2.7 demo below:
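# -*- coding: utf-8 -*-
import re

# Group 3 holds the phrase; the conditional picks the matching closing quote.
rx = ur'''(?:(')|(‘))(.*?)(?(1)'|(?(2)’))'''
s = u"'सीक्रेट सुपरस्टार' and ‘ डॉन 2 ’"
for x in re.finditer(rx, s):
    print(x.group(3).encode("utf8"))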
Output:
सीक्रेट सुपरस्टार
डॉन 2
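
For readers on Python 3, a rough equivalent of the demo might look like the sketch below. The `ur` prefix is not accepted in Python 3, so a plain raw string is used instead; the `.strip()` call is an addition here to drop the padding inside the curly quotes, and the variable names are illustrative.

```python
import re

# Same pattern as in the answer; Python 3 strings are already Unicode.
rx = re.compile(r"""(?:(')|(‘))(.*?)(?(1)'|(?(2)’))""")
s = "'सीक्रेट सुपरस्टार' and ‘ डॉन 2 ’"

# Group 3 holds the phrase; strip() removes spaces just inside the quotes.
phrases = [m.group(3).strip() for m in rx.finditer(s)]
print(phrases)
```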

Wiktor Stribiżew
- Thanks for the answer. But I need to add more conditions to check for phrases, such as text enclosed in " सुपरस्टार " or some other type of quotes. With the above solution I can capture only 2 conditions. Is there a way to include multiple conditions? – Seema Mudgil Aug 16 '17 at 06:57
- Yes, just add more capturing groups as alternatives in the first `(?:...)` group, and add more checks to the conditional construct at the end. You might also try another way of matching the strings, like `["'‘](.*?)["'’]`. See [this Python demo](https://ideone.com/GKXCjL). Or even [`['"‘]([^'"‘]*)['"’]`](https://ideone.com/Y6Dazl). Check these regexes [**here**](https://regex101.com/r/5D4SpO/1). – Wiktor Stribiżew Aug 16 '17 at 07:02
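
One possible way to scale this to many quote styles, sketched here under assumptions rather than taken from the answer, is to build one alternative per opening/closing pair from a table. This swaps the conditional construct for plain alternation. `QUOTE_PAIRS` and `quoted_phrases` are hypothetical names, the pairs listed are only examples, and Python 3 is assumed.

```python
import re

# Hypothetical table of opening -> closing quotes; extend as needed.
QUOTE_PAIRS = {"'": "'", '"': '"', "‘": "’", "“": "”"}

# One alternative per pair, so each opener is closed only by its own closer.
pattern = re.compile(
    "|".join(
        re.escape(opener) + "(.*?)" + re.escape(closer)
        for opener, closer in QUOTE_PAIRS.items()
    )
)

def quoted_phrases(text):
    """Return the stripped contents of every quoted phrase in text."""
    results = []
    for m in pattern.finditer(text):
        # Exactly one group takes part in each match; the rest are None.
        inner = next(g for g in m.groups() if g is not None)
        results.append(inner.strip())
    return results

sample = "'सीक्रेट सुपरस्टार' and ‘ डॉन 2 ’ and \" सुपरस्टार \""
print(quoted_phrases(sample))
```

Unlike the character-class variants in the comment, this keeps openers and closers paired, though an apostrophe inside ordinary text can still start a false match.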