I am looking for a string in the following format:

"5\'x5\'-ClimateControlled(xxxxxxx)IndooraccessUpperLevelElevatoraccess"

Where the (xxxxxxx) wildcard portion can be any combination of numbers, letters, or symbols. I have found that if the wildcard portion is a number, the following works:

import re

# Note: '.' matches exactly one character (any character except a newline).
pattern = '5\'x5\'-ClimateControlled.IndooraccessUpperLevelElevatoraccess'
regex = re.compile(pattern)
regex.findall(raw)

However, sometimes the wildcard portion is a string or symbols, in which case the search returns nothing. What is the syntax for a true wildcard search where the portion in the middle can be anything?

David Yang
  • I think you just need `r'\(([^()]*)\)'` – Wiktor Stribiżew Oct 30 '16 at 20:19
  • Would the pattern string then be: "5\'x5\'-ClimateControlled \(([^()]*)\)IndooraccessUpperLevelElevatoraccess"? – David Yang Oct 30 '16 at 20:20
  • Your question is unclear. If there are no round brackets, you may use `\S*` (zero or more non-whitespace chars) or even what JosefScript suggests - `.*`. – Wiktor Stribiżew Oct 30 '16 at 20:36
  • Try [`r"5\\'x5\\'-ClimateControlled((?:(?!5\\'x5\\'-ClimateControlled|IndooraccessUpperLevelElevatoraccess).)*)IndooraccessUpperLevelElevatoraccess"`](https://regex101.com/r/OuTCGp/1) (do not copy/paste from here as SO adds junk chars into the code in comments, the regex can be copied [from here](https://regex101.com/r/OuTCGp/1)) – Wiktor Stribiżew Oct 30 '16 at 22:34
  • If in fact there are no backslashes before single apostrophes, remove the backslashes from the pattern, too. – Wiktor Stribiżew Oct 31 '16 at 07:36
  • Any feedback, or shall we close the question as unclear? – Wiktor Stribiżew Nov 03 '16 at 11:57

1 Answer

r'5\\\'x5\\\'-ClimateControlled.{0,10}IndooraccessUpperLevelElevatoraccess'

If the original string contains literal backslashes and apostrophes, each of them has to be escaped with a backslash in the pattern. `.{0,10}` matches any run of zero to ten characters (newlines excluded by default).
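As a minimal sketch of the bounded wildcard, the snippet below builds the pattern with `re.escape` so the literal parts do not need hand-written escapes. The sample text is hypothetical (the real `raw` is not shown in the question) and assumes plain apostrophes without backslashes; `prefix`, `suffix`, and `bounded` are illustrative names.

```python
import re

# Hypothetical sample text; the actual `raw` from the question is not shown.
raw = (
    "5'x5'-ClimateControlled42IndooraccessUpperLevelElevatoraccess "
    "5'x5'-ClimateControlled(#A7)IndooraccessUpperLevelElevatoraccess"
)

# re.escape handles any regex metacharacters in the literal parts.
prefix = re.escape("5'x5'-ClimateControlled")
suffix = re.escape("IndooraccessUpperLevelElevatoraccess")

# .{0,10} allows zero to ten arbitrary characters (no newlines) in the middle.
bounded = re.compile(prefix + r"(.{0,10})" + suffix)

# With one capture group, findall returns only the captured middle portions.
middles = bounded.findall(raw)
print(middles)  # ['42', '(#A7)']
```

If the text really does contain backslashes before the apostrophes, passing that exact literal text to `re.escape` should cover it as well.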

Alternatively you could use a negative lookahead:

r'5\\\'x5\\\'-ClimateControlled(?!5\\\'x5\\\'-ClimateControlled).*?IndooraccessUpperLevelElevatoraccess'
JosefScript
  • Interesting, I think I know what's going on now. I've tried this .* method before, but my text contains multiple instances of 'IndooraccessUpperLevelElevatoraccess', and it's giving me the entire string between the first '5\\\'x5\\\'-ClimateControlled' and the last 'IndooraccessUpperLevelElevatoraccess'. I actually only want the strings that have less than 10 characters in between the first and second portion – David Yang Oct 30 '16 at 20:42
  • Well, then try it with non-greedy: `.*?` Edited my answer. – JosefScript Oct 30 '16 at 20:48
  • Hmmm, still not working as intended. It keeps returning me more than I need. If we split the search pattern into: 'X.....Y', it keeps returning me 'X.....X.....X.....X....Y', when all I want is the last 'X.....Y' – David Yang Oct 30 '16 at 21:09
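The 'X.....X.....X....Y' behaviour described in the last comment can be reproduced and avoided with the approach from Wiktor Stribiżew's comment: consume the middle one character at a time, and refuse to step over another occurrence of either delimiter. The following is a sketch on hypothetical sample text, again assuming plain apostrophes; `lazy` and `tempered` are illustrative names.

```python
import re

# Hypothetical sample: two "dangling" prefixes, then one complete X...Y unit.
raw = (
    "5'x5'-ClimateControlled lots of unrelated text "
    "5'x5'-ClimateControlled more unrelated text "
    "5'x5'-ClimateControlled(#A7)IndooraccessUpperLevelElevatoraccess"
)

prefix = re.escape("5'x5'-ClimateControlled")
suffix = re.escape("IndooraccessUpperLevelElevatoraccess")

# Non-greedy .*? still starts at the FIRST prefix, so it spans later prefixes.
lazy = re.compile(prefix + r"(.*?)" + suffix)

# Each '.' is guarded by a negative lookahead, so the middle portion can
# never contain the start of another prefix or suffix.
tempered = re.compile(rf"{prefix}((?:(?!{prefix}|{suffix}).)*){suffix}")

print(len(lazy.findall(raw)))  # 1 (a single overlong match)
print(tempered.findall(raw))   # ['(#A7)']
```

To also enforce the "fewer than 10 characters" requirement, the `*` after the guarded group could be replaced with `{0,9}`.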