
I'm trying to match a whole word exactly using regex, where the word I'm searching for is a variable whose value comes from user input. I've tried this:

regex = r"\b(?=\w)" + re.escape(user_input) + r"\b"
if re.match(regex, string_to_search[i], re.IGNORECASE):
      <some code>...

but it matches every occurrence of the string. It matches "var" against "var", which is correct, but it also matches the "var" in "variable". I only want whole-word matches such as "var"->"var" or "string"->"string".

Input: "sword"

String_to_search = "There once was a swordsmith that made a sword"

Desired output: Match "sword" to "sword" and not "swordsmith"

E. Oregel
  • Please post your input and desired output. – Ajax1234 Aug 15 '17 at 15:52
  • Well, `\bvar\b` cannot match `var` in `variable`. Why are you using `re.match`? If you want to match user input as a whole string, you may use `regex = '{}$'.format(re.escape(user_input))` and then use `re.match()`. Else, if you need to really just find `var` as a whole word inside a larger string, you will need `re.search` with `\bvar\b` regex. – Wiktor Stribiżew Aug 15 '17 at 15:52
  • Doesn't python have a non regex function like a substring search ? –  Aug 15 '17 at 15:52
  • If Python supported conditionals you could wrap it in conditional boundaries `(?(?=\w)\b)(?: your literal )(?(?<=\w)\b)` And this `\b(?=\w)` forces the literal to start with a `\w` –  Aug 15 '17 at 15:58
  • @sln yeah, but it will find any substring, including "var" in "variable", which I don't want. I'm going to try the .format or re.search and the conditional – E. Oregel Aug 15 '17 at 16:06
  • Or, you could do it without a conditional `(?:(?=\w)\b|(?=\W))(?: your literal )(?:(?<=\w)\b|(?<=\W))` –  Aug 15 '17 at 16:13
  • After your edit, it looks like your question is a dupe of https://stackoverflow.com/questions/180986/what-is-the-difference-between-pythons-re-search-and-re-match – Wiktor Stribiżew Aug 15 '17 at 16:16
  • @WiktorStribiżew Your solution worked! thanks! I'll post the solution – E. Oregel Aug 15 '17 at 16:38
  • What solution, and why would you post it if it is mine? – Wiktor Stribiżew Aug 15 '17 at 16:39
  • `value coming from user input` You know, if you think about it, the user shouldn't care _where_ the string is found, right? It's only _you_ who thinks they can parse language using word boundaries. Word boundaries are problematic, use _whitespace boundaries_ `(?<!\S)(?: your literal )(?!\S)`. Doing the math now, that's 2 solutions I gave you that just didn't register. –  Aug 15 '17 at 17:06
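
The whitespace-boundary idea from the last comment can be sketched as follows. This is only an illustration: the helper name `whitespace_bounded` and the sample sentence with "c++" are assumptions made for the example, not something from the thread. It shows one case where `\b` fails, namely when the user input starts or ends with a non-word character.

```python
import re

def whitespace_bounded(literal):
    # Hypothetical helper: the literal must not touch a non-whitespace
    # character on either side (start/end of string also qualifies).
    return re.compile(r"(?<!\S)" + re.escape(literal) + r"(?!\S)", re.IGNORECASE)

text = "There once was a swordsmith that made a sword"

# Only the standalone "sword" at the end of the sentence is matched.
sword_match = whitespace_bounded("sword").search(text)
print(sword_match.group(), sword_match.start())

# Input ending in non-word characters: \b cannot match between "+" and " ",
# but the whitespace boundary can.
sample = "I like c++ a lot"  # assumed sample sentence
word_boundary_hit = re.search(r"\b" + re.escape("c++") + r"\b", sample)
whitespace_hit = whitespace_bounded("c++").search(sample)
print(word_boundary_hit, whitespace_hit is not None)
```

The trade-off is that whitespace boundaries will not match a word that is directly followed by punctuation, such as "sword." at the end of a sentence, so which boundary is appropriate depends on the text being searched.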

2 Answers


It seems you want a pattern that matches an entire string. Note that the \b word boundary is needed when you want to find partial matches, that is, a whole word inside a longer string. When you need a full string match, you need anchors. Since re.match anchors the match at the start of the string, all you need is $ (end of string position) at the end of the pattern:

regex = '{}$'.format(re.escape(user_input))

and then use

re.match(regex, search_string, re.IGNORECASE)
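
A minimal sketch of how this behaves, assuming each candidate is a single token such as one element of `string_to_search`. The helper name `is_exact` and the sample candidates are made up for the illustration. `re.fullmatch` (available since Python 3.4) is shown as an alternative that needs no explicit `$`.

```python
import re

user_input = "sword"
regex = "{}$".format(re.escape(user_input))

def is_exact(candidate):
    # re.match anchors at the start, "$" anchors at the end.
    return re.match(regex, candidate, re.IGNORECASE) is not None

print(is_exact("sword"))       # whole string equals the input
print(is_exact("Sword"))       # case-insensitive
print(is_exact("swordsmith"))  # longer word is rejected

# re.fullmatch requires the whole string to match without adding "$".
full = re.fullmatch(re.escape(user_input), "swordsmith", re.IGNORECASE)
print(full)

# One difference: "$" also matches just before a trailing newline,
# while re.fullmatch does not accept the extra newline.
with_newline_dollar = re.match(regex, "sword\n", re.IGNORECASE)
with_newline_full = re.fullmatch(re.escape(user_input), "sword\n", re.IGNORECASE)
```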
Wiktor Stribiżew

You can try re.finditer like this:

>>> import re
>>> user_input = "var"
>>> text = "var variable var variable"
>>> regex = r"(?=\b%s\b)" % re.escape(user_input)
>>> [m.start() for m in re.finditer(regex, text)]
[0, 13]

It'll find all matches iteratively.

pkacprzak
  • Where is the output saved to in the loop? Because i would like to break on the first finding and say if it was found, do this, else, do that. – E. Oregel Aug 15 '17 at 16:10
  • @E.Oregel then just put this into a loop like this: `for m in re.finditer(regex, text):` and do what you want inside the loop. – pkacprzak Aug 15 '17 at 16:21
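
For the "stop at the first hit, then do this or that" case raised in the comments, a loop may not be needed at all. The sketch below uses `re.search`, which returns the first match or `None`. The function name `contains_word` is an assumption for the example, and the pattern uses plain `\b` on both sides, so it assumes the user input starts and ends with a word character.

```python
import re

def contains_word(user_input, text):
    # Hypothetical helper: True if user_input occurs as a whole word in text.
    pattern = r"\b" + re.escape(user_input) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

text = "There once was a swordsmith that made a sword"

if contains_word("sword", text):
    print("found")      # branch for a whole-word hit
else:
    print("not found")  # branch when only substrings (or nothing) match
```

Unlike `re.match`, `re.search` scans the whole string, so the standalone "sword" at the end is found even though the string starts with other words.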