7

I'm curious why this doesn't work, and I need to know how to work around it. I'm trying to detect whether some input is a question. I'm pretty sure string.match is what I need, but:

print(string.match("how much wood?", "(how|who|what|where|why|when).*\\?"))

returns nil. I'm pretty sure Lua's string.match uses regular expressions to find matches in a string, as I've used wildcards (.) before with success, but maybe I don't understand all the mechanics? Does Lua require special delimiters in its string functions? I've tested my regular expression here, so if Lua used regular regular expressions, it seems like the above code would return "how much wood?".

Can any of you tell me what I'm doing wrong, what I mean to do, or point me to a good reference where I can get comprehensive information about how Lua's string manipulation functions utilize regular expressions?

Uronym

3 Answers

13

Lua doesn't use regex. Lua uses Patterns, which look similar but match different input.

.* is greedy, so it first consumes the last ? of the input, and \\? does not then match a literal question mark: special characters are escaped with %, not with a backslash. The question mark should be excluded from the wildcard.

"how[^?]*%?"
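A quick check of that pattern against the question's input, as a minimal sketch (the variable names are only for illustration). It also shows why the backslash form misleads: in a Lua pattern the backslash is an ordinary character, so \\? means "an optional backslash" and matches even when no question mark is present.

```lua
local input = "how much wood?"

-- %? is a literal question mark; [^?]* matches any run of characters
-- that are not question marks.
local fixed = string.match(input, "how[^?]*%?")
print(fixed)  --> how much wood?

-- "\\?" in Lua source is the two-character pattern \? (zero or one
-- backslash), so this matches input that has no question mark at all.
local loose = string.match("how much wood", "how.*\\?")
print(loose)  --> how much wood
```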

As Omri Barel said, there's no alternation operator. You probably need to use multiple patterns, one for each alternative word at the beginning of the sentence. Or you could use a library that supports regex-like expressions.
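One way the multiple-pattern approach could look, as a rough sketch rather than a definitive implementation. The helper name match_question and the question_words table are made up for this example, and the check only tests the prefix, so a word such as "however" would also pass the "how" pattern.

```lua
-- Hypothetical list of question words, taken from the question's pattern.
local question_words = { "how", "who", "what", "where", "why", "when" }

-- Hypothetical helper: returns the lowercased question, or nil if the
-- text does not start with one of the words and contain a "?".
local function match_question(text)
  local lowered = text:lower()
  for _, word in ipairs(question_words) do
    -- ^ anchors at the start; [^?]*%? runs up to the first question mark.
    local found = lowered:match("^" .. word .. "[^?]*%?")
    if found then
      return found
    end
  end
  return nil
end

print(match_question("how much wood?"))    --> how much wood?
print(match_question("omgwtflol? extra"))  --> nil
```

Trying the words one at a time stands in for the missing alternation operator, at the cost of one match call per word.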

kapex
  • Oh, thanks. I think it really confused me because patterns look a lot like regex, but a little different all the same. – Uronym Aug 21 '11 at 13:40
9

According to the manual, patterns don't support alternation.

So while "how.*" works, "(how|what).*" doesn't.
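A short illustration of that difference (the variable names are only for this example). Because | is an ordinary character in a Lua pattern, the parenthesised group only matches the literal text how|what:

```lua
local plain = string.match("how much wood?", "how.*")
print(plain)    --> how much wood?

local alt = string.match("how much wood?", "(how|what).*")
print(alt)      --> nil

-- The same pattern does match when the input literally contains "how|what";
-- string.match then returns the capture.
local literal = string.match("how|what now?", "(how|what).*")
print(literal)  --> how|what
```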

And kapep is right about the question mark being swallowed by the .*.

There's a related question: Lua pattern matching vs. regular expressions.

Omri Barel
-1

As already answered, this is because patterns in Lua are different from regex in other languages. If you have not yet managed to get a good pattern that does all the work, you can try this simple function:

local function capture_answer(text)
  local text = text:lower()
  local pattern = '([how]?[who]?[what]?[where]?[why]?[when]?[would]?.+%?)'
  for capture in string.gmatch(text, pattern) do
    return capture
  end
end

print(capture_answer("how much wood?"))

Output: how much wood?

That function will also help you if you want to find a question in a larger text string. For example:

print(capture_answer("Who is the best football player in the world?\nWho are your best friends?\nWho is that strange guy over there?\nWhy do we need a nanny?\nWhy are they always late?\nWhy does he complain all the time?\nHow do you cook lasagna?\nHow does he know the answer?\nHow can I learn English quickly?"))
Output:  
who is the best football player in the world? 
who are your best friends? 
who is that strange guy over there? 
why do we need a nanny? 
why are they always late? 
why does he complain all the time?
how do you cook lasagna? 
how does he know the answer? 
how can i learn english quickly?
Webrom
  • You're not capturing words, you're capturing character sets. One of the letters h, o, w or nothing, then one of the letters w, h, o or nothing (i.e. zero or one time), etc., then a whole bunch of anything and "?". Your function captures ANYTHING that ends with "?". Try `print(capture_answer("omgwtflol? extra"))` and get `omgwtflol?` back. – Oleg V. Volkov Mar 28 '21 at 23:46