I'm trying to write a regex to scan a codebase to find all instances of a function call, and return a list of the arguments to the function.
The function is called "t" (I know, helpful eh) and it takes a single string made up of segments of lowercase letters, digits or underscores, separated by ".", e.g.
t('foo.bar.baz')
t("chunky.chicken")
t('red.green.yellow.blue.1.2.3')
One immediate problem is distinguishing it from other function calls whose names end in "t", but I thought I could do that by preceding the regex with [^a-z], i.e. "not a lowercase letter".
Here's what I have so far, which I thought would work but doesn't, quite:
/[^a-z]t\(["']([a-z0-9_]+\.)+[a-z0-9_]+["']\)/
#with my thinking being as follows:
[^a-z] #not a letter
t #the character t
\( #open bracket
["'] #one instance of single or double quote
([a-z0-9_]+\.)+ #one or more instances of mix-of-letters-and-numbers-and-underscores followed by .
[a-z0-9_]+ #mix-of-letters-and-numbers-and-underscores
["'] #one instance of single or double quote
\) #close bracket
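For what it's worth, the breakdown above can be written as an actual regex using Ruby's free-spacing mode (the /x flag), which ignores whitespace and allows trailing comments. This is only a sketch of the same pattern; the constant name T_CALL and the shortened test line are made up for illustration.

```ruby
# Same pattern as above, spelled out with the /x (extended) flag.
# T_CALL is a hypothetical name used only for this example.
T_CALL = /
  [^a-z]            # not a lowercase letter
  t                 # the character t
  \(                # open bracket
  ["']              # one single or double quote
  ([a-z0-9_]+\.)+   # one or more segments, each followed by a dot
  [a-z0-9_]+        # the final segment
  ["']              # one single or double quote
  \)                # close bracket
/x

# Shortened version of the test line, for illustration only.
line = %q{<%= t('cmw.letter.yumu_invite_for') %>}
m = T_CALL.match(line)

puts m[0]  # the whole match, including the leading space
puts m[1]  # only the last repetition of the capture group
```

Here m[0] is the full matched text, while m[1] holds whatever the parenthesised group matched on its final repetition.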
I'm matching it against this test line:
s = "<span style=\"color: #999;font-size: 0.5em;display: block;\"><%= t('cmw.letter.yumu_invite_for') %>:</span> <%= h(pupil.first_name.capitalize) %>"
and it's returning "letter."
whereas I want it to return " t('cmw.letter.yumu_invite_for')"
I think the problem is this part:
([a-z0-9_]+\.)+ #one or more instances of mix-of-letters-and-numbers-and-underscores followed by .
If I change it so that instead of looking for "one or more instances of this pattern", it looks specifically for three segments with "." in between, then it works:
s = "<span style=\"color: #999;font-size: 0.5em;display: block;\"><%= t('cmw.letter.yumu_invite_for') %>:</span> <%= h(pupil.first_name.capitalize) %>"
regex = /[^a-z]t\(["'][a-z0-9_]+\.[a-z0-9_]+\.[a-z0-9_]+["']\)/
s.scan(regex)
=> [" t('cmw.letter.yumu_invite_for')"]
So I guess the "multiple instances of this pattern" bit doesn't work the way I think it does?
This is in Ruby but I think this might be a more general regex question.
EDIT - I just tried this in JavaScript and it works:
s.match(/[^a-z]t\(["']([a-z0-9_]+\.)+[a-z0-9_]+["']\)/)[0]
" t('cmw.letter.yumu_invite_for')"
So actually I think maybe this is a Ruby question after all.
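A minimal sketch of what seems to explain the difference: Ruby's String#scan returns the contents of the capture groups, rather than the whole match, whenever the pattern contains groups, and a repeated group only keeps its last repetition. Making the group non-capturing with (?:...) gives the whole match back. The shortened test string and the variable names below are only for illustration.

```ruby
# Shortened test string with two calls to t, for illustration only.
s = %q{<%= t('cmw.letter.yumu_invite_for') %>:</span> <%= t("chunky.chicken") %>}

capturing     = /[^a-z]t\(["']([a-z0-9_]+\.)+[a-z0-9_]+["']\)/
non_capturing = /[^a-z]t\(["'](?:[a-z0-9_]+\.)+[a-z0-9_]+["']\)/

# With a capture group, scan yields one array of group contents per match.
with_group = s.scan(capturing)
p with_group  # => [["letter."], ["chunky."]]

# Without capture groups, scan yields the whole matched text.
whole = s.scan(non_capturing)
p whole       # => [" t('cmw.letter.yumu_invite_for')", " t(\"chunky.chicken\")"]

# To get just the arguments, capture the argument itself and flatten.
args = s.scan(/[^a-z]t\(["']((?:[a-z0-9_]+\.)+[a-z0-9_]+)["']\)/).flatten
p args        # => ["cmw.letter.yumu_invite_for", "chunky.chicken"]
```

The JavaScript call in the edit above reads index [0] of the match result, which is always the whole match, so the capture group does not get in the way there.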