I've read a few of the other questions on R capture groups in regular expressions and I'm not having much luck.
I have a string:
127.0.0.1 - - [07/Dec/2014:06:43:43 -0800] \"OPTIONS * HTTP/1.0\" 200 - \"-\" \"Apache/2.2.14 (Ubuntu) PHP/5.3.2-1ubuntu4.24 with Suhosin-Patch mod_ssl/2.2.14 OpenSSL/0.9.8k mod_apreq2-20090110/2.7.1 mod_perl/2.0.4 Perl/v5.10.1 (internal dummy connection)\"
From which I am trying to capture a timestamp:
07/Dec/2014:06:43:43 -0800
The following call returns the whole match, brackets included:
regmatches(x,regexpr('\\[([\\w:/]+\\s[+\\-]\\d{4})\\]',x,perl=TRUE))
[1] "[07/Dec/2014:06:43:43 -0800]"
I've tried to capture the single group itself with str_match, using several variations of this regex:
str_match(x, "\\[([\\w:/]+\\s[+\\-]\\d{4})\\]")
[,1] [,2]
[1,] NA NA
None of them worked. Variations of this regex test correctly in most online regex testers, so I don't think the regex itself is the problem.
How can I get just the timestamp itself so I can pass it to strptime, without resorting to something like gsub to strip the brackets? gsub doesn't get the group for me and str_match doesn't work either. What am I missing? The ideal output would be
07/Dec/2014:06:43:43 -0800
which I could then use in strptime.
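For context, the strptime step I have in mind would look roughly like the sketch below. It assumes the timestamp has already been extracted without the brackets; the format string and the tz = "UTC" choice are my own guesses at what fits this log format, and %b relies on an English locale for the month abbreviation.

```r
# Hypothetical input: the timestamp as it should look once the capture group is extracted.
ts <- "07/Dec/2014:06:43:43 -0800"

# %d/%b/%Y   day / abbreviated month name (locale-dependent) / four-digit year
# %H:%M:%S   time of day
# %z         signed UTC offset such as -0800
# tz = "UTC" is an assumption: the parsed time is expressed in UTC.
parsed <- strptime(ts, format = "%d/%b/%Y:%H:%M:%S %z", tz = "UTC")

print(parsed)
```

If the month name fails to parse, the locale is the first thing I would check.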