Questions tagged [oniguruma]

Oniguruma (鬼車) is a BSD-licensed regular expression library that allows the character encoding (e.g. `UTF-8`, `EUC-JP`, `GB18030`) to be specified for each regular expression object. Use this tag for questions about Oniguruma regex syntax. Be sure to also tag the language in which the library is used.

Oniguruma (鬼車) is a BSD-licensed regular expression library that supports Unicode characters in encodings such as UTF-8, UTF-16 and EUC-JP. It allows the encoding to be specified for each regular expression object.

The Ruby programming language (since version 1.9) and PHP's multibyte string module (since PHP 5) use Oniguruma as their regular expression engine; Ruby 2.0 and later use Onigmo, a fork of Oniguruma. The library is usable from C and C++, and ports to Cocoa, Java and Erlang also exist.
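
As a rough illustration of the per-object encoding described above, here is a minimal Ruby sketch. It assumes a Ruby build whose regex engine is Oniguruma or its fork Onigmo; the variable names are made up for this example and are not part of any library API.

```ruby
# Two patterns for the same text, each carrying its own encoding.
# "\u9B3C\u8ECA" is the string 鬼車, written with escapes so the source stays ASCII.
word = "\u9B3C\u8ECA"

utf8_re  = Regexp.new(word)                  # pattern source is UTF-8
eucjp_re = Regexp.new(word.encode("EUC-JP")) # pattern source is EUC-JP

# Each regex object reports the encoding it was compiled with.
puts utf8_re.encoding
puts eucjp_re.encoding

utf8_text  = "Oniguruma (#{word})"
eucjp_text = utf8_text.encode("EUC-JP")

# Each pattern matches text in its own encoding.
puts utf8_re.match?(utf8_text)
puts eucjp_re.match?(eucjp_text)

# Mixing encodings is rejected rather than silently mismatched.
begin
  eucjp_re.match?(utf8_text)
rescue Encoding::CompatibilityError => e
  puts "mismatch: #{e.class}"
end
```

The point of the sketch is only that the encoding belongs to the regex object, not to the process or the library as a whole.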

The current version is 5.9.4 (c) K.Kosako, updated at: 2013/04/04

Official Page - http://www.geocities.jp/kosako3/oniguruma/

54 questions
44 votes, 1 answer

Regex to replace values that include part of match in replacement in sublime?

I've come up with this regex that finds all words that start with $ and contain _ underscores: \$(\w+)_(\w+) I'm basically searching for variables, like $var_foo etc. How do I replace stuff using the regex groups? For example, how can I remove the…
Alex
37 votes, 3 answers

How to use a regular expression to remove lines without a word?

I am using TextMate to edit a file. I would like to remove all the lines not containing a word. Here is an example. apple ipad hp touch pad samsung galaxy tab motorola xoom How can I remove all the lines not containing the word "pad", and get this…
Victor Lam
14 votes, 1 answer

Are Ruby 1.9 regular expressions equally powerful to a context free grammar?

I have this regular expression: regex = %r{\A(?<a> a\g<a>a | b\g<a>b | c)\Z}x When I test it against several strings, it appears to be as powerful as a context free grammar because it handles the recursion properly. regex.match("aaacaaa") #…
Ken Bloom
12 votes, 2 answers

Why does the =~ operator only sometimes have side effects?

I've noticed a side effect in Ruby/Oniguruma that is only present in 1 out of 4 seemingly equivalent statements. Why is the variable day defined in 009, but not in 003, 005 or 007? irb(main):001:0> r = /(?\d\d):(?\d\d)/ =>…
Staffan Nöteberg
8 votes, 1 answer

Optimization techniques for backtracking regex implementations

I'm trying to implement a regular expression matcher based on the backtracking approach sketched in Exploring Ruby’s Regular Expression Algorithm. The compiled regex is translated into an array of virtual machine commands; for the backtracking the…
siracusa
7 votes, 2 answers

Possessive generic quantifier {m,n}+ not implemented in Ruby 1.9.3?

Possessive quantifiers are greedy and refuse to backtrack. A regex /.{1,3}+b/ should mean: Match any character except line breaks, 1 to 3 times, as many as possible, and don't backtrack. Then match the character b. In this example: 'ab'.sub /.{1,3}+b/,…
Staffan Nöteberg
4 votes, 1 answer

Regex with ? quantifier inside passive group?

I'm editing a TextMate grammar for SQL. It currently has the regex (keywords omitted for clarity): (?i:^\s*(create)\s+(aggregate|function|(unique\s+)?index|table)\s+)(['"`]?)(\w+)\4 This correctly matches a function definition like CREATE FUNCTION…
Jay Levitt
4 votes, 2 answers

Grok/Oniguruma pattern to match first IP from X-Forwarded-For header

For this issue I'm trying to create a grok pattern that matches the first IP from the X-Forwarded-For header in an nginx log. A log line typically looks like this: 68.75.44.178, 172.68.146.54, 127.0.0.1 - - [15/May/2017:12:16:27 +0200] "GET…
sepal
4 votes, 1 answer

Regex Syntax for making the last character Uppercase in TextMate

I want to uppercase the last character in an address field, so that "300 E.87th St. #4b" becomes "300 E.87th St. #4B", but "87th" should not change to "87Th". Can I do that in TextMate? If so, what's the syntax? Thanks.
hadenp
3 votes, 1 answer

How do I specify a valid character property using Oniguruma regexes?

I'm using the oniguruma gem to get Unicode-aware regexes in Ruby 1.8. According to the syntax documentation, I should be able to use \p{M} or \p{Mark} to match code points with the Mark property. However, when I do the following ORegexp.new…
Simon
3 votes, 1 answer

Named subroutines in Oniguruma regex engine?

In Perl, you can do this: (?x) (?(DEFINE) (?<animal>dog|cat) ) (?&animal) In Ruby (Oniguruma engine), it seems that the (?(DEFINE... syntax is not supported. Also, (?&... becomes \g<...>. So, you can do this: (?x) (?<animal>dog|cat) \g<animal> But of…
MikeC8
3 votes, 2 answers

How do I use regex to capture the nth pattern on each line?

Background: For syntax highlighting in Sublime Text, you can write a tmLanguage file with a corresponding tmTheme file. The tmLanguage file contains regular expressions to which you give names, and then the tmTheme file uses those names to…
Trevor Hickey
3 votes, 2 answers

Compiling Oniguruma regex library to javascript using Emscripten

I'm trying to get a more powerful regex library into JavaScript. The only solution I found is to compile the Oniguruma regex library to JavaScript using Emscripten. I've installed Emscripten and tested it with their small test scripts, also downloaded…
Allen Bargi
2 votes, 2 answers

How can I test if an expression is valid for TextMate grammars in VS Code?

I am trying to use VS Code's tokenization engine for grammar injections and I don't understand why some regular expressions fail. For example, suppose I have the following text. VS Code, TextMate grammars, and Oniguruma regular expressions. Then,…
Mihai
2 votes, 2 answers

Match all instances of a character preceded by '/'

For example, I might have the string /zombie nimble zombie quick Plants vs Zombies reference and I want to match every 'e', but only in the phrase "zombie nimble zombie quick", as it is preceded by a forward slash. I can get the contents of the…
Jam