3

I'm trying to get a more powerful regex library into JavaScript. The only solution I have found is to compile the Oniguruma regex library to JavaScript using Emscripten.

I've installed Emscripten and tested it with its small test scripts, and I've downloaded the Oniguruma source code, but I still don't know what to do next.

Is anyone familiar with Emscripten?

Allen Bargi
  • don't understand the down votes! did I do anything wrong? is the question not suitable for stackoverflow or the tags are inappropriate? what's wrong? – Allen Bargi Nov 05 '12 at 09:05
  • It's too specific, and not attractive to other users. It's more suitable to be asked in emscripten's mail list. – xiaoyi Nov 15 '12 at 13:15
  • having better regular expressions in the world's most widely distributed programming language is anything but 'unattractive', rather, it is a highly relevant endeavor. – flow Apr 07 '13 at 20:48
  • yeah; other users? You mean the majority? the ones looking for "how to sum an array in JavaScript" and other extremely innocuous questions? – Ivan Castellanos Jan 10 '14 at 20:44
  • Hey Allen, have you managed to compile the oniguruma with emscripten? – tenbits Oct 19 '15 at 19:37
  • Nope! moved to another solution. I would love to be able to get a better regex inside browser – Allen Bargi Oct 19 '15 at 20:26

2 Answers

1

When you use Emscripten, the general process of building C/C++ code stays much the same. What changes is the compiler: instead of, e.g., gcc, you invoke the Emscripten compiler (emcc).

That said, the general question is whether you are familiar with C/C++ and, more specifically, with autotools (which appears to be the build system Oniguruma uses). If you are not, you will probably have a very hard time understanding what needs to be done and how.

Last I checked, Emscripten did not support Libtool, so a build that relies on autotools will probably fail. Feel free to ask on the Emscripten IRC channel whether this is indeed not possible.

Another way I can think of is to use autotools to generate the Makefiles and then write custom targets for the Emscripten programs. Beware that this is for advanced users who are familiar with the make cruft.

If these steps are too taxing for you, perhaps you should see whether a JavaScript library would be sufficient.

abergmeier
0

A more realistic approach is to use XRegExp (http://xregexp.com). It adds many features to regular expressions and compiles them down to JavaScript's more limited RegExp dialect, so you get both the extra features and native performance. A regex library compiled with Emscripten is very unlikely to be performant enough for production use. For some uses Emscripten is excellent, but in this case the overhead is probably not worth it.
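To illustrate what "compiling down" to the native dialect can mean, here is a minimal sketch. The helper compileExtended is hypothetical and is not XRegExp's API or implementation; it only handles one extension (free-spacing patterns with # comments) and ignores edge cases that a real library has to cover.

```javascript
// Hypothetical helper, NOT XRegExp's actual implementation: translate a
// pattern written in free-spacing style (whitespace and # comments are
// ignored) into a native RegExp.
function compileExtended(source, flags = "") {
  let out = "";
  let inClass = false; // inside [...], whitespace and # are literal
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") {
      // Copy an escape sequence verbatim, including the escaped character.
      out += ch + (source[i + 1] || "");
      i++;
    } else if (inClass) {
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "#") {
      // Skip the comment up to the end of the line.
      while (i < source.length && source[i] !== "\n") i++;
    } else if (/\s/.test(ch)) {
      // Drop insignificant whitespace.
    } else {
      out += ch;
    }
  }
  return new RegExp(out, flags);
}

const date = compileExtended(String.raw`
  (\d{4})  # year
  -
  (\d{2})  # month
`);

console.log(date.source);                   // (\d{4})-(\d{2})
console.log(date.exec("2012-11").slice(1)); // [ '2012', '11' ]
```

Because the result is an ordinary RegExp, matching runs in the engine's built-in implementation; only the one-time translation is done in JavaScript.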

The author of XRegExp even has an article on lookbehinds: http://blog.stevenlevithan.com/archives/javascript-regex-lookbehind
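As a rough idea of how a lookbehind can be worked around with plain RegExp, here is a minimal sketch. It is not taken from the linked article, and the function name digitsAfterDollar is made up for illustration; it emulates the pattern (?<=\$)\d+ by matching the digits and then checking the preceding character by hand.

```javascript
// Emulates (?<=\$)\d+ without lookbehind syntax: find every run of digits
// and keep only the runs directly preceded by "$".
// digitsAfterDollar is a hypothetical name used for this illustration.
function digitsAfterDollar(text) {
  const results = [];
  const re = /\d+/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    // m.index is where the digit run starts; inspect the character before it.
    if (m.index > 0 && text[m.index - 1] === "$") {
      results.push(m[0]);
    }
  }
  return results;
}

console.log(digitsAfterDollar("cost $42, qty 7, fee $100")); // [ '42', '100' ]
```

This works for a single fixed-character lookbehind; more complex lookbehind conditions need a more general technique.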

  • If you make claims, please back them up. Why shouldn't _Emscripten_ compiled code be as fast as "natively written" Javascript? In the end it is only Javascript to begin with. – abergmeier Nov 09 '12 at 10:31
  • Because it's not Emscripten-compiled code vs. native JavaScript. It's Emscripten-compiled code vs. something built into the engine and not implemented in JavaScript at all. I'm talking about the built-in RegExp here, not something implemented in JS. –  Nov 09 '12 at 17:40
  • then you should also add the requirement, under which your statement is true. That is - only if the RegEx can be expressed in the Javascript variant. Only then can one _assume_ it is faster. – abergmeier Nov 11 '12 at 12:40
  • I've tried XRegExp and it doesn't cut it. What I want is to parse a lot of complex regexes written originally for Ruby 1.9 in a client side chrome app environment. The size and speed does not matter but being able to parse all regexes correctly is what I'm after. – Allen Bargi Nov 15 '12 at 16:24