
I'm working on a project that's essentially a templating domain-specific language. In my project, I accept lines of user input in the following form:

'{{index(1, 5)}}'
'{{firstName()}} X. {{lastName()}}'
'{{floating(-0.5, 0.5)}}'
'{{text(5, "words")}}'

Any command between double curly braces ({{ }}) has a corresponding JavaScript function that should be called when that command is encountered (for example, function index(min, max) {...} in the case of the first one).

I'm having a difficult time figuring out how to safely accept the input and call the appropriate function. I know that the way I'm doing it now isn't safe. I simply eval() anything between two sets of curly braces.

How can I parse these input strings such that I can flexibly match a function call between curly braces and execute that function with any parameters given, while still not blindly calling eval() with the code?

I've considered making a mapping (if command is index(), call function index() {}), but this doesn't seem very flexible; how do I collect and pass any parameters (e.g. {{index(2, 5)}}) if any are present?

This is written in Node.js.

Isaac Dontje Lindell
  • Are the arguments to these functions ever allowed to be calls to other functions? E.g., `{{firstName(index(1, 2))}}` – T.J. Crowder Jun 20 '14 at 11:55
  • Yes, if it's possible. – Isaac Dontje Lindell Jun 20 '14 at 11:57
  • Then you can't do it with regular expressions alone, you need a parser. (You are, of course, quite right **not** to use `eval` as it allows arbitrary code execution.) http://pegjs.majda.cz/online is one tool you can use to create a parser for a DSL. I would also use that dispatch map you talked about, it's flexible enough. Once you know what function to call and what arguments to give it, you can use `func.apply(thisArg, argsArray)` to do so. – T.J. Crowder Jun 20 '14 at 11:59

3 Answers


This problem breaks down into:

  1. Parsing the string

  2. Evaluating the resulting function graph

  3. Dispatching to each function (as part of #2 above)

Parsing the string

Unfortunately, with the requirements you have, parsing the {{...}} string is quite complex. You have at least these issues to deal with:

  1. Functions can be nested {{function1(function2(), 2, 3)}}.

  2. Strings can contain (escaped) quotes, and can contain commas, so even without requirement #1 above the trivial approach to finding the discrete arguments (splitting on a comma) won't work.
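
To illustrate the second issue, here is a minimal sketch of what happens when the argument text is split on commas. The sample argument string is made up for the demonstration.

```javascript
// Hypothetical argument text: a number, a string containing a comma, a number.
var rawArgs = '5, "red, green", 2';

// Naive approach: split on every comma.
var naive = rawArgs.split(',');

// The comma inside the quoted string is split as well, so the single
// string argument is broken into two pieces.
console.log(naive.length); // 4, although only 3 arguments were written
console.log(naive);
```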

So...you need a proper parser. You could try to cobble one together ad hoc, but this is where parser generators come into the picture, like PEG.js or Jison (those are just examples, not necessarily recommendations — I did happen to notice one of the Jison examples is a JSON parser, which would be about half the battle). Writing a parser is out of scope for answering a question on SO I'm afraid. :-)
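
For a sense of scale, here is a rough hand-written sketch of a recursive descent parser for this kind of expression. It is not what PEG.js or Jison would generate, and it rests on assumptions of my own: only numbers, double-quoted strings (decoded with JSON.parse, so JSON escaping rules apply) and function calls are supported, and the name parseTemplateExpression and the node shapes are invented for the example. It expects the text between the braces, not the braces themselves.

```javascript
// Parses text such as: text(5, "a, b", index(1, 2))
// Returns { type: 'call', name, args } and { type: 'literal', value } nodes.
function parseTemplateExpression(source) {
    var pos = 0;

    function skipSpaces() {
        while (pos < source.length && /\s/.test(source[pos])) { pos++; }
    }

    // Try an anchored regex at the current position; consume and return the match.
    function take(regex) {
        var m = regex.exec(source.slice(pos));
        if (!m) { return null; }
        pos += m[0].length;
        return m[0];
    }

    function expect(ch) {
        skipSpaces();
        if (source[pos] !== ch) {
            throw new SyntaxError('Expected "' + ch + '" at position ' + pos);
        }
        pos++;
    }

    function parseExpression() {
        skipSpaces();
        var text;
        // Double-quoted string, possibly containing escaped quotes and commas.
        if ((text = take(/^"(?:[^"\\]|\\.)*"/)) !== null) {
            return { type: 'literal', value: JSON.parse(text) };
        }
        // Integer or decimal number, optionally negative.
        if ((text = take(/^-?\d+(?:\.\d+)?/)) !== null) {
            return { type: 'literal', value: Number(text) };
        }
        // Function call: name, "(", comma-separated arguments, ")".
        if ((text = take(/^[A-Za-z_]\w*/)) !== null) {
            var args = [];
            expect('(');
            skipSpaces();
            if (source[pos] !== ')') {
                while (true) {
                    args.push(parseExpression()); // arguments may be nested calls
                    skipSpaces();
                    if (source[pos] !== ',') { break; }
                    pos++;
                }
            }
            expect(')');
            return { type: 'call', name: text, args: args };
        }
        throw new SyntaxError('Unexpected input at position ' + pos);
    }

    var tree = parseExpression();
    skipSpaces();
    if (pos !== source.length) {
        throw new SyntaxError('Unexpected trailing input at position ' + pos);
    }
    return tree;
}

var tree = parseTemplateExpression('text(5, "a, b", index(1, 2))');
console.log(JSON.stringify(tree, null, 2));
```

Nothing is executed at this stage; the parser only builds a tree, which is what makes it safer than eval(). A generated parser would handle error reporting and grammar changes far better than this sketch.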

Evaluating the resulting function graph

Depending on what tool you use, your parser generator may handle this for you. (I'm pretty sure PEG.js and Jison both would, for instance.)

If not, then after parsing you'll presumably end up with an object graph of some sort, which gives you the functions and their arguments (which might be functions with arguments...which might be...).

  • functionA
    • 1
    • "two"
    • functionB
      • "a"
      • functionC
        • 42
    • functionD
    • 27

functionA there has five arguments, the third of which is functionB with two arguments, and so on.

Your next task, then, is to evaluate those functions deepest first (and at the same depth, left-to-right) and replace them in the relevant arguments list with their result, so you'll need a depth-first traversal algorithm. By deepest first and left-to-right (top-to-bottom in the bullet list above) I mean that in the list above, you have to call functionC first, then functionB, then functionD, and finally functionA.
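
As a sketch of that traversal, assuming the parser produces nodes shaped like { type: 'call', name, args } and { type: 'literal', value } (a shape chosen for this example, not something any particular tool guarantees), a short recursive function is enough. The stand-in functions below only record the order in which they are called.

```javascript
// Evaluate a node: literals return their value; calls evaluate their
// arguments first (left to right), then call the mapped function.
function evaluate(node, functions) {
    if (node.type === 'literal') { return node.value; }
    var argValues = node.args.map(function (arg) {
        return evaluate(arg, functions);
    });
    return functions[node.name].apply(undefined, argValues);
}

// Stand-in functions that just record the call order.
var order = [];
function recorder(name) {
    return function () { order.push(name); return name; };
}
var functions = {
    functionA: recorder('functionA'),
    functionB: recorder('functionB'),
    functionC: recorder('functionC'),
    functionD: recorder('functionD')
};

// Small helpers to build the example graph from the bullet list above.
function lit(value) { return { type: 'literal', value: value }; }
function call(name, args) { return { type: 'call', name: name, args: args }; }

var graph = call('functionA', [
    lit(1),
    lit('two'),
    call('functionB', [lit('a'), call('functionC', [lit(42)])]),
    call('functionD', []),
    lit(27)
]);

evaluate(graph, functions);
console.log(order.join(' -> '));
// functionC -> functionB -> functionD -> functionA
```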

Dispatching to each function

Depending again on the tool you use, it may handle this bit too. Again I suspect PEG.js does, and I wouldn't be surprised if Jison did as well.

At the point where you're ready to call a function that (no longer) has function calls as arguments, you'll presumably have the function name and an array of arguments. Assuming you store your functions in a map:

var functions = {
    index: function() { /* ... */ },
    firstName: function() { /* ... */ },
    // ...
};

...calling them is the easy bit:

functionResult = functions[functionName].apply(undefined, functionArguments);
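
One caveat with a plain object as the map: names such as toString or constructor resolve through the prototype chain, so user input could reach functions you never registered. A minimal sketch of a guarded lookup follows; the dispatch helper and the body of index are my own placeholders, not part of the question.

```javascript
var functions = {
    // Placeholder body: random integer between min and max, inclusive.
    index: function (min, max) {
        return Math.floor(Math.random() * (max - min + 1)) + min;
    }
};

function dispatch(functionName, functionArguments) {
    // Only accept names defined directly on the map, not inherited ones.
    var known = Object.prototype.hasOwnProperty.call(functions, functionName) &&
        typeof functions[functionName] === 'function';
    if (!known) {
        throw new Error('Unknown template function: ' + functionName);
    }
    return functions[functionName].apply(undefined, functionArguments);
}

console.log(dispatch('index', [1, 5])); // some integer from 1 to 5
```

Creating the map with Object.create(null) is another way to avoid inherited names.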

I'm sorry not to be able to say "Just do X, and you're there," but it really isn't a trivial problem. I would throw tools at it, I wouldn't invent this wheel myself.

T.J. Crowder
  • This is a useful answer, thanks. I originally went down the Jison route, but I don't have much experience w/ formally specifying grammars, so I got bogged down pretty quickly (and then decided that maybe I was overthinking it). I guess I'll give it another stab. I'd looked at some of the simple Jison examples, but I missed that JSON linter. Maybe that'll help. Again, thanks! – Isaac Dontje Lindell Jun 20 '14 at 14:17
1
  1. If possible do not evaluate the user input.
  2. If you need to evaluate it, evaluate it in controlled scope and environment.

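The second point means using new Function() instead of eval(), or a purpose-built library like https://github.com/dtao/lemming.js. Keep in mind that new Function() still runs whatever code it is given; it only keeps that code from reading the caller's local variables.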

See http://www.2ality.com/2014/01/eval.html for more information about eval vs new Function()
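
A small sketch of the scoping difference, assuming it runs in Node.js; the variable and function names are made up for the demonstration:

```javascript
function compareScopes() {
    var secret = 'local value';

    // Direct eval runs in the current scope, so it can see `secret`.
    var viaEval = eval('typeof secret');

    // A function built with new Function() runs in the global scope,
    // so the local `secret` is not visible to it...
    var viaFunction = new Function('return typeof secret')();

    // ...but globals are still reachable, so it is not a sandbox.
    var seesGlobals = new Function('return typeof Math')();

    return { viaEval: viaEval, viaFunction: viaFunction, seesGlobals: seesGlobals };
}

var scopes = compareScopes();
console.log(scopes);
// { viaEval: 'string', viaFunction: 'undefined', seesGlobals: 'object' }
```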


For a more sophisticated approach, try creating your own parser; see https://stackoverflow.com/a/2630085/481422

For an example, search for the comment // ECMAScript parser in https://github.com/douglascrockford/JSLint/blob/master/jslint.js

Zlatin Zlatev

You could try something like this:

Assuming you have an input string like this:

'{{floating(-0.5, 0.5)}}'

And all your actual functions are referenced in an object, like this:

var myFunctions = {
    'index': function(){/* Do stuff */},
    'firstName': function(){}
}

Then, this should work:

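function parse(input) {
    // Strip the delimiters, then separate the function name from the argument text.
    var temp = input.replace('{{', '').replace(')}}', '').split('('),
        fn = temp[0],
        // Named args rather than arguments, which is a built-in object inside functions.
        args = temp[1].split(',');
    // Return the result so the caller can substitute it into the template.
    return myFunctions[fn].apply(this, args);
}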

Please note that this only works for simple function calls that don't have functions nested as their arguments. It also passes all arguments as strings, instead of the types that may be intended (Numbers, booleans, etc).
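
If the arguments are limited to plain numbers, booleans and double-quoted strings without commas, one possible way to recover their types is to run each piece through JSON.parse. This is only a sketch under that assumption; parseSimpleCall is a name I made up, and it shares the limitations described above.

```javascript
// Handles only a single, non-nested call such as {{floating(-0.5, 0.5)}}.
function parseSimpleCall(input) {
    var m = /^\{\{(\w+)\((.*)\)\}\}$/.exec(input);
    if (!m) {
        throw new SyntaxError('Not a simple call: ' + input);
    }
    var argText = m[2].trim();
    // JSON.parse turns '-0.5' into a number and '"words"' into a string.
    // It throws on anything that is not valid JSON.
    var args = argText === '' ? [] : argText.split(',').map(function (piece) {
        return JSON.parse(piece.trim());
    });
    return { name: m[1], args: args };
}

console.log(parseSimpleCall('{{floating(-0.5, 0.5)}}'));
// { name: 'floating', args: [ -0.5, 0.5 ] }
console.log(parseSimpleCall('{{text(5, "words")}}'));
// { name: 'text', args: [ 5, 'words' ] }
```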

If you want to handle more complex strings, you'll need to use a proper parser or template engine, as @T.J. Crowder suggested in the comments.

Cerbrus
  • See his answer to my question in the comments. – T.J. Crowder Jun 20 '14 at 12:00
  • ... Damnit. That complicates things. – Cerbrus Jun 20 '14 at 12:01
  • You're also passing a string as the second argument to `apply`, and turning that string into an array of arguments is **non-trivial**, as you have to deal with strings...which might have quotes and commas in them. – T.J. Crowder Jun 20 '14 at 12:08
  • I forgot the split to split up the arguments. However, this will break horrendously if the arguments contain commas, and the arguments are passed as strings instead of their intended types, as I mentioned in the disclaimer. @Isaac should probably consider this a proof-of-concept, and no more than that. – Cerbrus Jun 20 '14 at 12:11