
Possible Duplicate:
How to get function parameter names/values dynamically from javascript

I'm working on a JavaScript (Node.js) project in which I need an array of a function's parameter names (NOT values; I do not need arguments). I'm currently calling Function.toString() to get the function's source string and then running a regex against it to extract the parameter list.

Let's take the following SIMPLE example:

var myFunction = function (paramOne, paramTwo) { ... }

Running my regex against this, and then doing some string magic (split, etc) I would expect an array back like this:

paramList = ['paramOne', 'paramTwo']

I have something that works, but I suspect it's not the best solution, given the unusual characters JavaScript allows in variable names and the fact that function definitions can span multiple lines.

Here is what I currently have:

function.*[\w\s$]*(\((.*[\w\s,$]*)\))

This gives me my "match" in group 1 and then my param list without parens in group 2, which is cool. Is this really the best way to do what I want? Is there a better regular expression I could use for this? I'm not really looking for something "simpler" but really just something that could catch all possible situations.

Any help would be appreciated, and many thanks in advance!

Community
Jason L.

3 Answers


Preface: By far, the best way to handle this is to use a JavaScript parser rather than trying to do it with a single regular expression. Regular expressions can be part of a parser, but no one regular expression can do the work of a parser. JavaScript's syntax (like that of most programming languages) is far too complex and context-sensitive to be handled with a simple regular expression or two. There are several open source JavaScript parsers written in JavaScript. I strongly recommend using one of those, not what's below.


The easiest thing would be to capture everything in the first set of parens, and then use split(/\s*,\s*/) to get the array.

E.g.:

var str = "function(   one  ,\ntwo,three   ,   four   ) { laksjdfl akjsdflkasjdfl }";
var args = /\(\s*([^)]+?)\s*\)/.exec(str);
if (args && args[1]) { // exec() returns null when nothing matches
  args = args[1].split(/\s*,\s*/);
}
console.log("args: ", args);

How the above works:

  1. We use /\(\s*([^)]+?)\s*\)/ to match the first opening parenthesis (\( since ( is special in regexes), followed by any amount of optional whitespace, followed by a capture group capturing everything but a closing parenthesis (but non-greedy), followed by any amount of optional whitespace, followed by the closing ).

  2. If we succeed, we split using /\s*,\s*/, which means we split on sequences which are zero or more whitespace characters (\s*) followed by a comma followed by zero or more whitespace characters (this whitespace thing is why the args in my example function are so weird).

As you can see from the example, this handles leading whitespace (after the ( and before the first argument), whitespace around the commas, and trailing whitespace — including line breaks. It does not try to handle comments within the argument list, which would markedly complicate things.
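To illustrate the comment problem, here is a minimal sketch that strips comments from the function source before capturing the parameter list. The helper name `paramNamesIgnoringComments` is hypothetical, and the sketch assumes a `function` expression or declaration with a plain pre-ES2015 parameter list (no default values, no destructuring).

```javascript
// Hypothetical helper, not part of the answer's original code.
// Assumes plain pre-ES2015 parameter lists (no defaults, no destructuring).
function paramNamesIgnoringComments(src) {
  var stripped = src
    .replace(/\/\*[\s\S]*?\*\//g, "")  // remove /* block */ comments
    .replace(/\/\/.*$/gm, "");          // remove // line comments
  // After stripping, the first parenthesised group is the parameter list.
  var match = /\(([^)]*)\)/.exec(stripped);
  if (!match) return [];
  var list = match[1].trim();
  return list === "" ? [] : list.split(/\s*,\s*/);
}

var demo = function (one /* first */,
                     two, // second
                     three) {};
console.log(paramNamesIgnoringComments(String(demo))); // [ 'one', 'two', 'three' ]
```

Stripping comments from the whole source also mangles any string in the function body that happens to contain `//` or `/*`. That is harmless here only because the parameter list comes before the body.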

Note: The above doesn't handle ES2015's default parameter values, which can be any arbitrary expression, including an expression containing a ) — which breaks the regex above by stopping its search early:

var str = "function(   one  ,\ntwo = getDefaultForTwo(),three   ,   four   ) { laksjdfl akjsdflkasjdfl }";
var args = /\(\s*([^)]+?)\s*\)/.exec(str);
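if (args && args[1]) { // exec() returns null when nothing matches
  args = args[1].split(/\s*,\s*/);
}
console.log("args: ", args); // the list is cut off at "two = getDefaultForTwo("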

Which brings us full circle to: Use a JavaScript parser. :-)
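To show why a single regex falls short here, the following sketch scans the source character by character, tracking bracket depth and string literals so that a `)` or `,` inside a default value is not mistaken for the end of the list or a separator. It is a hand-rolled scanner, not a parser: the name `splitTopLevelParams` is hypothetical, and it ignores comments, regex literals, and template-literal interpolation, all of which a real parser would handle.

```javascript
// Hypothetical helper: a hand-rolled scanner, NOT a real JavaScript parser.
// Ignores comments, regex literals, and ${...} inside template literals.
function splitTopLevelParams(src) {
  var start = src.indexOf("(");
  if (start === -1) return [];
  var params = [];
  var current = "";
  var depth = 0;    // nesting depth of (), [], {} inside the parameter list
  var quote = null; // the quote character while inside a string literal
  for (var i = start + 1; i < src.length; i++) {
    var ch = src[i];
    if (quote) {
      current += ch;
      if (ch === "\\" && i + 1 < src.length) {
        current += src[++i];            // keep the escaped character as-is
      } else if (ch === quote) {
        quote = null;                   // string literal ends
      }
      continue;
    }
    if (ch === '"' || ch === "'" || ch === "`") {
      quote = ch;
      current += ch;
      continue;
    }
    if (ch === ")" && depth === 0) break; // the list's own closing paren
    if (ch === "," && depth === 0) {      // a top-level separator
      params.push(current);
      current = "";
      continue;
    }
    if ("([{".indexOf(ch) !== -1) depth++;
    else if (")]}".indexOf(ch) !== -1) depth--;
    current += ch;
  }
  params.push(current);
  return params
    .map(function (p) { return p.split("=")[0].trim(); }) // drop default values
    .filter(function (p) { return p !== ""; });
}

var src = "function(   one  ,\ntwo = getDefaultForTwo(),three   ,   four   ) { }";
console.log(splitTopLevelParams(src)); // [ 'one', 'two', 'three', 'four' ]
```

Each of the cases it ignores needs more state to track, which is the practical argument for reaching for an existing parser instead.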

T.J. Crowder
  • How do you think I should go about that? That's what I'm trying to do with my regex, but maybe there is a different approach? Are you thinking I could just indexOf "(" and ")" and then split what's in between? Would that be better than a crazy regex? – Jason L. Dec 19 '12 at 12:54
  • @JasonL.: I added some code to the answer (and a live example). – T.J. Crowder Dec 19 '12 at 12:59
  • Thanks! Your solution is much simpler than I was expecting. I'm currently trying it out with all the different variations JavaScript allows. Will this capture any of the crazy special characters JavaScript allows (one of the most famous being the look of disapproval)? I should also note that this is for the Node.js environment ONLY, so I really only need to worry about V8 :) – Jason L. Dec 19 '12 at 13:02
  • @JasonL.: The key bits in the above are 1) the contents of the capture group in the first regular expression, `[^)]+`, which means "one or more characters that aren't a closing parenthesis", and 2) the regular expression in the split `/\s*,\s*/`, which splits on sequences of zero-or-more whitespace chars followed by a comma followed by zero-or-more whitespace chars. So you should be fine with all of the weird and wonderful valid identifiers. It's comments you have to worry about. :-) – T.J. Crowder Dec 19 '12 at 13:05
  • Ahh, good point. And it looks like V8 does include comments. I suppose I should look into the source you linked and see how they ignore the comments :) – Jason L. Dec 19 '12 at 13:08
  • @JasonL.: You've surprised me about V8. Huh. Yeah, probably worth taking a look, but be warned that Prototype tended to be about handling the 95% case and not worrying too much about the 5% case. But still, I seem to recall discussion around that when the bug was reported a few years back and people seemed generally impressed with the resulting solution. – T.J. Crowder Dec 19 '12 at 13:15
  • I've marked this as correct because, honestly, I'm impressed by the simplicity of it. After doing some quick testing in jsbin it seems this solution works for all my cases, even those with comments with function definitions in them. I'm guessing this is because V8 ignores comments OUTSIDE the function and therefore the function definition is always first, and this regex will always match the first opening paren :) Thanks a bunch, man! I should note that I'll never have comments in my argument list. While this solution will pick those up, they are not a case I need to worry about. – Jason L. Dec 19 '12 at 13:18
  • @JasonL.: Ah, sorry, it was comments *in* the argument list I was talking about. (I hadn't even thought of comments above the function, but I can see why V8 doesn't call them part of it.) Glad that helped! – T.J. Crowder Dec 19 '12 at 13:47
  • This doesn't cover a function inside a function – Varun Feb 03 '21 at 13:41
  • @Varun - If you want the parameter names from the outer function, yes, it does; it works just fine on `function( one ,\ntwo,three , four ) { function foo(five, six) { } }`, for instance. But it doesn't handle ES2015's default parameter values (because the answer was written in 2012). I've flagged that up in the answer, and added the preface that should always have been there -- if you want to parse JavaScript code, you need to use a proper parser. :-) – T.J. Crowder Feb 03 '21 at 14:20

Do the following:

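// [\s\S] also matches line breaks, so a multi-line parameter list is captured
var ar = str.match(/\(([\s\S]*?)\)/);
if (ar) {
  // ar[1] is the captured list; ar[0] would still include the parentheses
  var result = ar[1].split(",");
}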

Remember that a ? after * makes the match non-greedy, so it stops at the first closing parenthesis rather than the last one.
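A small illustration of the greedy versus non-greedy difference, using a made-up one-line function string:

```javascript
// Illustrative input only; any function source with parentheses in its body will do.
var fnSource = "function (a, b) { return (a + b); }";

var lazy = fnSource.match(/\((.*?)\)/)[1];   // stops at the first ")"
var greedy = fnSource.match(/\((.*)\)/)[1];  // runs on to the last ")"

console.log(lazy);   // a, b
console.log(greedy); // a, b) { return (a + b
```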

nhahtdh
closure

I suggest using two regular expressions:

  • [match] /function[^(]*\(([^)]*)\)/ captures the argument list
  • [split] /\W+/, applied to the captured group from the first match, splits it into the parameter list

So, the code should look like this:

var s = "function moo (paramOne, paramTwo) { alert('hello'); }";
var s2 = s.match(/function[^(]*\(([^)]*)\)/)[1];
var paramList = s2.split(/\W+/);
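
One caveat with splitting on `/\W+/`: `\W` matches every character outside `[A-Za-z0-9_]`, including `$`, so an identifier such as `$scope` gets split apart. A minimal sketch of the same match-then-split idea that splits on commas instead (the helper name `getParamNames` is hypothetical, and it assumes no comments or default values in the list):

```javascript
// Hypothetical helper wrapping the match-then-split idea.
// Assumes no comments or default values inside the parameter list.
function getParamNames(src) {
  var m = src.match(/function[^(]*\(([^)]*)\)/);
  if (!m) return [];                        // not a function string
  var list = m[1].trim();
  return list === "" ? [] : list.split(/\s*,\s*/);
}

var withDollar = "function moo ($scope, paramTwo) { alert('hello'); }";
console.log(getParamNames(withDollar));
// [ '$scope', 'paramTwo' ]
console.log(withDollar.match(/function[^(]*\(([^)]*)\)/)[1].split(/\W+/));
// [ '', 'scope', 'paramTwo' ]
```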
shybovycha