I need to extract a particular function, including its body, as text from a JavaScript file and print it using C#. The function name and the JS file are the input parameters. I tried using a regex but couldn't achieve the desired result. Here is the regex code:

public void getFunction(string jstext, string functionname)
{
    Regex regex = new Regex(@"function\s+" + functionname + @"\s*\(.*\)\s*\{");
    Match match = regex.Match(jstext);
}

Is there any other way I can do this?

Alessio Cantarella
  • I don't think you'll find javascript to be *regular* enough. I can't think of a way to do this without keeping track of the number of opened and closed brackets, but then you'll also need to take into account if an unmatched `}` appears within a string or a javascript regex. Unless you're willing to make heaps of assumptions about the function, you should probably look into [existing javascript parsers for .net](http://stackoverflow.com/questions/14355910/javascript-parser-and-analyzer-in-c-sharp-net-4-5). – David Hedlund Sep 21 '16 at 09:07
  • define: `couldnt achieved the desired result.` – Jamiec Sep 21 '16 at 09:07
  • Do you need to support functions like `var functionName = function(x) { ... }`? What about `window['function' + name] = function(x) { ... }`? – David Hedlund Sep 21 '16 at 09:09
  • not to mention `(function(x) { ... })("foo")` - ie, IIFE – Jamiec Sep 21 '16 at 09:11
  • I assumed anonymous functions were off the table as `getFunction` accepts a function name. – David Hedlund Sep 21 '16 at 09:12
  • ok then - `var myFunc = (function(x){ return function() { ... } })("foo");` is a named function `myFunc`, but you won't find it with anything like the regex above ;) – Jamiec Sep 21 '16 at 09:13
  • No, I just need to find declarations like `function setMyName(x) { .... }`, or maybe with no parameters, `function setMyName() { .... }` – hassaan afridi Sep 21 '16 at 09:15
  • `function (\w+).*?\{(.*)\}` works for me. – ThePerplexedOne Sep 21 '16 at 09:16
  • @ThePerplexedOne it doesn't work! – hassaan afridi Sep 21 '16 at 09:23
  • @hassaanafridi Set the modifier to singleline. – ThePerplexedOne Sep 21 '16 at 09:24
  • @ThePerplexedOne: How is your expression supposed to know which `}` terminates the function? – David Hedlund Sep 21 '16 at 12:30

1 Answer

This answer is based on the assumption you stated in the comments: that the C# function only needs to find function declarations, not any form of function expression.

As I pointed out in the comments, JavaScript is too complex to be reliably matched with a regular expression. The only way to know you've reached the end of the function is when the brackets all match up, and even then you still need to take escape characters, comments, and strings into account.
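To illustrate why a plain pattern cannot pick the right closing bracket, here is a minimal sketch. The sample JavaScript text and the class name `RegexPitfall` are made up for this illustration; the patterns are variations on the ones discussed in the question and comments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexPitfall
{
    // Hypothetical sample input: function a contains a nested block.
    public const string Js =
        "function a() { if (x) { y(); } z(); }\nfunction b() { return 1; }";

    // Lazy quantifier: stops at the first '}', which closes the inner block.
    public static string Lazy() =>
        Regex.Match(Js, @"function\s+a\s*\([^)]*\)\s*\{.*?\}", RegexOptions.Singleline).Value;

    // Greedy quantifier: runs on to the last '}' in the whole text.
    public static string Greedy() =>
        Regex.Match(Js, @"function\s+a\s*\([^)]*\)\s*\{.*\}", RegexOptions.Singleline).Value;

    public static void Main()
    {
        Console.WriteLine(Lazy());   // function a() { if (x) { y(); }
        Console.WriteLine(Greedy()); // both functions, a and b
    }
}
```

Neither quantifier can count nesting depth, so the match ends either too early or too late.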

The only way I can think of to achieve this is to iterate through every single character, from the start of the function body until the brackets match up, keeping track of anything odd that comes along.
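Stripped of all the special cases, the core of that idea is just a depth counter. The following is only a sketch: the helper name `FindClosingBrace` and the sample text are hypothetical, and it deliberately ignores strings, comments and regex literals, so a `}` inside any of those would throw it off.

```csharp
using System;

public static class BraceMatcher
{
    // Returns the index of the '}' that closes the '{' at openIndex,
    // or -1 if the brackets never balance.
    // Simplification: every brace counts, even inside strings or comments.
    public static int FindClosingBrace(string text, int openIndex)
    {
        var depth = 0;
        for (var i = openIndex; i < text.Length; i++)
        {
            if (text[i] == '{') depth++;
            else if (text[i] == '}' && --depth == 0) return i;
        }
        return -1;
    }

    public static void Main()
    {
        var js = "function a() { if (x) { y(); } z(); } function b() {}";
        var close = FindClosingBrace(js, js.IndexOf('{'));
        Console.WriteLine(js.Substring(0, close + 1));
        // function a() { if (x) { y(); } z(); }
    }
}
```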

Such a solution is never going to be very pretty. I've pieced together an example of how it might work, but knowing how JavaScript is riddled with little quirks and pitfalls, I am convinced there are many corner cases not considered here. I'm also sure it could be made a bit tidier.

From my first experiments, the following should handle escape characters, multi-line and single-line comments, strings that are delimited by ", ' or `, and regular expressions (i.e. delimited by /).

This should get you pretty far, although I'm intrigued to see what exceptions people can come up with in comments:

// Requires: using System; using System.Text; using System.Text.RegularExpressions;
private static string GetFunction(string jstext, string functionname) {

    // Escape the name so that characters such as $ are matched literally.
    var start = Regex.Match(jstext, @"function\s+" + Regex.Escape(functionname) + @"\s*\([^)]*\)\s*{");

    if(!start.Success) {
        throw new Exception("Function not found: " + functionname);
    }

    StringBuilder sb = new StringBuilder(start.Value);
    jstext = jstext.Substring(start.Index + start.Value.Length);
    var brackets = 1;
    var i = 0;

    // Non-null while inside a string, template literal or regex literal.
    char? currentDelimiter = null;
    // Last non-whitespace character seen in plain code.
    // Used as a heuristic to tell a regex literal from a division.
    var previous = '{';

    var isEscape = false;
    var isComment = false;
    var isMultilineComment = false;

    while(brackets > 0 && i < jstext.Length) {
        var c = jstext[i];
        var next = i + 1 < jstext.Length ? jstext[i + 1] : '\0';

        if(isComment) {
            if(isMultilineComment && c == '*' && next == '/') {
                // Found termination of multiline comment; consume the slash too
                sb.Append(c);
                i++;
                c = next;
                isComment = false;
                isMultilineComment = false;
            } else if(!isMultilineComment && c == '\n') {
                // Found termination of single line comment
                isComment = false;
            }
        } else if(currentDelimiter != null) {
            if(isEscape) {
                // The current symbol is escaped, so it cannot close anything
                isEscape = false;
            } else if(c == '\\') {
                // Found escape symbol
                isEscape = true;
            } else if(c == currentDelimiter) {
                // Found the closing delimiter of the string or regex
                currentDelimiter = null;
                previous = c;
            }
        } else if(c == '/' && (next == '/' || next == '*')) {
            // Found start of a comment block; consume both characters
            isComment = true;
            isMultilineComment = next == '*';
            sb.Append(c);
            i++;
            c = next;
        } else if(c == '"' || c == '\'' || c == '`'
                  || (c == '/' && "(,=:[!&|?{};+-*%<>~^".IndexOf(previous) >= 0)) {
            // Found a string delimiter, or a slash where a regex may start.
            // Not handled: regexes after keywords (return /x/), ${...} in templates.
            currentDelimiter = c;
        } else {
            // Plain code: not commented out, escaped or in a string,
            // so a bracket should be treated as one
            if(c == '{') brackets++;
            if(c == '}') brackets--;
            if(!char.IsWhiteSpace(c)) previous = c;
        }

        sb.Append(c);
        i++;
    }

    return sb.ToString();
}

David Hedlund
  • Thanks for the code, been trying to get some solution to parse js file and get all entries, but your message is the only solution I have found in whole web, weird, but Thanks again. – SkyDancer Jan 04 '22 at 16:50