I'm struggling to get a JavaScript regex to work on some edge cases.

The goal is to take a string containing some JS code and rewrite a function call, keeping the original arguments. A simple case: 'apples("a", 1, {value: 2})' would become 'bananas("a", 1, {value: 2})'.

However, it gets more complicated with real code. For example, replacing a function with a "promisified" version of that function, where the code has nested functions, return statements, and multiline objects:

string = 'function(){
  return Request({
    json:true,
    url: "http://google.com",
    method: "GET"
  })
}'

In this case the result I expect is:

string = 'function(){
  return newPromiseFunction({
    json:true,
    url: "http://google.com",
    method: "GET"
  }).then(function( res ){ console.log(res) })
}'

So my best attempt is:

var substring = string.match(/return\sRequest\(.+?\)/g);
var argsString = substring[0].match(/\((.*?)\)/)[1];
var result = 'return newPromiseFunction(' + argsString + ').then(function( res ){ console.log(res) })}'

However, this code only works if there are no line breaks or tabs in the argument. A multiline object (as above) makes the match fail.

Is there a simple fix to my regex above? Is there a simpler way to do it? Is regex even the best way to approach this?

melpomene
LPG
  • Why are you trying to parse code as strings? – Heretic Monkey Jun 24 '16 at 18:23
  • Regex is not powerful enough to parse JavaScript code in general. If you have a very specific set of strings that you need to replace, you may be able to use regex. If you want to use regex you're going to have to say what all your cases are specifically. – Rose Kunkel Jun 24 '16 at 18:25
  • There are several complete JavaScript parsers out there already. Dealing with the Abstract Syntax Tree (AST) might be easier than parsing yourself. Here is one: https://github.com/ternjs/acorn – Jeremy J Starcher Jun 24 '16 at 18:26
  • @MikeMcCaughan I have a database of js code that is stored as strings, and some manipulation needs to be done on these strings. I thought dealing with AST was too heavy for this case, but perhaps that's the best way. – LPG Jun 24 '16 at 18:39
  • Yeah, I think it probably is. Just like [using regex to parse HTML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) is fine if you're only looking at, say, `img` tags and trying to get a single attribute, but if you want to do anything complicated, using an HTML parser is by far the better choice. – Heretic Monkey Jun 24 '16 at 18:56
  • @JeremyJStarcher Thanks. I'm trying this with acorn. It seems a bit too heavy for this case. For example, I would have to loop through all the expressions in the body, find all the arguments, and reassemble all the object properties with the key and value, or use the start and end position of each argument. All I really need is to replace the content within the parens of 'return Request({...})'. Any ideas on making the regex work? – LPG Jun 24 '16 at 19:15
  • Could you give some context in your question? Maybe there are other solutions. What will you do with the modified JS string? Will you execute it, or just write it back to the database? Do all JS strings represent functions, in the sense that they are of the format `function .... (...) { ... }`? Or what else is common to all of them? – trincot Jun 24 '16 at 20:19
  • Use a proper language parser and manipulate the syntax tree they give you. There are Javascript parsers written in Javascript, prime example is http://esprima.org/. **Do not try to use regex for this.** – Tomalak Jun 25 '16 at 13:09

2 Answers

This seems to work:

var codestring = `function(){
  return Request({
    json:true,
    url: "http://google.com",
    method: "GET"
  })
}`;

var reg = /return\s+Request\((\{[\S\s]*?\})\)/g;
var result = codestring.replace(reg, "return newPromiseFunction($1).then(function( res ){ console.log(res) })");

alert(result);

Note that backticks (a template literal) are used to create the multiline codestring.
You may need to change that if your environment does not support ES6.

Instead of a match followed by string concatenation, this uses a single replace.
That is just a preference; the match-based approach gives the same result.

Also, [\S\s]*? is used instead of .*? because . does not match line breaks.
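
A minimal sketch of that difference, using a made-up `sample` string rather than the question's data. It also shows the `s` (dotAll) flag, available since ES2018, as an alternative where the runtime supports it; that flag is not part of the answer above.

```javascript
// Hypothetical sample input: a call whose argument spans several lines.
const sample = 'return Request({\n  json: true\n})';

// `.` does not match line terminators, so the lazy `.+?` never reaches the ")".
const dotMatch = sample.match(/Request\((.+?)\)/);

// `[\s\S]` matches any character, line breaks included.
const anyCharMatch = sample.match(/Request\(([\s\S]+?)\)/);

// With the `s` flag (ES2018), `.` matches line terminators as well.
const dotAllMatch = sample.match(/Request\((.+?)\)/s);

console.log(dotMatch);        // null
console.log(anyCharMatch[1]); // the object literal, line breaks included
console.log(dotAllMatch[1]);  // same capture as above
```

Both working patterns are lazy, so they stop at the first `)` they meet. That is enough for the example, but not for arguments that contain a closing parenthesis themselves.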

LukStorms

Here you are:

var code = `function() {
  return Request({
    json: true,
    url: "http://google.com",
    method: "GET"
  });
}`;

// [^\)]* captures everything up to the first ")", line breaks included
var result = code.replace(/return\s+Request\s*\(([^\)]*)\)/g,
  (m, g1) => 'return newPromiseFunction(' + g1 + ').then(function( res ){ console.log(res) })'
);

logs.innerText = result;
<pre id="logs"></pre>
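
One limitation worth noting: `[^\)]*` stops at the first `)`, so arguments that contain a closing parenthesis themselves (a nested call, or a `)` inside a string) would be cut short. A possible workaround, short of a full parser, is to locate the opening paren with a regex and then scan forward while counting parentheses. The sketch below is only an illustration: `findClosingParen` and `rewriteRequestCalls` are hypothetical helper names, the `encode(...)` call in the input is made up, and the scanner ignores comments, regex literals, and template-literal interpolation, so a real parser remains the safer choice for arbitrary code.

```javascript
// Hypothetical helper: index of the ")" closing the "(" at openIndex, or -1.
// Parentheses inside quoted strings are skipped.
function findClosingParen(source, openIndex) {
  let depth = 0;
  let quote = null; // quote character of the string we are inside, if any
  for (let i = openIndex; i < source.length; i++) {
    const ch = source[i];
    if (quote) {
      if (ch === '\\') i++;                 // skip the escaped character
      else if (ch === quote) quote = null;  // end of the string
    } else if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch;
    } else if (ch === '(') {
      depth++;
    } else if (ch === ')') {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

// Hypothetical helper: rewrites every `return Request(...)` call in source.
function rewriteRequestCalls(source) {
  const callStart = /return\s+Request\s*\(/g;
  let output = '';
  let last = 0;
  let match;
  while ((match = callStart.exec(source)) !== null) {
    const open = match.index + match[0].length - 1; // position of "("
    const close = findClosingParen(source, open);
    if (close === -1) break; // unbalanced input: leave the rest untouched
    const args = source.slice(open + 1, close);
    output += source.slice(last, match.index)
      + 'return newPromiseFunction(' + args + ')'
      + '.then(function( res ){ console.log(res) })';
    last = close + 1;
    callStart.lastIndex = last; // resume searching after the rewritten call
  }
  return output + source.slice(last);
}

const input = 'function(){\n  return Request({\n    url: encode("http://google.com"),\n    method: "GET"\n  })\n}';
console.log(rewriteRequestCalls(input));
```

The regex only finds where the call starts; the argument boundary comes from the depth counter, which is why the nested `encode(...)` call does not end the capture early.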
Tamas Hegedus