
I'm trying to write a script that removes all // and /**/ comments from JS files.

The script currently consists of two regular expressions that are executed one after the other. The problem is that it breaks in one specific case, and I don't know how to fix it.

var input = `
//
// Line comments
//

// aaa

alert("// xxx");

alert(x); // bbb

// ccc

//
// Block comments
//

/*
 * aaa
 * bbb
 */

/* ccc */

/**/ alert(x); /**/

alert("/* xxx */");

alert('/* yyy */');
`;

// removing line comments
input = input.replace(/("[^"]*\/\/.*?")|\/\/(?:.|\r?\n)*?(?:\r?\n|.*)?/g, "$1");

// removing block comments
input = input.replace(/(["'][^"']*\/\*.*?\*\/[^"']*["'])|\/\*(?:.|\r?\n)*?\*\//g, "$1");

console.log(input);

The output from the snippet above is wrong. The line that causes the problem is

alert("// xxx");

The second regex treats the closing quotation mark in that line as the opening one and starts processing from this point. Here is a live demo: https://regexr.com/5n27q

How can I fix it?

Credits:

john c. j.

1 Answer


Either:

  1. Don't reinvent the wheel: use a package such as gulp-strip-comments, which already does this.
  2. Or read the source code of gulp-strip-comments to see how it matches each form of comment: multiline, inline, minified, and so on. (Source code)

Gulp is your friend if you're trying to "transpile" your JavaScript files. (:
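
If adding a dependency isn't an option, one common way around the specific failure in the question is to match string literals and comments in a single pass, so a quotation mark is never examined out of context. The following is a minimal sketch, not how gulp-strip-comments works. `stripCommentsOnePass` is a hypothetical name, and the pattern assumes the input contains no regex literals and no template literals with nested backticks.

```javascript
// Hypothetical helper for illustration; not part of any library.
function stripCommentsOnePass(source) {
  // Alternatives are tried left to right at each position, so a string
  // literal is consumed whole before a comment could start inside it.
  const token = /("(?:\\[\s\S]|[^"\\\r\n])*"|'(?:\\[\s\S]|[^'\\\r\n])*'|`(?:\\[\s\S]|[^`\\])*`)|\/\/[^\r\n]*|\/\*[\s\S]*?\*\//g;
  // Group 1 is set only for string literals: keep those, drop comments.
  return source.replace(token, (match, str) => (str !== undefined ? str : ""));
}

const sample = [
  'alert("// xxx");',
  "alert(x); // bbb",
  "/**/ alert(x); /**/",
  "alert('/* yyy */');",
].join("\n");

console.log(stripCommentsOnePass(sample));
```

Because both kinds of comment and all three kinds of string are handled by one `replace` call, the second pass that misread the closing quotation mark no longer exists. Whitespace around removed comments is left as-is.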

Rikki
  • I'm not familiar with Node or the software based on it. I see that there are only two JS files in that repository, [index](https://github.com/RnbWd/gulp-strip-comments/blob/master/index.js) and [mainSpec](https://github.com/RnbWd/gulp-strip-comments/blob/master/test/mainSpec.js), and they don't seem to contain any regular expressions. If you or someone else knows *where exactly* I should look, I would appreciate that information. – john c. j. Feb 22 '21 at 20:01
  • The underlying component used by `gulp-strip-comments` seems to be the `decomment` package: https://github.com/vitaly-t/decomment They don't use much RegEx, and that was my point about why you may not want to do that. You could use decomment in your browser's JavaScript - there should be no issues. RE: Node - it runs on V8, the same JavaScript engine that Chromium uses, so it's the same language and engine as in the browser, just a different use case. – Rikki Feb 22 '21 at 20:06
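
To illustrate what a mostly regex-free approach can look like, here is a minimal hand-written scanner. It is a sketch under stated assumptions and not decomment's actual implementation: `stripCommentsScan` is a hypothetical name, and regex literals and `${...}` expressions inside template literals are not handled.

```javascript
// Hypothetical helper for illustration; not decomment's actual code.
function stripCommentsScan(source) {
  let out = "";
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '"' || ch === "'" || ch === "`") {
      // Copy the string literal verbatim, stepping over backslash escapes.
      let j = i + 1;
      while (j < source.length && source[j] !== ch) {
        j += source[j] === "\\" ? 2 : 1;
      }
      out += source.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to, but not including, the line break.
      while (i < source.length && source[i] !== "\n" && source[i] !== "\r") {
        i++;
      }
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing "*/", or to the end if unterminated.
      const end = source.indexOf("*/", i + 2);
      i = end === -1 ? source.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

console.log(stripCommentsScan('alert("// xxx"); // bbb'));
```

The scanner always knows whether it is inside a string or a comment, which is the context that two independent regex passes lose.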