
I know variations of this question have been asked many times, but none of the other solutions I've found have worked for me.

I'm simply trying to come up with a regex that removes any single-line comments that may appear in some JavaScript.

Ideally, it would have some way to check that two forward slashes aren't part of any actual script.

$("a[href='#top']").click(function() { // This should be removed
    $('html, body').animate({
        scrollTop: 0
    }, 'slow');

    return false;
});

var SomeURL = 'http://www.google.com'; // This comment should be removed but the URL should not

I have this:

$fileContents = preg_replace('/\/\/.*/', '', $fileContents);

...but that breaks my code because it strips out URLs along with single-line comments.

MultiDev
  • This is going to be a pain to do with regex because you're searching for a context-sensitive pattern. For example, `var a = '// some string'` should not be touched, but the comment in `var a = 'some string' // comment here` should be removed. – bassxzero Oct 19 '17 at 14:27
  • The solutions proposed [here](https://stackoverflow.com/questions/5419154/how-to-remove-single-line-comments-in-php) seem to work! – tjadli Oct 19 '17 at 14:28
  • @bassxzero I thought the regex could simply check whether a semicolon exists after `//` and, if so, leave that match alone. I just don't know how to write that. – MultiDev Oct 19 '17 at 14:29
  • What if the comment has a semicolon then? – Mikk3lRo Oct 19 '17 at 14:30
  • @JROB it's not that simple. `var a= 'blah';var b='blah blah'; // comment \n` can appear on one line and is syntactically valid. This is more of a job for a parser. – bassxzero Oct 19 '17 at 14:30
  • You would then catch URLs that were spread over more than one line. It's not really as simple as you think and cannot be reliably done with regular expressions. – Phylogenesis Oct 19 '17 at 14:31
  • @JROB Also, in JavaScript, semicolons are optional. – bassxzero Oct 19 '17 at 14:33
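The context problem raised in these comments (a `//` inside a string must survive, a `//` after it must not) can be handled by walking the source once and tracking whether the current position is inside a string. The following is only a minimal sketch of that idea, written in JavaScript for illustration; the function name `stripLineComments` is hypothetical, and the sketch assumes the input contains no regex literals, no block comments, and no template-literal expressions that themselves contain `//`.

```javascript
// stripLineComments is a hypothetical helper name, not an existing API.
// Sketch only: regex literals, /* block comments */ and nested template
// expressions are NOT handled and would need a real JavaScript parser.
function stripLineComments(source) {
  let out = '';
  let quote = null; // quote character of the string we are inside, or null

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];

    if (quote !== null) {
      out += ch;
      if (ch === '\\' && i + 1 < source.length) {
        out += source[++i]; // copy the escaped character so \' cannot close the string
      } else if (ch === quote) {
        quote = null; // closing quote reached
      }
    } else if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch; // opening quote
      out += ch;
    } else if (ch === '/' && source[i + 1] === '/') {
      // Outside any string: skip to the end of the line, keeping the newline.
      while (i < source.length && source[i] !== '\n') i++;
      i--; // the for-loop increment lands back on the newline (or end of input)
    } else {
      out += ch;
    }
  }
  return out;
}

const sample = [
  "var a = '// some string'; // comment here",
  "var SomeURL = 'http://www.google.com'; // This comment should be removed",
].join('\n');

console.log(stripLineComments(sample));
```

Both string contents, including the URL, are copied through unchanged, while everything from an unquoted `//` to the end of its line is dropped. The same loop could be ported to PHP if the stripping has to happen server-side.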

1 Answer


Regular expressions aren't powerful enough to do this (elegantly) in all cases, but you can get by with some assumptions. For instance, since `//` can only be a) the start of a comment or b) part of a string, you should be able to do the following:

\/\/[^;)]*$

This requires that no `;` or `)` appears between the `//` and the end of the line, so it only works if your comments never contain those characters. You can of course add other characters to the excluded set, such as `'` and/or `"`, to better fit your needs.
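As a rough illustration of how this pattern behaves, here is a small JavaScript sketch. It makes two adjustments that are assumptions on my part rather than part of the pattern as written: `\n` is added to the excluded set, because a negated class would otherwise match across line breaks, and the `m` flag is set so that `$` matches at the end of every line instead of only at the end of the input. The helper name `stripComments` is hypothetical.

```javascript
// Assumed adjustments to the pattern above:
//  - "\n" in the excluded set keeps a match from running onto the next line
//  - the "m" flag lets "$" match at the end of each line
const SINGLE_LINE_COMMENT = /\/\/[^;)\n]*$/gm;

// stripComments is a hypothetical helper name used only for this example.
function stripComments(source) {
  return source.replace(SINGLE_LINE_COMMENT, '');
}

const sample = [
  "var SomeURL = 'http://www.google.com'; // This comment should be removed",
  "var a = 1; // a comment with a semicolon; is left untouched",
].join('\n');

console.log(stripComments(sample));
```

The URL survives because a `;` follows it before the end of the line, and the second comment survives for the same reason, which is the limitation described above. In PHP the equivalent call would be `preg_replace('/\/\/[^;)\n]*$/m', '', $fileContents)`. Note that a URL at the very end of a line with no trailing `;` would still be stripped unless the quote characters are added to the excluded set.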

SourceOverflow