-1

I have a line with the code

require_once(PATH_ROOT).'/calls/inumber.php'); //this is a comment<br>

I want to use sed to delete everything after the //. My first try was:

sed -i 's/[//].*//' file;

But that deletes everything after (PATH_ROOT).'/
I want to remove the comment, not the path. It is not in the sample above, but how can I also stop sed from deleting what follows http://, since that contains two slashes too?

EDIT: OK, the task is to remove all one-line comments that start with at least two slashes. It doesn't matter which letters, numbers, or symbols follow; replace it all with nothing. The only exception is http(s)://, which should be skipped. Examples and results:
$a=5; //first comment
$a=5;

$b=10; ////// second comment
$b=10;

$c=15; /// /*&/$%§$%&/& third comment
$c=15;

/////////////////////////////
should be empty string

/*test comment*/
/*test comment*/ --> no change, because there are no two consecutive slashes

Summary: everything from // to the end of the line should be removed (including the slashes), except when the // belongs to http(s)://

noobee
  • @Xen2050 You're correct, no answers in comments. – gboffi Oct 28 '18 at 13:45
  • What language is this, where URL's (with //) and "comments" of `//////////////////` and `/// /*&/$%§$%&/&` are valid? – Xen2050 Oct 28 '18 at 19:07
  • In PHP, comments start with //. It doesn't matter what follows the //.
    https://www.w3schools.com/php/showphp.asp?filename=demo_syntax_comments or for url:
    https://www.w3schools.com/php/showphp.asp?filename=demo_filter7
    – noobee Oct 28 '18 at 20:24
  • What if the slashes are inside quotes? – Lasse V. Karlsen Oct 29 '18 at 10:53
  • @noobee You know that PHP comments like that are ended with `?>` and comments can also start with `#` or `/*`, they're too complicated to all be recognized with a small sed search IMO (especially considering you can have quoted `//`'s in valid PHP). Do you really just want to remove all PHP comments? – Xen2050 Oct 29 '18 at 13:38
  • Yes, I want to delete all PHP comments. The first step is to remove one-line comments, which I thought would not be so difficult. When removing the // and all the content after it, I just have to check that there is no ":" in front of it, because that could be a URL like http://, ftp:// ... A one-line comment with "#" ends at the end of the line, I think, and a one-line comment with "/*" ends with "*/". Multi-line comments are not important for now; I think such comments would be more difficult to remove. – noobee Nov 01 '18 at 23:35

2 Answers

1

You can use the greedy nature of quantifiers so that only the last occurrence of // (and whatever follows it) is deleted:

$ cat ip.txt
require_once(PATH_ROOT).'/calls/inumber.php'); //this is a comment<br>
http://foo/123 //commenting stuff
a//b/c/d 1//23/4/5 //commented

$ sed 's|\(.*\)//.*|\1|' ip.txt
require_once(PATH_ROOT).'/calls/inumber.php'); 
http://foo/123 
a//b/c/d 1//23/4/5 
  • sed allows other delimiters (here |), which avoids having to escape each / in //
    • [//] is the same as [/]: a bracket expression that matches a single /
  • \(.*\)//.* uses a capture group for the portion of the line before the last // so that it can be put back in the replacement section using \1
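To see where the greedy match places its cut, here is a small sketch that runs the command above on two lines. The function name strip_last_comment is a hypothetical wrapper, and the second input is taken from the question's edited examples rather than from ip.txt.

```shell
# Hypothetical wrapper around the command from this answer.
strip_last_comment() {
  sed 's|\(.*\)//.*|\1|'
}

# The URL survives because only the last // on the line is used as the cut.
printf '%s\n' 'http://foo/123 //commenting stuff' | strip_last_comment
# prints "http://foo/123 " (with the trailing space)

# With a run of slashes, the greedy .* keeps all but the last two of them.
printf '%s\n' '$b=10; ////// second comment' | strip_last_comment
# prints "$b=10; ////"
```

So this command fits lines with a single // marker in front of the comment. A line that holds a URL but no comment is cut at its ://, because that is then the last // on the line.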
Sundeep
  • Ok, that removes comments like "//this is a comment" -> but not comments like "///////I am a long comment" or comments like "// // // // I am a long comment"
    Can you do it with SED ?
    – noobee Oct 28 '18 at 15:00
  • I'd suggest that you edit the question to add all possible cases along with complete expected output for that.. the question as it stands doesn't explain these scenarios.. – Sundeep Oct 28 '18 at 15:34
  • Thx for hint. I edited my question. I hope I explained it a bit better -.- – noobee Oct 28 '18 at 16:06
0

Now that you've changed the question (a lot), here's a sed command that shouldn't remove any URLs (file://, http://, https://, or anything://). It ignores :// but otherwise deletes everything from the two slashes to the end of the line:

sed 's|\([^:]\)//.*$|\1|'

It matches one character that is not a : (captured as \1), followed by // and everything up to the end of the line, and puts back only the captured non-: character.
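A short sketch of how this command behaves on inputs like those in the question. The function names strip_comments and strip_comments_bol are hypothetical. Because [^:] has to consume one character, a line that begins with a run of slashes keeps its first slash; the second function is an assumed variation, not part of the original answer, that adds a separate expression for lines starting with //.

```shell
# The command from this answer: keep the non-colon character before //.
strip_comments() {
  sed 's|\([^:]\)//.*$|\1|'
}

# Assumed variation (not from the original answer): first blank out lines
# that start with //, then apply the same substitution as above.
strip_comments_bol() {
  sed -e 's|^//.*$||' -e 's|\([^:]\)//.*$|\1|'
}

printf '%s\n' '$a=5; //first comment' | strip_comments
# prints "$a=5; " (with the trailing space)

printf '%s\n' 'see http://example.com/x //note' | strip_comments
# prints "see http://example.com/x " (the URL is untouched)

printf '%s\n' '//////// banner' | strip_comments
# prints "/" (the first slash is the captured character)

printf '%s\n' '//////// banner' | strip_comments_bol
# prints an empty line
```

Neither version knows about quoted strings, so a // inside a PHP string literal is still treated as the start of a comment.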


sed 's|//[^/]*$||'

This searches for //, then zero or more non-slash characters ([^/]*), then the end of the line ($), and replaces the match with nothing.

If you wanted to match and delete any whitespace before a comment too, you could use the whitespace character class \s (a GNU sed extension):

sed 's|\s*//[^/]*$||'

Note that with this pattern a comment cannot contain a slash: allowing slashes after the // would make URLs match as comments, unless URLs are recognized and excluded.
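The trade-off in that note can be checked with a few inputs. This is only a sketch: strip_simple is a hypothetical helper, and it uses the POSIX class [[:space:]] in place of \s (an assumed substitution, so that it does not depend on GNU sed).

```shell
# Hypothetical helper: same idea as the command above, with [[:space:]]
# standing in for the GNU-specific \s.
strip_simple() {
  sed 's|[[:space:]]*//[^/]*$||'
}

printf '%s\n' '$a=5; //first comment' | strip_simple
# prints "$a=5;"

# A slash inside the comment stops [^/]* before the end of the line,
# so nothing matches and the line is left as it is.
printf '%s\n' '$a=5; // see a/b' | strip_simple
# prints "$a=5; // see a/b"

# A URL with a path is safe for the same reason ...
printf '%s\n' 'http://foo/123' | strip_simple
# prints "http://foo/123"

# ... but a URL with no further slash at the end of a line is cut.
printf '%s\n' 'http://foo' | strip_simple
# prints "http:"
```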


Just in case you want to keep the //'s (since you say "delete everything... after the //") you could just put them back:

sed 's|//[^/]*$|//|'

Note: To remove all PHP comments, see the answers to "Best way to automatically remove comments from PHP code".

Xen2050
  • I edited my question above. Can you show me the sed command for deleting everything after //, including the //? Thanks. – noobee Oct 28 '18 at 18:55