I have a Cross-Site Scripting (XSS) issue on one of my sites. Right now I am using the following code to get each page's URL:

$pageurl = $_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
$pageurlencode = "http%3A%2F%2F".urlencode($pageurl);
$pageurl = "http://".$pageurl;

But when I place $pageurl into my Open Graph URL tag (og:url), there is a problem, because people can inject code there.

<meta property="og:url" content="<?php echo $pageurl; ?>" />

So, my question is, how do I modify my $pageurl so malicious code doesn't get added?


I searched Stack Overflow for similar issues, but couldn't find any that address this specific problem (there were plenty of XSS questions, but none that pointed to fixing a URL built from the request). So, if you do see a duplicate, let me know where you saw it. Thanks.

Hirad Roshandel
Jabbamonkey
  • What kind of injection are you experiencing? Anything coming through `$_SERVER` variables should already be sanitized. [See this related question.](https://stackoverflow.com/questions/25274615/how-to-inject-php-code-with-serverrequest-uri) – sheng Oct 26 '18 at 16:50
  • Someone ran a test on my site using the URL. They said that when they entered `https://www.example.com/?"><svg/onload=confirm(1)>` into the browser, inspecting the code showed the og:url as ... If that's true, they could place code to steal my cookies. – Jabbamonkey Oct 26 '18 at 17:06
  • Odd though... when I enter the same URL and inspect the code, I see http://www.example.com/?%22%3E%3Csvg/onload=confirm(1)%3E ... which is safe. Could you tell me how to explain to him that there is no issue? And why might he be seeing the code differently? – Jabbamonkey Oct 26 '18 at 17:07

2 Answers

Always use htmlspecialchars() on output when writing dynamic values (user input) into HTML source.
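
As a minimal sketch of that advice applied to the og:url tag from the question: the helper name `og_url_meta` and the sample payload are assumptions for illustration, not part of the original code. The `ENT_QUOTES` flag and the `'UTF-8'` charset are one reasonable choice for a double-quoted attribute. The second argument simulates a client that sends the request without percent-encoding it.

```php
<?php
// Hypothetical helper (not from the original post): builds the page URL the
// same way the question does, then escapes it for an HTML attribute value.
function og_url_meta(string $host, string $requestUri): string
{
    $pageurl = 'http://' . $host . $requestUri;

    // Escape &, <, >, " and ' so the value cannot break out of the attribute.
    $escaped = htmlspecialchars($pageurl, ENT_QUOTES, 'UTF-8');

    return '<meta property="og:url" content="' . $escaped . '" />';
}

// Simulated raw (not percent-encoded) request, as a non-browser client could send it.
echo og_url_meta('www.example.com', '/?"><svg/onload=confirm(1)>'), "\n";
// <meta property="og:url" content="http://www.example.com/?&quot;&gt;&lt;svg/onload=confirm(1)&gt;" />
```

In a real page the arguments would come from `$_SERVER['HTTP_HOST']` and `$_SERVER['REQUEST_URI']`. Both are derived from the client's request, so escaping at the point of output covers either one.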

hausl
  • So, just add a line at the end of my code above that says $pageurl = htmlspecialchars($pageurl); Is that correct? – Jabbamonkey Oct 26 '18 at 17:14
  • This way: `" />` If you use a charset other than UTF-8, set that instead. – hausl Oct 26 '18 at 17:48
  • Or do it as you described above, but remember that this changes the value of the variable if you also use it somewhere other than the HTML output. htmlspecialchars() is just for the context switch to(!) HTML. – hausl Oct 26 '18 at 18:06
  • `htmlspecialchars()` is far from a one-size-fits-all solution. You need to pair it with common-sense logic, because even sanitized user input can be malicious, such as the `javascript:` URI scheme, and there is a range of elements which should **NEVER** contain user input. – Cillian Collins Oct 27 '18 at 17:47
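
To illustrate the `javascript:` point from the last comment, here is a rough sketch of a scheme allowlist for URLs that end up in a link target. The function name `is_http_url` is hypothetical. It assumes only absolute http and https URLs should be accepted, and it complements escaping on output without replacing it.

```php
<?php
// Hypothetical helper: accept a URL only if its scheme is http or https.
function is_http_url(string $url): bool
{
    // parse_url() returns the scheme as a string, or null/false when there is none.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

var_dump(is_http_url('https://www.example.com/page')); // bool(true)
var_dump(is_http_url('javascript:alert(1)'));          // bool(false)
```

An allowlist is used here instead of a blocklist so that any scheme not explicitly expected is rejected by default.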

Your last comment was correct.

Query strings -- and any special characters, for that matter -- are percent-encoded by the browser before the request is sent. This sort of injection would only be possible if < > " etc. were 1) valid URL characters or 2) valid filename characters, and both sets exclude the HTML special characters. The only time you would have to worry about injection is if you were using $_GET or $_POST variables in PHP and outputting them directly, since those are decoded into plain text. In that case, you would use htmlentities to properly sanitize them. See this broader discussion on text injection.

Since you're directly accessing a valid URL string (through $_SERVER['REQUEST_URI']), you will never get a non-valid (to spec) URL.

The browser should automatically handle this encoding before sending it over to the server, and if not, the server should guard against any malformed URL. Either way, your script doesn't need to worry about handling that.

EDIT

It appears you can have characters such as < > in filenames on Mac/Linux; however, those will never be properly loaded by a server since they do not follow URL guidelines.

EDIT 2

I just tested this on a Node.js server, and it appears that it does manage to serve files with special characters. To be safe, always encode using htmlentities or htmlspecialchars as suggested on this page. That said, I can't imagine someone having a filename with special HTML characters -- that's basically XSSing yourself -- but it's a good thing to be aware of.

EDIT 3

Curiosity got the best of me, and I deployed a file named "<test.php to my NGINX server running on a Linux VM. Here's partial output of print_r($_SERVER):

 [DOCUMENT_URI] => /"<test.php
 [REQUEST_URI] => /%22%3Ctest.php
 [SCRIPT_NAME] => /"<test.php

Notice REQUEST_URI is still being encoded, even though the server is properly resolving the path. So, you can ignore my last edit, assuming you stick with REQUEST_URI. To reiterate, files should never be named with special characters, so this wouldn't be an issue either way.
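
The relationship between the decoded path and the encoded REQUEST_URI in that output can be reproduced with PHP's own functions. This is only an illustrative sketch: the variable names are made up, and the path is encoded per segment because `rawurlencode()` would otherwise also encode the slashes.

```php
<?php
// Decoded path, as reported in DOCUMENT_URI / SCRIPT_NAME above.
$decoded = '/"<test.php';

// Percent-encode each path segment separately so the slashes are preserved.
$encoded = implode('/', array_map('rawurlencode', explode('/', $decoded)));
echo $encoded, "\n";                 // /%22%3Ctest.php

// Decoding gives back the original characters, including " and <.
echo rawurldecode($encoded), "\n";   // /"<test.php

// If the decoded form is ever written into HTML, it needs escaping first.
echo htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8'), "\n"; // /&quot;&lt;test.php
```

The percent-encoded form contains no HTML special characters, while the decoded form does, which is why values such as SCRIPT_NAME would still need `htmlspecialchars()` before being echoed into a page.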

sheng