0

How would I go about getting everything after the hostname in javascript?

So far this is the regex I have, but I now need to capture everything from the first / after the hostname to the end of the string.

https?\:\/\/(.*)

String

http://www.myurl.com/en/country/belgium/

So for the string I need to capture:

/en/country/belgium/

I have been toying with this example even after reading up on regex. If anybody could take a couple of minutes to provide an example, that would be really nice.

Edit

To be clear, I am using document.referrer here, and to my knowledge it does not come with helpers like document.location does.

Stephan-v
  • 1
    Please try `window.location.pathname`. `window.location` provides very useful apis. You should check it – Rajesh Sep 26 '17 at 13:28
  • if you are using javascript in a browser, why don't you try with `window.location.pathname` – JavierFromMadrid Sep 26 '17 at 13:29
  • I can't, because this is based on document.referrer and not location.pathname. Somehow the formatting options you normally have there are not available. – Stephan-v Sep 26 '17 at 13:29
  • If this is in clientside code, there are other ways to get the parts of an URL -> https://stackoverflow.com/questions/46179432/why-use-anchor-href-property-to-process-url – adeneo Sep 26 '17 at 13:29
  • Not with document.referrer I am afraid. – Stephan-v Sep 26 '17 at 13:30
  • Again, any valid URL can be parsed by the browser – adeneo Sep 26 '17 at 13:31
  • See also: https://stackoverflow.com/questions/736513/how-do-i-parse-a-url-into-hostname-and-path-in-javascript – p.s.w.g Sep 26 '17 at 13:38

7 Answers

4

You should use the URL Class instead:

var url = new URL('http://www.myurl.com/en/country/belgium/');
console.log(url.pathname); // /en/country/belgium/

url;
/*
URL {
    hash: "",
    host: "www.myurl.com",
    hostname: "www.myurl.com",
    href: "http://www.myurl.com/en/country/belgium/",
    origin: "http://www.myurl.com",
    password: "",
    pathname: "/en/country/belgium/",
    port: "",
    protocol: "http:",
    search: "",
    searchParams: URLSearchParams {},
    username: ""
}
*/

More info: https://developer.mozilla.org/en-US/docs/Web/API/URL
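
Since the question is about document.referrer, which is an empty string when there is no referrer, a small wrapper may be useful. This is only a sketch: `getPathname` is a hypothetical helper name, and it assumes an environment where the URL class is available.

```javascript
// Hypothetical helper: returns the path of a URL string,
// or null when the string cannot be parsed as an absolute URL.
function getPathname(urlString) {
    try {
        return new URL(urlString).pathname;
    } catch (e) {
        // new URL() throws a TypeError on unparsable input such as ''
        return null;
    }
}

console.log(getPathname('http://www.myurl.com/en/country/belgium/')); // /en/country/belgium/
console.log(getPathname('')); // null
```

In the browser this would be called as getPathname(document.referrer).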

Adam
  • Thanks. This looks like the best approach. – Stephan-v Sep 26 '17 at 13:37
  • Please refer to [Browser compatibility](https://developer.mozilla.org/en-US/docs/Web/API/URL#Browser_compatibility) before using it. – Rajesh Sep 26 '17 at 13:37
  • And there go my dreams of a clean approach. – Stephan-v Sep 26 '17 at 13:40
  • Note: [Can I Use](https://caniuse.com/#feat=url) appears to suggest somewhat broader compatibility (e.g. MDN lists Edge support as "In Development", while Can I Use lists it as fully supported in Edge since version 14). Still, it's not a truly *universal* solution. – p.s.w.g Sep 26 '17 at 13:44
  • The URL class is really just syntactic sugar for creating an anchor and parsing the URL that way. You can do this easily without using the URL class, which lacks cross-browser support. – adeneo Sep 26 '17 at 13:52
2

Since you need to parse a URL held in a string, you can use a regex.

Logic:

  • Start matching with https?. This matches both http and https.
  • Then match ://.
  • Then match the hostname, which is everything up to the next /. Capture from that / to the end of the string.

var str = 'http://www.myurl.com/en/country/belgium/';
// [^\/]+ consumes the hostname; the group captures the path, including a bare "/"
var pathNameRegex = /https?:\/\/[^\/]+(\/.*)/;
var matches = str.match(pathNameRegex);
console.log(matches[1]); // /en/country/belgium/
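
A regex like this returns null from match() when the string is not an http(s) URL, and a URL with no path at all has nothing to capture. The following sketch guards against both cases; `pathFromRegex` is a hypothetical name, and returning '/' for a URL without a path is an assumption about the desired behaviour.

```javascript
// Hypothetical helper: extracts the path with a regex.
// Returns null for non-http(s) input and '/' when the URL has no path.
function pathFromRegex(str) {
    var match = /^https?:\/\/[^\/]+(\/.*)?$/.exec(str);
    if (match === null) {
        return null;
    }
    // match[1] is undefined when the optional path group did not participate
    return match[1] || '/';
}

console.log(pathFromRegex('http://www.myurl.com/en/country/belgium/')); // /en/country/belgium/
console.log(pathFromRegex('https://www.myurl.com'));                    // /
console.log(pathFromRegex('not a url'));                                // null
```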
Community
Rajesh
2

Use the URL object.

var url = new URL("http://www.myurl.com/en/country/belgium/");
console.log(url.pathname);

UPDATE: Using an anchor tag to polyfill URL (I'm not sure this is a complete polyfill for everything that URL does, but it should be enough for your task):

if (typeof URL === 'undefined') {
    var URL = function(url) {
        var a = document.createElement('a');
        a.href = url;
        return a;
    }
}

var url = new URL('https://www.example.com/pathname/');
var path = url.pathname;
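
Note that pathname contains only the path. If "everything after the hostname" should also include the query string and the fragment, the three parts can be joined. A minimal sketch, assuming the URL class is available; `afterHost` is a hypothetical helper name. Anchor elements expose the same `search` and `hash` properties, so the idea carries over to the anchor approach.

```javascript
// Hypothetical helper: path, query string and fragment of an absolute URL.
function afterHost(urlString) {
    var url = new URL(urlString);
    // search includes the leading '?', hash includes the leading '#';
    // both are empty strings when absent
    return url.pathname + url.search + url.hash;
}

console.log(afterHost('https://example.com/mypath/?key=value#top')); // /mypath/?key=value#top
console.log(afterHost('http://www.myurl.com/en/country/belgium/'));  // /en/country/belgium/
```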
Walk
  • 1
    That is brilliant. Even though `document.referrer` is just a plain string I can just create a new URL object and it will chop it up neatly into the properties that `window.location` usually has. I like this approach the best. Thanks man. – Stephan-v Sep 26 '17 at 13:36
  • Browsersupport looks terrible from what I have been reading so far. I will try this out since I find this really useful though: https://www.npmjs.com/package/url-polyfill – Stephan-v Sep 26 '17 at 13:44
  • You can try simpler polyfill using anchor tag (check my edit, can't paste formatted code in a comment). – Walk Sep 27 '17 at 07:25
  • This will drop the searchParams, hash etc. For example, `new URL('https://example.com/mypath/?key=value').pathname` will be just `/mypath/` – Eljo George Jul 02 '21 at 07:24
1

Just create an anchor and let the browser parse it. Works everywhere.

var a  = document.createElement('a');
a.href = 'http://www.myurl.com/en/country/belgium/'; // or document.referrer

var path = a.pathname;

console.log(path);
adeneo
0

Without regex, you can use the following:

var href = window.location.href;            // or document.referrer
var pathArray = href.split('/');
var protocol = pathArray[0];                // e.g. "http:"
var host = pathArray[2];                    // e.g. "www.myurl.com"
var baseUrl = protocol + '//' + host;
var nonBaseUrl = href.replace(baseUrl, ''); // everything after the host
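
The same split idea can be applied to any URL string, such as document.referrer, rather than to the current location. A minimal sketch; `pathBySplit` is a hypothetical name, and it assumes an absolute URL of the form scheme://host/...

```javascript
// Hypothetical helper: everything after the host, using split instead of a regex.
function pathBySplit(urlString) {
    // 'http://host/a/b/'.split('/') gives ['http:', '', 'host', 'a', 'b', '']
    var parts = urlString.split('/');
    // Index 0 is the protocol, 1 is empty, 2 is the host; the rest is the path
    return '/' + parts.slice(3).join('/');
}

console.log(pathBySplit('http://www.myurl.com/en/country/belgium/')); // /en/country/belgium/
console.log(pathBySplit('http://www.myurl.com'));                     // /
```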
mrid
0

You can achieve that with a simple replace.

var url = 'http://www.myurl.com/en/country/belgium/';
var path = url.replace(/https?:\/\/[^\/]+/, '');

console.log(path); // prints /en/country/belgium/

But if you want to capture the path, you can use the same regex with a capture group:

var url = 'http://www.myurl.com/en/country/belgium/';

var regex = /https?:\/\/[^\/]+(.*)/;
var match = regex.exec(url);

console.log(match[1]); // prints /en/country/belgium/
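
A note on the g flag with exec: a regex carrying the g flag remembers where its last match ended (lastIndex), so reusing one regex object for several exec calls can return null unexpectedly. The sketch below illustrates this; the variable names are only for illustration.

```javascript
var referrerUrl = 'http://www.myurl.com/en/country/belgium/';

// With the g flag, exec() resumes from lastIndex on the next call
var globalRegex = /https?:\/\/[^\/]+(.*)/g;
var firstMatch = globalRegex.exec(referrerUrl);
var secondMatch = globalRegex.exec(referrerUrl);

console.log(firstMatch[1]); // /en/country/belgium/
console.log(secondMatch);   // null, because the search resumed at the end of the string

// Without the g flag, every call starts from the beginning
var plainRegex = /https?:\/\/[^\/]+(.*)/;
var repeatOne = plainRegex.exec(referrerUrl);
var repeatTwo = plainRegex.exec(referrerUrl);

console.log(repeatOne[1] === repeatTwo[1]); // true
```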
Cheloide
-1

I suggest:

/https?:\/\/[^\s\/]*(\/\S*)/

[^\s\/] is a character class that excludes whitespace characters and slashes.

\S is a shorthand character class that matches any character except whitespace.

Note that : isn't a special character and doesn't need to be escaped.
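
Because this pattern stops at whitespace, it can also pull the path out of a URL embedded in longer text. A small sketch of that use; `findPathInText` is a hypothetical name and the sample sentence is made up for illustration.

```javascript
// Hypothetical helper: path of the first http(s) URL found in a piece of text,
// or null when the text contains no URL with a path.
function findPathInText(text) {
    var match = text.match(/https?:\/\/[^\s\/]*(\/\S*)/);
    return match ? match[1] : null;
}

console.log(findPathInText('Referred from http://www.myurl.com/en/country/belgium/ yesterday'));
// /en/country/belgium/
console.log(findPathInText('no link here')); // null
```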

Casimir et Hippolyte