82

I have run into an odd situation. I'm writing a JavaScript bookmarklet that lets users share external websites to our website quickly and easily. It simply gets the title and the page URL, and if they've selected any text on the page, it grabs that too.

The problem is that it doesn't work with external domains for some reason. If we use it internally, we end up with a share window whose URL is formatted like this:

http://internaldomain.com/sharetool.php?shareid=http://internaldomain.com/anotheroddpage.html&title=....

That works just fine. But if we use it on an external domain, we end up with a URL formatted like this:

http://internaldomain.com/sharetool.php?shareid=http://externaldomain.com/coolpagetoshare.html&title=...

Then we get a Forbidden error on our page and it won't load. If we manually remove the http:// from the external domain's address, it loads just fine again.

So I'm thinking the best way around this problem is to modify the JavaScript bookmarklet to strip the http:// before it opens the window. Here is how my current bookmarklet looks:

javascript:var d=document,w=window,e=w.getSelection,k=d.getSelection,x=d.selection,s=(e?e():(k)?k():(x?x.createRange().text:0)),f='http://internaldomain.com/sharetool.php',l=d.location,e=encodeURIComponent,u=f+'?u='+e(l.href)+

As you can see, e(l.href) is where the URL is passed.

How can I modify that so it removes the external domain's http://?

Peter Mortensen
Peter

5 Answers

222

I think it would be better to take into account all possible protocols.

result = url.replace(/(^\w+:|^)\/\//, '');
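To make the behaviour of that expression concrete, here is a minimal sketch that wraps it in a function and runs it against a few inputs. The name `stripProtocol` is hypothetical and only used for illustration; the regex itself is the one from the answer.

```javascript
// Hypothetical helper; the regex is taken unchanged from the answer above.
function stripProtocol(url) {
  // Removes a leading "scheme://" or a leading protocol-relative "//".
  // Anchored with ^, so a "://" later in the string is left alone.
  return url.replace(/(^\w+:|^)\/\//, '');
}

console.log(stripProtocol('http://externaldomain.com/coolpagetoshare.html'));
// → externaldomain.com/coolpagetoshare.html
console.log(stripProtocol('https://externaldomain.com/'));
// → externaldomain.com/
console.log(stripProtocol('//externaldomain.com/x'));
// → externaldomain.com/x
console.log(stripProtocol('externaldomain.com/x'));
// → externaldomain.com/x (no protocol, returned unchanged)
```

In the bookmarklet, the same replace would presumably be applied to `l.href` before it is passed to `e(...)`, that is, `e(l.href.replace(/(^\w+:|^)\/\//, ''))`.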
Forivin
FailedDev
  • That worked like a charm '+e(l.href.replace(/.*?:\/\//g, "") – Peter Nov 21 '11 at 01:32
  • 7
    This is a very poor regex. .*? - means ungreedy match, but /g modifier forces the expression to be applied many times (i.e. cut all found protocols?). Also expression has no ^ to match the start. Better one: /^.*?:\/\// – disjunction Mar 17 '14 at 15:22
  • 6
    @disjunction Even ignoring your comments, that was precisely why this regex was written like this, as THIS is clearly stated in the answer. – FailedDev Mar 18 '14 at 09:41
  • 2
    Please note that in real web pages relative protocol `//` is a common practice https://www.paulirish.com/2010/the-protocol-relative-url/. So I suggest regexp `/^\/\/|^.*?:\/\//` (you can make it better I'm sure) – Dan Nov 07 '16 at 10:19
  • @Dan, good call! So let's take it even further and make this work with 'mailto:' with this edit: `.replace(/^\/\/|^.*?:(\/\/)?/, '');` – gdibble Aug 24 '17 at 19:11
  • still does not support a case of '//:www.google.com' – Vad.Gut Jun 11 '18 at 10:58
  • One question: why do we need an alternation in this regex? Why not just `(^\w+:|^)`? – Rahul Aug 02 '18 at 14:06
  • Umm... // denotes the start of a comment in js. So `//, '');` and everything after it in the line becomes a comment. – user756659 Feb 20 '19 at 19:44
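The variants suggested in these comments can be compared side by side. The sketch below uses the expression from gdibble's comment; `stripScheme` is a hypothetical name. Note that the `\/\/` inside a regex literal is a pair of escaped slashes, so the line parses as a regex and not as a comment. The last call shows a trade-off of the broader pattern: because `^.*?:` matches up to the first colon, a scheme-less `host:port` input loses its host.

```javascript
// Hypothetical helper using the pattern from the comment thread.
function stripScheme(url) {
  // Branch 1: leading protocol-relative "//".
  // Branch 2: everything up to the first ":" plus an optional "//".
  return url.replace(/^\/\/|^.*?:(\/\/)?/, '');
}

console.log(stripScheme('https://example.com/a'));       // → example.com/a
console.log(stripScheme('//example.com/a'));             // → example.com/a
console.log(stripScheme('mailto:someone@example.com'));  // → someone@example.com

// Caveat: with no scheme present, the host before the port colon is removed.
console.log(stripScheme('example.com:8080/path'));       // → 8080/path
```

If the input is always a full `location.href`, the caveat does not arise, since the string always begins with a scheme.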
59
url = url.replace(/^https?:\/\//, '')
Darryl Hein
matsko
7
l.href.replace(/^http:\/\//, '')
shyam
2

I think the regular expression you need is /(?:http:\/\/)(.*)/i (the slashes must be escaped inside a regex literal). The first capture group of the match should be what you want.
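A minimal sketch of how the capture-group approach might be used, with the slashes escaped so the literal is valid JavaScript. The function name `afterHttp` is hypothetical. Because the pattern is not anchored and only covers `http`, an `https` URL produces no match and is returned unchanged here, and an `http://` appearing later in the string would also match.

```javascript
// Hypothetical helper built around the capture-group pattern.
function afterHttp(url) {
  // Group 1 captures everything after the first "http://" (case-insensitive).
  const m = url.match(/(?:http:\/\/)(.*)/i);
  // match() returns null when nothing matches, so fall back to the input.
  return m ? m[1] : url;
}

console.log(afterHttp('http://externaldomain.com/coolpagetoshare.html'));
// → externaldomain.com/coolpagetoshare.html
console.log(afterHttp('HTTP://Example.com/a'));  // → Example.com/a
console.log(afterHttp('https://example.com/a')); // → https://example.com/a
```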

Peter Mortensen
idanzalz
-6

Try using the replace function on the already-encoded URL:

var url = url.replace("http%3A%2F%2F", "");
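This works on the output of `encodeURIComponent`, where `http://` has already become `http%3A%2F%2F`. The sketch below, using a made-up example URL, suggests that stripping the encoded prefix after encoding gives the same result as stripping `http://` before encoding. A string pattern passed to `replace` only replaces the first occurrence, and this form does not cover `https`.

```javascript
// Example URL is an assumption, chosen only for illustration.
const pageUrl = 'http://externaldomain.com/coolpagetoshare.html';

// Option A: encode first, then remove the encoded "http://" prefix.
const encodedThenStripped = encodeURIComponent(pageUrl).replace('http%3A%2F%2F', '');

// Option B: remove "http://" first, then encode.
const strippedThenEncoded = encodeURIComponent(pageUrl.replace(/^http:\/\//, ''));

console.log(encodedThenStripped); // → externaldomain.com%2Fcoolpagetoshare.html
console.log(encodedThenStripped === strippedThenEncoded); // → true
```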
Ram
  • This is not ideal because it doesn't use a regular expression. With simple text replacement like this, you would need to chain several `.replace()` calls to accommodate all the different variations needed (http, https, etc.). – gdibble Aug 24 '17 at 19:13