house home go https://www.monstermmorpg.com
nice hospital http://www.monstermmorpg.com 
this is incorrect url http://www.monstermmorpg.commerged 
continue

I want to extract all the URLs that start with http:// or https://.

I tried this regex, but I get nothing:

$('links').value = stringText.match("\b(?:http://|https://)\S+\b/");
MNS
  • just place `.*?` at start and at the end of your pattern `".*?\b(?:http://|https://)\S+\b/.*?"` – shift66 Jul 12 '13 at 14:40
  • Here's it "working": http://jsfiddle.net/PTFBt/ . At least it's a start – Ian Jul 12 '13 at 14:48
  • And by the way, to find `http` or `https`, you can use: `https?:\/\/` instead of your "or" with `|` – Ian Jul 12 '13 at 14:50
  • Can anyone explain why `http://www.monstermmorpg.commerged` is incorrect? – Tommi Jul 12 '13 at 14:57
  • @Tommi I'm thinking because it doesn't end with a "valid" top-level domain. But at the same time, who knows what these URLs are, and what the possible top-level domains are – Ian Jul 12 '13 at 15:00
  • I don't believe that validating the top-level domain is possible with a regex. There are many of them, and one (.museum) is fairly long; your intranet admin can also create an even longer local domain, so rejecting a domain by its length is wrong. – Tommi Jul 12 '13 at 15:04
  • @Tommi There's no need to tell me. I'm not the OP, I was just trying to answer your question, which I really don't have an answer for. – Ian Jul 12 '13 at 15:08
  • Sure. I wasn't addressing you personally; I think the OP reads the comments as well. – Tommi Jul 12 '13 at 15:14
  • @user2463937 It does "work", but you need to explain better what you expect to happen. Here's another example: http://jsfiddle.net/PTFBt/2/ – Ian Jul 12 '13 at 16:42
  • I want to extract the URLs, but I get null instead. – MNS Jul 12 '13 at 16:48
  • Sorry, with the last one it worked, thanks a lot for your help. – MNS Jul 12 '13 at 16:51
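
A likely reason the original call returns null, sketched here as an illustration rather than taken from the linked fiddles: `match()` was given a string, and inside a JavaScript string literal `"\b"` is a backspace character and `"\S"` is just `"S"`, so the pattern built from it never matches ordinary text. The variable names below are made up for the example, and the sample text is the one from the question.

```javascript
var text = "house home go https://www.monstermmorpg.com " +
  "nice hospital http://www.monstermmorpg.com " +
  "this is incorrect url http://www.monstermmorpg.commerged continue";

// Pattern passed as a string literal, as in the question:
// "\b" becomes a backspace character and "\S" becomes a plain "S".
var asString = text.match("\b(?:http://|https://)\S+\b/");

// Same idea as a regex literal with the g flag, using https? as suggested
// in the comments and without the stray trailing slash.
var asRegex = text.match(/\bhttps?:\/\/\S+\b/g);

console.log(asString); // null
console.log(asRegex);  // all three URLs, including the "commerged" one
```

Note that this version does not try to reject `http://www.monstermmorpg.commerged`; it only shows the difference between a string pattern and a regex literal.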

1 Answer

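    var text = "house home go https://www.monstermmorpg.com hospital "
     + " http://www.monstermmorpg.io"
     + " this is incorrect url http://www.monstermmorpg.commerged";

    // Call match() on the complete string; chained directly onto the
    // concatenation it would apply to the last literal only.
    var urls = text.match(/\b(https?:\/\/.*?\.[a-z]{2,4}\b)/g);

    // Only the first two URLs match:
    // ["https://www.monstermmorpg.com", "http://www.monstermmorpg.io"]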
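
As the comments point out, limiting the top-level domain to 2-4 letters also rejects real ones such as .museum. A minimal sketch of that trade-off follows, assuming a hypothetical helper `extractUrls` and a hypothetical `maxTldLength` parameter (neither is part of the answer), with `http://example.museum` as a made-up sample URL. The helper also falls back to an empty array, since `match()` returns null when nothing matches.

```javascript
// Hypothetical helper, not part of the answer above.
// maxTldLength is an assumed parameter; the answer hard-codes 4.
function extractUrls(text, maxTldLength) {
  var pattern = new RegExp(
    "\\bhttps?:\\/\\/.*?\\.[a-z]{2," + maxTldLength + "}\\b",
    "g"
  );
  // match() returns null when nothing matches, so fall back to [].
  return text.match(pattern) || [];
}

var sample = "go https://www.monstermmorpg.com or http://example.museum today";

var strict = extractUrls(sample, 4); // same limit as the answer
var loose = extractUrls(sample, 6);  // long enough for "museum"

console.log(strict);
console.log(loose);
```

With the limit at 4 only the .com URL is returned; raising it to 6 also picks up the .museum one, at the cost of accepting more strings that merely look like a domain.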
raam86