I am using a regular expression to capture a URL string, but it's not working out.

Here's my code, which is coming from an external JavaScript document, linked to the HTML file:

var url = 'The URL was www.google.com';
var urlRegEx = /((\bhttps?:\/\/)|(\bwww\.))\S*/;
var urlRegMatch = url.match(urlRegEx);
document.write(urlRegMatch);

The output that I get is this: "www.google.com,www.,,www."

What am I doing wrong?

Thanks! :)

Xyce Bedet
  • The match() method searches a string for a match against a regular expression and returns the matches as an Array object. What are you trying to achieve here? – Treesa Sep 20 '18 at 03:58
  • Just a quick comment -- parens means capture, i.e., regex m/(x)/ will return "x" for the match. There are no "https:" to capture here, yet you use parens, and it captures it. You probably want parens around "https:|www.", not around "(https:)" and "(www.)" individually, etc.. – HoldOffHunger Sep 20 '18 at 03:59
  • You are getting extra items because you have capture groups (between the parentheses). The first item is the whole match; the others are the capture groups. – Get Off My Lawn Sep 20 '18 at 03:59
  • @Treesa I'm just doing some testing with regular expressions, which I'm pretty new with. I'm just trying to print www.google.com to my web page from the string in the variable "url." – Xyce Bedet Sep 20 '18 at 04:03
  • I found some results here. Maybe you can try https://stackoverflow.com/questions/6038061/regular-expression-to-find-urls-within-a-string – Treesa Sep 20 '18 at 04:04
  • You can use this regex: urlRegEx = /((([A-Za-z]{3,9}:(?:\/\/)?)(?:[\-;:&=\+\$,\w]+@)?[A-Za-z0-9\.\-]+|(?:www\.|[\-;:&=\+\$,\w]+@)[A-Za-z0-9\.\-]+)((?:\/[\+~%\/\.\w\-_]*)?\??(?:[\-\+=&;%@\.\w_]*)#?(?:[\.\!\/\\\w]*))?)/. The match function returns an array object. Read more here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match – DinhNguyen Sep 20 '18 at 04:08
  • created a codepen with the result https://codepen.io/treesa/pen/bxzNEb – Treesa Sep 20 '18 at 05:16

2 Answers

Your regex works fine. What happens is that match() returns an array: the first element is the whole match, followed by one extra element for each capture group (each pair of parentheses) in your regex. To get only the whole match, take the first item of the returned array, like this:

var url = 'The URL was www.google.com';
var urlRegEx = /((\bhttps?:\/\/)|(\bwww\.))\S*/;
var urlRegMatch = url.match(urlRegEx)[0]; // changed line: [0] selects only the whole match
document.write(urlRegMatch);

Working example: https://jsfiddle.net/2sptg5rz/3/

Reference for the match function: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match#Return_value
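
To make the array contents concrete, here is a minimal sketch that compares the original pattern with a variant using non-capturing groups. The variant and the names `capturing`, `nonCapturing`, `withGroups`, and `withoutGroups` are illustrative assumptions, not part of the original code; `(?:...)` groups the alternatives without adding entries to the result.

```javascript
var url = 'The URL was www.google.com';

// Original pattern: three capturing groups, so match() returns
// the whole match plus one entry per group (four entries in total).
var capturing = /((\bhttps?:\/\/)|(\bwww\.))\S*/;
var withGroups = url.match(capturing);
// withGroups[0] is the whole match; [1] to [3] are the groups.
// Group 2 (the https branch) did not take part in the match, so it is
// undefined, which prints as nothing between the two commas.

// Assumed alternative: (?:...) groups without capturing,
// so the result holds only the whole match.
var nonCapturing = /(?:\bhttps?:\/\/|\bwww\.)\S*/;
var withoutGroups = url.match(nonCapturing);

console.log(String(withGroups));   // www.google.com,www.,,www.
console.log(withGroups.length);    // 4
console.log(withoutGroups.length); // 1
console.log(withoutGroups[0]);     // www.google.com
```

Either approach gives the same whole match at index 0; the non-capturing form just avoids the extra entries if you never need them.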

I'd like to thank @Dknacht and @Treesa for directing me to the correct answer: the first element of the array, at index 0, is the matched text.

This code works for what I want to do:

var url = 'The URL was www.google.com';
var urlRegEx = /((\bhttps?:\/\/)|(\bwww\.))\S*/;
var urlRegMatch = url.match(urlRegEx);
document.write(urlRegMatch[0]);
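
One caveat: match() returns null when nothing matches, so indexing with [0] would throw on a string that contains no URL. A small sketch of a guard, assuming a hypothetical helper named `firstUrl` (not part of the code above):

```javascript
// Hypothetical helper: returns the first URL-like substring, or null
// when the text contains nothing that the pattern matches.
function firstUrl(text) {
  var urlRegEx = /((\bhttps?:\/\/)|(\bwww\.))\S*/;
  var m = text.match(urlRegEx);
  // match() yields null on no match, so check before reading index 0.
  return m ? m[0] : null;
}

console.log(firstUrl('The URL was www.google.com')); // www.google.com
console.log(firstUrl('No link in this sentence'));   // null
```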
Xyce Bedet