Hi, I have this regex to match URLs, but I need it to match subdomains too.

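public function getUrls($url){
    // Collect every substring that starts with "www." or "http(s)://",
    // followed by name.tld and any trailing non-space characters.
    preg_match_all("#(www\.|https?://)[a-zA-Z0-9]{2,}\.[a-zA-Z0-9]{2,}(\S*)#i", $url, $matches);
    return $matches[0];
}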

This matches http://domain.com but not http://sub.domain.com

Any idea how to make it work?

greenbandit

1 Answer

Replace [a-zA-Z0-9]{2,}\. with ([a-zA-Z0-9]{2,}\.)+ so that one or more dot-terminated labels can precede the top-level domain. Note that the regex still matches a lot of invalid domains, and probably won't match all valid URLs either. It would be wiser to use a proper URL parser if the language you're using provides one.
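
For illustration, here is a rough sketch of both suggestions in PHP. It assumes a standalone getUrls() function rather than the class method from the question, and isPlausibleUrl() is a hypothetical helper name of my own; it relies on PHP's built-in parse_url() and filter_var() as a stand-in for "a proper parser". Treat it as a starting point, not a complete URL validator.

```php
<?php
// Sketch only: getUrls() mirrors the question's function with the group repeated;
// isPlausibleUrl() is a hypothetical helper, not part of the original code.

function getUrls(string $text): array
{
    // ([a-zA-Z0-9]{2,}\.)+ allows one or more dot-terminated labels,
    // so sub.domain.com is accepted as well as domain.com.
    preg_match_all(
        '#(www\.|https?://)([a-zA-Z0-9]{2,}\.)+[a-zA-Z0-9]{2,}(\S*)#i',
        $text,
        $matches
    );
    return $matches[0];
}

function isPlausibleUrl(string $candidate): bool
{
    // parse_url() only reports a host when a scheme is present,
    // so prepend one for bare "www." matches (an assumption of this sketch).
    if (stripos($candidate, 'www.') === 0) {
        $candidate = 'http://' . $candidate;
    }
    $host = parse_url($candidate, PHP_URL_HOST);

    return is_string($host)
        && $host !== ''
        && filter_var($candidate, FILTER_VALIDATE_URL) !== false;
}

// Usage: extract candidates with the regex, then keep those the parser accepts.
$urls = array_filter(
    getUrls('see http://sub.domain.com/path and www.example.org'),
    'isPlausibleUrl'
);
```

The regex stays deliberately loose here and only finds candidates; the second step is where stricter checks would go.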

markijbema