I need to remove all URL prefixes (`http://`, `https://`, `www.`) from a string.

Example inputs

1) lorum ipsum www.goal.com

2) lorum ipsum http://www.goal.com

3) lorum ipsum https://www.goal.com

4) lorum ipsum https://www.goal.com/1234

5) lorum ipsum www.

Expected output

1) lorum ipsum goal.com

2) lorum ipsum goal.com

3) lorum ipsum goal.com

4) lorum ipsum goal.com/1234

5) lorum ipsum www.

So far I have this:

function removeLinks($url) {
    //$url = preg_replace("~^www\.~", "", $url);
    $disallowed = array('http://', 'https://', 'www.');
    foreach ($disallowed as $d) {
        $url = str_replace($d, '', $url);
    }
    return $url;
}

echo removeLinks('lorum ipsum http://www.goal.com and https://www.goal.com and https://www.goal.com/4234234 and www.goal.com and not remove www.');

But this also removes the `www.` at the end of the string, which I don't want removed. Is there a workaround that fixes this?

  • Does this answer your question? [Remove URL Prefix from String (http:/, www, etc.)](https://stackoverflow.com/questions/16673628/remove-url-prefix-from-string-http-www-etc) – anotherGatsby Mar 27 '22 at 08:36
  • Sorry, no, that didn't help. – Soju Varughese Mar 27 '22 at 08:48
  • You can simplify the selected answer on that question to meet your need. Anyways I have posted the simplified version. Check if it works. – anotherGatsby Mar 27 '22 at 09:28
  • Please describe why `www.` should not be removed in case 5. Is it because it is followed by the end of the string? Is it because it is not followed by at least one non-space character? You need to give the exact criteria for removal. Examples alone do not cut it. – MikeM Mar 27 '22 at 09:53
  • Because it's not part of a URL. – Soju Varughese Mar 27 '22 at 09:56
  • Testing for a valid URL requires a very long and complicated regex. What are your criteria for determining whether `www.` is part of a URL? – MikeM Mar 27 '22 at 10:07
  • It contains a valid TLD. – Soju Varughese Mar 27 '22 at 10:11
  • So you only want to remove `www.` if it is part of a URL with a valid top-level domain? Again, you have to decide how you want to determine that `www.` is part of a URL. See [What is a good regular expression to match a URL?](https://stackoverflow.com/questions/3809401/what-is-a-good-regular-expression-to-match-a-url) for example. Until you choose how strict you want to be in identifying a URL, it is difficult to give a proper regex solution here. – MikeM Mar 27 '22 at 11:21

2 Answers

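Try this: `(http[s]?:\/\/)?www\.(?=.+)`

The lookahead `(?=.+)` requires at least one more character after `www.`, so a `www.` at the very end of the string is left alone.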

Test regex here: https://regex101.com/r/fjScQn/2
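
A minimal sketch of how that pattern could be applied in PHP with `preg_replace`. The function name `removeUrlPrefixes` is hypothetical, and `~` is used as the delimiter so the slashes need no escaping. Note the assumptions baked into the pattern: a scheme is only removed when `www.` follows it, and a `www.` followed by a space and more text would still be removed.

```php
<?php
// Hypothetical helper name; the pattern is the one suggested above.
function removeUrlPrefixes(string $text): string
{
    // Optional scheme, then "www.", then a lookahead that demands
    // at least one more character on the same line.
    return preg_replace('~(https?://)?www\.(?=.+)~', '', $text);
}

echo removeUrlPrefixes('lorum ipsum http://www.goal.com and www.goal.com and not remove www.');
// lorum ipsum goal.com and goal.com and not remove www.
```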

anotherGatsby

function getCleanURL($url) {
    // Anchored with ^, so only a prefix at the very start of the string is stripped.
    $pattern = '#^(https?://)?w{3}\.#';
    // Leave a bare "www." untouched.
    if ($url === 'www.') {
        return $url;
    }
    return preg_replace($pattern, '', $url);
}

echo getCleanURL('www.goal.com'), PHP_EOL;              // goal.com
echo getCleanURL('http://www.goal.com'), PHP_EOL;       // goal.com
echo getCleanURL('https://www.goal.com'), PHP_EOL;      // goal.com
echo getCleanURL('https://www.goal.com/1234'), PHP_EOL; // goal.com/1234
echo getCleanURL('www.'), PHP_EOL;                      // www.
Biswajit Biswas
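
For URLs embedded in a longer sentence, and for the "followed by something with a TLD" criterion raised in the comments, one possible sketch is below. The function name `stripUrlPrefixesInText` is hypothetical, and the lookahead only checks that `www.` is followed by something shaped like `name.tld`; it is an assumed approximation and does not validate against the real list of top-level domains.

```php
<?php
// Hypothetical helper; "looks like a domain" is an assumed criterion,
// not a check against the actual TLD registry.
function stripUrlPrefixesInText(string $text): string
{
    // Optional scheme, then "www.", removed only when followed by
    // a label, a dot, and at least two letters (e.g. "goal.com").
    $pattern = '~\b(?:https?://)?www\.(?=[a-z0-9-]+\.[a-z]{2,})~i';
    return preg_replace($pattern, '', $text);
}

echo stripUrlPrefixesInText('lorum ipsum https://www.goal.com/1234 and www.goal.com and not remove www.');
// lorum ipsum goal.com/1234 and goal.com and not remove www.
```

Unlike a plain "any character follows" check, this leaves `www.` alone when it is followed by a space or by text that does not resemble a domain.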