-1

My code works, but I want to know whether it is bad practice, because I suspect it is. I tried several preg_replace patterns, but none of them seemed to work, so I just wrote it like this.

As input I expect a URL such as

google.com www.google.com http://google.com

or

http://www.google.com

As a result I need

google.com

My code:

    $website = trim($website); // removes whitespace characters
    $website = trim($website, '/');
    $website = trim($website, 'http://');
    $website = trim($website, 'www.');
rene
user1505027
  • So you want to strip the whole `http://www.` from it? – putvande Dec 30 '13 at 15:51
  • http://stackoverflow.com/questions/6738752/regex-for-dropping-http-and-www-from-urls – aebersold Dec 30 '13 at 15:51
  • Contrary to popular belief (esp. among Management), the `www.` string is not the standard protocol prefix for web sites. That'd be `http:` and `https:`. Stripping `www.` blindly can eventually just break the URL – Álvaro González Jan 03 '17 at 12:45

3 Answers

5

trim treats its second argument as a set of characters, not as a string: it strips every leading and trailing character that appears in that set, so www. is the same mask as .w.
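To make the character-set behaviour concrete, here is a small sketch using hosts chosen purely for illustration (they are not from the question). It shows why the chained trim calls happen to work for google.com but can mangle other inputs.

```php
<?php
// Order and repetition in the mask do not matter: 'www.' behaves like '.w'.
$sameMask = trim('www.example.com', 'www.') === trim('www.example.com', '.w');
var_dump($sameMask); // bool(true)

// A host that itself begins with "w" loses that letter as well.
$wiki = trim('www.wikipedia.org', 'www.');
echo $wiki, "\n"; // ikipedia.org

// 'http://' as a mask strips any of h, t, p, :, / from BOTH ends.
$photo = trim('http://photo.net', 'http://');
echo $photo, "\n"; // oto.ne
```

The last two results are the reason this approach is risky: the output depends on which letters the host name starts and ends with.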

You're looking for preg_replace with a regex of ^(https?://)?(www\.)?:

$website = preg_replace('~^(https?://)?(www\.)?~i', '', $website);


Autopsy:

  • ^ - anchors the match to the start of the string (so the prefix is only removed when it appears at the very beginning)
  • (https?://)?
    • http - the literal string http
    • s? - an optional s (in case we use https)
    • :// - the literal string ://
    • ? - makes the whole thing optional
  • (www\.)?
    • www\. - the literal string www. (you need to escape the . to \. as . means "any character")
    • ? - makes the whole thing optional
  • i - this is the modifier; i makes the whole pattern case-insensitive (it will match both HTTP and http)
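Putting the pieces together, a minimal sketch might look like the following. The function name stripUrlPrefix is hypothetical, and the rtrim call for the trailing slash is an assumption carried over from the question's own code rather than part of the pattern above.

```php
<?php
// Hypothetical helper; the pattern is the one explained in this answer.
function stripUrlPrefix(string $website): string
{
    $website = trim($website);        // drop surrounding whitespace
    $website = rtrim($website, '/');  // drop trailing slashes only
    // Remove an optional scheme and an optional "www." at the start of the string.
    // Fall back to the input if preg_replace() reports an error (returns null).
    return preg_replace('~^(https?://)?(www\.)?~i', '', $website) ?? $website;
}

$inputs = ['google.com', 'www.google.com', 'http://google.com', 'HTTP://WWW.google.com/'];
foreach ($inputs as $input) {
    echo stripUrlPrefix($input), "\n"; // each line prints google.com
}
```

Because the pattern is anchored with ^, a "www." or "http" that appears later in the string is left alone.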


h2ooooooo
2

KIS: Keep It Simple.

http://www.php.net/parse_url

From the docs:

<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';

print_r(parse_url($url));

echo parse_url($url, PHP_URL_PATH);
?>

Array
(
    [scheme] => http
    [host] => hostname
    [user] => username
    [pass] => password
    [path] => /path
    [query] => arg=value
    [fragment] => anchor
)

EDIT: Once you have the host, see "PHP Getting Domain Name From Subdomain".
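Applied to the question's inputs, the parse_url approach could be sketched as below. The helper name hostWithoutWww is hypothetical, and prepending http:// is an assumption made here because parse_url only reports a host when a scheme (or a leading //) is present; a bare google.com would otherwise be returned as the path.

```php
<?php
// Hypothetical helper: extract the host with parse_url(), then drop a leading "www.".
function hostWithoutWww(string $url): ?string
{
    $url = trim($url);
    // Assumption: inputs without a scheme are treated as http:// URLs.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // malformed URL or no host component
    }
    return preg_replace('~^www\.~i', '', $host);
}

echo hostWithoutWww('http://www.google.com/'), "\n"; // google.com
echo hostWithoutWww('google.com'), "\n";             // google.com
```

Unlike a plain prefix replacement, this also discards any path, query string, or credentials, which may or may not be what is wanted.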

Anyone
  • Even if you used `parse_url($url, PHP_URL_HOST)` (I have no clue why you use `path`), you'd still get `www.google.com` which is not what OP wants. [DEMO](http://codepad.org/ETWsQ4yQ). – h2ooooooo Dec 30 '13 at 16:00
  • But it's a lot easier to obtain the actual domain from this. – Anyone Dec 30 '13 at 16:08
0

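Classic use case for RegEx. This snippet removes the http(s) and www prefixes, plus a trailing slash. Note that the pattern uses ~ as its delimiter so the slashes in :// need no escaping, the dot in www\. is escaped, and the capture group is lazy so the optional trailing slash is not swallowed by it.

$new_url = preg_replace('~^(?:https?://)?(?:www\.)?(.*?)/?$~i', '$1', $url);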

aebersold