I have a PHP script that removes the "http://" from user-supplied URL strings.

My Script:

$url = "http://techcrunch.com/startups/";
$url = str_replace('http://', '', $url);

Result:

$url = "techcrunch.com/startups/"

This works great, except that some URLs start with "https://" instead. Is there a way to remove everything before the domain name, whatever the protocol is?

  • 9
    This question has to have been answered a thousand times. http://stackoverflow.com/questions/9549866/php-regex-to-remove-http-from-string http://stackoverflow.com/questions/4875085/php-remove-http-from-link-title http://stackoverflow.com/questions/12415815/remove-http-from-url-string etc – user1477388 Dec 31 '13 at 22:34
  • 3
    http://si1.php.net/parse_url – Glavić Dec 31 '13 at 22:35
  • Not to forget https://pear.php.net/package/Net_URL2 – hakre Dec 31 '13 at 22:52
  • Remember to search before posting a question. The second Google result for "remove http from url string php" is this: http://stackoverflow.com/questions/9549866/php-regex-to-remove-http-from-string – TheYaXxE Mar 14 '14 at 10:16
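
The parse_url() suggestion from the comments can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the helper name strip_scheme_parsed is hypothetical, and it assumes that dropping any user:password part together with the scheme is acceptable.

```php
<?php
// Hypothetical helper (not from the thread): strip the scheme by parsing the URL.
function strip_scheme_parsed(string $url): string
{
    $parts = parse_url($url);

    // parse_url() returns false on seriously malformed input. If there is
    // no scheme or no host, there is nothing to strip, so return the input.
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url;
    }

    // Rebuild everything after the scheme. Assumption: credentials
    // (user:pass@) are intentionally not carried over.
    $out = $parts['host'];
    if (isset($parts['port'])) {
        $out .= ':' . $parts['port'];
    }
    if (isset($parts['path'])) {
        $out .= $parts['path'];
    }
    if (isset($parts['query'])) {
        $out .= '?' . $parts['query'];
    }
    if (isset($parts['fragment'])) {
        $out .= '#' . $parts['fragment'];
    }
    return $out;
}

echo strip_scheme_parsed('https://techcrunch.com/startups/'), "\n"; // techcrunch.com/startups/
```

Because the function only rewrites the string when parse_url() reports both a scheme and a host, input without a leading scheme is handed back unchanged.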

5 Answers

1

Try this out:

$url = 'http://techcrunch.com/startups/';
$url = str_replace(array('http://', 'https://'), '', $url);

EDIT:

Or, a simple way to remove the protocol from a URL that starts with one:

$url = 'https://www.google.com/';
$url = preg_replace('@^.+?\:\/\/@', '', $url);
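
One caveat with the lazy .+? pattern: it stops at the first :// anywhere in the string, so a scheme-less URL whose query string contains another URL loses its host. A possible tightening, sketched here under the assumption that a scheme looks like the one described in RFC 3986 (a letter followed by letters, digits, "+", "-" or "."), is to allow only those characters before the ://. The helper name strip_scheme is hypothetical.

```php
<?php
// Hypothetical helper: only scheme characters may appear before "://".
function strip_scheme(string $url): string
{
    // "~" is used as the delimiter so the slashes need no escaping;
    // the "i" flag also accepts upper-case schemes such as "HTTP://".
    return preg_replace('~^[a-z][a-z0-9+.-]*://~i', '', $url);
}

echo strip_scheme('https://www.google.com/'), "\n";
// www.google.com/
echo strip_scheme('techcrunch.com/startups/?redirect=https://test.com'), "\n";
// techcrunch.com/startups/?redirect=https://test.com
```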
NeoNexus DeMortis
  • 3
    This isn't answering the question. The user said: Is there a way I can just remove everything before the domain name, no matter what it is? – Kirk Backus Dec 31 '13 at 22:34
  • This answers the problem of how to remove http and https, and gives a good example of how to quickly add more protocols. – NeoNexus DeMortis Dec 31 '13 at 22:35
  • 2
    @kirk while those are his words, the intent of the user seems to be just to remove http and https. – JAL Dec 31 '13 at 22:38
  • @JAL: Thank you, at least you saw the same thing... – NeoNexus DeMortis Dec 31 '13 at 22:43
  • @NeoNexusDeMortis I very much do see that, but being a part of this community we want to provide complete answers. Not just whatever comes to mind, and your second answer does not work on `techcrunch.com/startups/?redirect=https://test.com`. It just returns `test.com` – Kirk Backus Dec 31 '13 at 23:00
  • @KirkBackus: while I agree with what you say about complete answers, I am not going to recite the PHP manual to explain my answer. I've noticed most times all it takes is showing someone the way to correct the problem and they realize the part they missed. And yes, you are correct about my edit, I forgot a character. Additionally, I originally answered his question the way I interpreted it. Simply claiming that I didn't answer his question because you interpreted it differently, then giving me a negative vote is just poor form. – NeoNexus DeMortis Dec 31 '13 at 23:06
  • He did not give you a downvote, I did. And if I recall correctly, I issued it based on your first submission which was using string replacement. Just wanted to let you know that since you assumed it was him. – Anil Dec 31 '13 at 23:50
0

Use an anchored pattern in preg_replace to remove the scheme and the :// that follows it.

$url = preg_replace('(^[a-z]+:\/\/)', '', $url);

This only replaces a scheme found at the beginning of the string and ignores any that appear later. Note that [a-z]+ matches lower-case letters only.

josephtikva1
0

Something like this ought to do:

$url = preg_replace("|^.+?://|", "", $url);

Removes everything up to and including the first :// in the string.

J David Smith
0
$url = preg_replace('/^[^:\/?]+:\/\//', '', $url);

Some results:

input: http://php.net/preg_replace
output: php.net/preg_replace

input: https://www.php.net/preg_replace
output: www.php.net/preg_replace

input: ftp://www.php.net/preg_replace
output: www.php.net/preg_replace

input: https://php.net/preg_replace?url=http://whatever.com
output: php.net/preg_replace?url=http://whatever.com

input: php.net/preg_replace?url=http://whatever.com
output: php.net/preg_replace?url=http://whatever.com

input: php.net?site=http://whatever.com
output: php.net?site=http://whatever.com
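
The results above can be re-checked with a small loop. This is only a sketch: the pattern and the input/output pairs are copied from this answer, while the variable names are made up for illustration.

```php
<?php
// Pattern copied from the answer; $cases maps each sample input to its listed output.
$pattern = '/^[^:\/?]+:\/\//';
$cases = [
    'http://php.net/preg_replace'                          => 'php.net/preg_replace',
    'https://www.php.net/preg_replace'                     => 'www.php.net/preg_replace',
    'ftp://www.php.net/preg_replace'                       => 'www.php.net/preg_replace',
    'https://php.net/preg_replace?url=http://whatever.com' => 'php.net/preg_replace?url=http://whatever.com',
    'php.net/preg_replace?url=http://whatever.com'         => 'php.net/preg_replace?url=http://whatever.com',
    'php.net?site=http://whatever.com'                     => 'php.net?site=http://whatever.com',
];

foreach ($cases as $input => $expected) {
    $actual = preg_replace($pattern, '', $input);
    // Print each result with a marker showing whether it matches the listed output.
    printf("%s => %s [%s]\n", $input, $actual, $actual === $expected ? 'ok' : 'MISMATCH');
}
```

The character class [^:\/?] is what keeps the match inside the scheme: it cannot cross a slash or a question mark, so a :// that appears in the path or query string is never reached.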
KorreyD
-1
$new_website = substr($str, ($pos = strrpos($str, '//')) !== false ? $pos + 2 : 0);

This removes everything up to and including the last '//' in the string. If there is no '//', the string is returned unchanged.

EDIT

This one is tested. It uses strrpos() instead of strpos().
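
For comparison, a plain string-function version that only strips a leading scheme might look like the sketch below. It is not part of the original answer: the helper name strip_scheme_plain is hypothetical, and it assumes that extra slashes after the scheme (as in "http:///www.amazon.com") are typos that should be dropped too.

```php
<?php
// Hypothetical helper: strip a leading scheme without using a regex.
function strip_scheme_plain(string $url): string
{
    // Look at the FIRST "://" only; a later one belongs to the path or query.
    $pos = strpos($url, '://');
    if ($pos === false || $pos === 0) {
        return $url;
    }

    // Everything before "://" must look like a scheme: a letter followed
    // by letters, digits, "+", "-" or ".". Otherwise leave the input alone.
    $scheme  = substr($url, 0, $pos);
    $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-.';
    if (!ctype_alpha($scheme[0]) || strspn($scheme, $allowed) !== strlen($scheme)) {
        return $url;
    }

    // Assumption: surplus slashes right after the scheme are user typos.
    return ltrim(substr($url, $pos + 3), '/');
}

echo strip_scheme_plain('http://techcrunch.com/startups/'), "\n";
// techcrunch.com/startups/
echo strip_scheme_plain('http:///www.xyz.com/category//page?input=http://zing.com'), "\n";
// www.xyz.com/category//page?input=http://zing.com
```

The ltrim() step is a design choice rather than a requirement: it would also turn "file:///etc/hosts" into "etc/hosts", so it should be removed if such URLs matter.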

Eisa Adil
  • 1
    String replacement in this fashion is not a good idea. You are assuming that all users will *always* enter URLs into the input correctly. It will only take one user entering a URL like "http:///www.amazon.com" to break this code. Always assume users have entered invalid input, and write the most robust code possible. – Anil Dec 31 '13 at 22:37
  • 1
    What if user enters url like: `techcrunch.com/startups/?redirect=https://test.com` ? – Glavić Dec 31 '13 at 22:41
  • 1
    Thanks @Glavić, another great example. – Anil Dec 31 '13 at 22:42
  • @SlyRaskal I solved your problem, now to think about Glavić's. – Eisa Adil Dec 31 '13 at 22:44
  • You are greatly missing the point! Please try to make your code work with this sample URL: http:///www.xyz.com/category//page?input=http://zing.com – Anil Dec 31 '13 at 22:46
  • Oh and the poster does not want the '//' in the outputted URL. Trust us, you aren't going to solve this problem with string replacement using those functions unless you implement some looping and conditional checks. Regex is the way to go. – Anil Dec 31 '13 at 22:47
  • 2
    Also where did you pull that 80% number from? Seriously, it's nothing but made up. – kittycat Dec 31 '13 at 22:51
  • Stop picking on me dude. – Eisa Adil Dec 31 '13 at 22:52
  • 3
    I'm not picking on you. I'm critiquing your code. That's what this site is all about. If you do not wish to listen, and think about the guidance of more experienced programmers, you're in for a world of hurt in the programming sector. – Anil Dec 31 '13 at 22:53