PHP remove weird backslashes from url

Question

I'm working on a small scraper for fun, and when I grab image URLs from certain sites they come back looking really weird.

For example:

scraped url:

https:\/\/cdn1.vox-cdn.com\/thumbor\/zN9XawbQJgFPkuAcA2JEGgqApm8=\/cdn0.vox-cdn.com\/uploads\/chorus_asset\/file\/3700712\/tomorrowland54fdf04f23efb_2040.0.jpg

desired url:

https://cdn1.vox-cdn.com/thumbor/zN9XawbQJgFPkuAcA2JEGgqApm8=/cdn0.vox-cdn.com/uploads/chorus_asset/file/3700712/tomorrowland54fdf04f23efb_2040.0.jpg

It's adding unnecessary backslashes, so the URL doesn't work when you follow it; it gives an error.

I tried the stripslashes function, since that seems to be its purpose, but it didn't work. The URL just stayed the same.

(edit) Here's the code I'm using to grab URLs:

function GetImages($page_dom) {
    $found_links = [];

    // Collect the src attribute of every <img> element in the document.
    $images = $page_dom->getElementsByTagName('img');
    foreach ($images as $image) {
        $img_src = $image->getAttribute('src');
        $found_links[] = $img_src;
    }

    return $found_links;
}

It sounds like you're picking up URLs that are in JSON strings. — Barmar, May 19 '15 at 00:49
@Barmar I'm not getting the URLs from JSON, but I am outputting the data as JSON. Is that the issue? — Scott, May 19 '15 at 00:56
Yes, `json_encode` escapes slashes by default. You can disable it with the `JSON_UNESCAPED_SLASHES` option. See http://stackoverflow.com/questions/1580647/json-why-are-forward-slashes-escaped — Barmar, May 19 '15 at 00:58
But it shouldn't be a problem -- the slashes will be removed when you decode the JSON. — Barmar, May 19 '15 at 00:59
@AramilRey Why would the OP try that if it's not working ? you're escaping a single quote `'`. — Pedro Lobito, May 19 '15 at 01:06

score 10 · Accepted Answer · answered May 19 '15 at 01:10

When you call json_encode, use the JSON_UNESCAPED_SLASHES option to prevent it from escaping slashes.
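As a minimal sketch of the difference, the snippet below encodes the same array with and without the flag. The URL and the variable names (`$links`, `$escaped`, `$unescaped`) are placeholders chosen for illustration, not taken from the question's code.

```php
<?php
// Placeholder URL standing in for a scraped image link (assumption).
$links = ['https://example.com/uploads/image.jpg'];

// Default behaviour: json_encode writes every "/" as "\/".
$escaped = json_encode($links);

// With JSON_UNESCAPED_SLASHES the slashes are left as they are.
$unescaped = json_encode($links, JSON_UNESCAPED_SLASHES);

echo $escaped, PHP_EOL;   // ["https:\/\/example.com\/uploads\/image.jpg"]
echo $unescaped, PHP_EOL; // ["https://example.com/uploads/image.jpg"]
```

Both outputs are valid JSON for the same data; the flag only changes how the text looks.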

But this shouldn't really be necessary. If you're outputting JSON, you should be sending it to a program that parses JSON, and the JSON parser will translate \/ to /.
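To illustrate the round trip, here is a small sketch that decodes JSON text containing escaped slashes. The JSON string and the names `$json` and `$decoded` are assumptions made up for the example; it only relies on PHP's built-in `json_decode`.

```php
<?php
// JSON text as a consumer might receive it, with "\/" escapes.
// The URL is a placeholder, not one from the question.
$json = '["https:\/\/example.com\/uploads\/image.jpg"]';

// json_decode treats "\/" as an escape sequence for a plain "/".
$decoded = json_decode($json, true);

echo $decoded[0], PHP_EOL; // https://example.com/uploads/image.jpg
```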

score 0 · Answer 2 · answered May 19 '15 at 00:53

If this is the only pattern you expect, you can use str_replace('\/', '/', $url). To cover more patterns, you can also use str_replace(array('\/', '\\\\'), array('/', '\\'), $url), which additionally turns a doubled backslash into a single one.
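A short sketch of the single-pattern replacement, assuming the string really does contain literal backslash-slash pairs (for example, raw JSON text that was never decoded). The URL and the variable names `$url` and `$clean` are placeholders.

```php
<?php
// Raw string with literal "\/" sequences (placeholder URL, assumption).
$url = 'https:\/\/example.com\/uploads\/image.jpg';

// In a single-quoted PHP string, '\/' is a backslash followed by a slash.
$clean = str_replace('\/', '/', $url);

echo $clean, PHP_EOL; // https://example.com/uploads/image.jpg
```

If the string is actually JSON, decoding it with a JSON parser is likely the more robust choice, since it handles every escape sequence rather than just this one.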

device_exec