3

I couldn't expand a double-shortened URL using the function below, which I got from here:

function doShortURLDecode($url) {
    $ch = @curl_init($url);
    @curl_setopt($ch, CURLOPT_HEADER, TRUE);
    @curl_setopt($ch, CURLOPT_NOBODY, TRUE);
    @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);
    @curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $response = @curl_exec($ch);
    preg_match('/Location: (.*)\n/', $response, $a);
    if (!isset($a[1])) return $url;
    return $a[1];
}

I ran into trouble when the expanded URL turned out to be another shortened URL, which in turn has its own expanded URL.

How do I get the final expanded URL after it has passed through both URL shortening services?

kittycat
Chetana Kestikar
  • http://stackoverflow.com/questions/4911271/getting-final-urls-of-shortened-urls-like-bit-ly-using-php – Johnny X. Lemonade Jan 31 '13 at 08:41
  • 2
    @HonzaM. those won't work as `t.co` uses HTML redirection through JS or a META tag, not HTTP headers. – kittycat Jan 31 '13 at 08:44
  • works fine nowadays, just lowercase the location in your preg_match: `preg_match('/location: (.*)\n/', $response, $a);` and `trim()` the result. At least you get the bit.ly url, not the final result. But with the bit.ly domain you can simply use the bit.ly API. – larrydahooster Jun 05 '14 at 09:02

4 Answers

1

Since t.co uses HTML redirection through JavaScript and/or a <meta> redirect, we need to grab its contents first, then extract the bit.ly URL from it and perform an HTTP header request to get the final location. This method does not rely on cURL being enabled on the server and uses only native PHP 5 functions:

Tested and working!

function large_url($url)
{
    $data = file_get_contents($url); // t.co uses HTML redirection
    $start = ($data === false) ? false : strstr($data, 'http://bit.ly/');
    if ($start === false) return $url; // no bit.ly link found in the page
    $url = strtok($start, '"'); // grab bit.ly URL

    stream_context_set_default(array('http' => array('method' => 'HEAD')));
    $headers = get_headers($url, 1); // get HTTP headers

    if (!isset($headers['Location'])) return $url; // return bit.ly URL instead
    $location = $headers['Location'];
    // Location is an array when several redirects were followed; take the last hop
    return is_array($location) ? end($location) : $location;
}

// DEMO
$url = 'http://t.co/dd4b3kOz';
echo large_url($url);
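The `strstr()` call above only finds `http://bit.ly/` links. As a rough sketch of a more general approach, the target of a `<meta http-equiv="refresh">` tag can be pulled out with a regular expression. The helper name `extract_meta_refresh_url` and the sample HTML are made up for illustration, and the pattern assumes `http-equiv` comes before `content` and that the URL inside `content` is not wrapped in its own quotes:

```php
<?php
// Hypothetical helper: pull the target URL out of a <meta http-equiv="refresh"> tag.
function extract_meta_refresh_url($html)
{
    // Matches e.g. <meta http-equiv="refresh" content="0;URL=http://example.com/">
    $pattern = '/<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)/i';
    if (preg_match($pattern, $html, $m)) {
        return html_entity_decode($m[1]); // undo &amp; and similar entities
    }
    return null; // no meta refresh found
}

// Made-up sample markup, for illustration only
$html = '<noscript><META http-equiv="refresh" content="0;URL=http://bit.ly/IRnYVz"></noscript>';
echo extract_meta_refresh_url($html), "\n"; // prints http://bit.ly/IRnYVz
```

The returned URL could then be passed to `get_headers()` in the same way as the bit.ly URL above.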
kittycat
  • Thanks for the help... but I am not getting the final URL yet. I tried `$url=http://t.co/dd4b3kOz;echo large_url($url);` and got `http://bit.ly/IRnYVz` as the output. But if I open this URL in a browser, it sends me to `http://changeordie.therepublik.net/?p=371#proliferation`... I want to get that last URL... – Chetana Kestikar Jan 31 '13 at 08:06
  • @ChetanaKestikar So you are trying to expand a URL that is shortened with `bit.ly` and then further shortened with `Twitter`? TinyURL is not being used at all. Ok, before I modify the code please tell me if you are sure these are the only two URL shortening services applied to the URL. – kittycat Jan 31 '13 at 08:10
  • You can't do this for `t.co` domains as it uses HTML to perform a redirect `` You would need to parse the contents of the page first and then go from there. Give me a few mins. – kittycat Jan 31 '13 at 08:18
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/23697/discussion-between-chetana-kestikar-and-cryptic-) – Chetana Kestikar Jan 31 '13 at 08:36
1

I finally found a way to get the final URL of a double-shortened URL: use the LongURL API.

I am not sure if it is the correct way, but I am at last getting the final URL I need :)

Here's what I did:

<?php
 function TextAfterTag($input, $tag)
 {
        $result = '';
        $tagPos = strpos($input, $tag);

        if (!($tagPos === false))
        {
                $length = strlen($input);
                $substrLength = $length - $tagPos + 1;
                $result = substr($input, $tagPos + 1, $substrLength); 
        }

        return trim($result);
 }

 function expandUrlLongApi($url)
 {
        $format = 'json';
        // urlencode() keeps characters such as & and # in $url from breaking the query string
        $api_query = "http://api.longurl.org/v2/expand?" .
                    "url=" . urlencode($url) . "&response-code=1&format={$format}";
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $api_query);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
        curl_setopt($ch, CURLOPT_HEADER, false);
        $fileContents = curl_exec($ch);
        curl_close($ch);
        // crude JSON parsing: strip the braces, then take the first "key":"value" pair
        $s1 = str_replace("{", " ", $fileContents);
        $s2 = trim(str_replace("}", " ", $s1));
        $s3 = explode(",", $s2);
        $s4 = TextAfterTag($s3[0], ':'); // everything after the first colon
        return stripslashes($s4); // turn "\/" back into "/"
 }
 echo expandUrlLongApi('http://t.co/dd4b3kOz');
?>

The output I get is:

"http://changeordie.therepublik.net/?p=371#proliferation"

The above code works.

The code that @cryptic shared is also correct, but I could not get the result on my server (maybe because of some configuration issue).

If anyone thinks it could be done some other way, please feel free to share it.
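One possible alternative to splitting the response by hand is `json_decode()`. The sketch below assumes the API returns a JSON object whose expanded URL sits under a `long-url` key. That key name, the helper name `long_url_from_json`, and the sample response are assumptions made for illustration, not something confirmed by the output above:

```php
<?php
// Hypothetical helper: read the expanded URL from a JSON response body.
// The "long-url" key is an assumption about the response format.
function long_url_from_json($json, $key = 'long-url')
{
    $data = json_decode($json, true); // decode into an associative array
    if (!is_array($data) || !isset($data[$key])) {
        return null; // malformed JSON or key missing
    }
    return $data[$key]; // json_decode() already turns "\/" into "/"
}

// Made-up sample response, for illustration only
$sample = '{"long-url":"http:\/\/changeordie.therepublik.net\/?p=371#proliferation","response-code":"200"}';
echo long_url_from_json($sample), "\n";
```

Unlike the string-splitting version, this returns the URL without surrounding quotes and is not thrown off by commas or colons inside the URL.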

Chetana Kestikar
0

Perhaps you should just use CURLOPT_FOLLOWLOCATION = true and then determine the final URL you were directed to.
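A minimal sketch of that idea, assuming the cURL extension is available. The function name `final_url` and the timeout values are illustrative choices. Note that this only follows HTTP `Location` redirects, so it would not get past a page that redirects via JavaScript or a `<meta>` tag:

```php
<?php
// Hypothetical helper: let cURL follow HTTP redirects and report where it ended up.
function final_url($url, $maxRedirects = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);             // HEAD-style request, skip the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects); // guard against redirect loops
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // do not echo the response
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $ok = curl_exec($ch);
    // CURLINFO_EFFECTIVE_URL is the last URL cURL actually requested
    $effective = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    // Fall back to the input URL if the request failed
    return ($ok !== false && $effective) ? $effective : $url;
}

echo final_url('http://bit.ly/IRnYVz'), "\n";
```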

Mike Brant
0

If the redirect is a plain HTTP redirect rather than a JavaScript redirect (as with t.co) or a <META http-equiv="refresh"... tag, the following resolves Stack Exchange short URLs such as https://stackoverflow.com/q/62317 fine:

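public function doShortURLDecode($url) {
    $ch = @curl_init($url);
    @curl_setopt($ch, CURLOPT_HEADER, TRUE);
    @curl_setopt($ch, CURLOPT_NOBODY, TRUE);
    @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);
    @curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    $response = @curl_exec($ch);
    // strip special characters (including \r) from the raw headers
    $cleanresponse= preg_replace('/[^A-Za-z0-9\- _,.:\n\/]/', '', $response);
    // header names are case-insensitive, hence the /i flag
    preg_match('/Location: (.*)[\n\r]/i', $cleanresponse, $a);
    if (!isset($a[1])) return $url;
    // an absolute Location is returned as-is; a relative one gets scheme and host prepended
    if (preg_match('/^https?:\/\//i', $a[1])) return $a[1];
    return parse_url($url, PHP_URL_SCHEME).'://'.parse_url($url, PHP_URL_HOST).$a[1];
}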

It strips any special characters that can occur in the cURL output from the response before cutting out the resulting URL (I ran into this problem on a PHP 7.3 server).
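To handle a chain of shorteners, the same header lookup can be repeated until no `Location` header comes back. The sketch below is one way to structure that, not part of the function above. The names `parse_location`, `absolutize`, and `expand_url` are hypothetical, only root-relative paths such as `/questions/1` are resolved, and the demo uses made-up canned headers in place of real network calls:

```php
<?php
// Return the Location header value from a raw header block, or null if absent.
function parse_location($rawHeaders)
{
    // m: ^ and $ match per line; i: header names are case-insensitive
    if (preg_match('/^Location:[ \t]*(.*)$/mi', $rawHeaders, $m)) {
        return trim($m[1]); // trim() drops the trailing "\r"
    }
    return null;
}

// Turn a root-relative Location such as "/questions/1" into an absolute URL.
function absolutize($location, $base)
{
    if (parse_url($location, PHP_URL_SCHEME) !== null) {
        return $location; // already absolute
    }
    return parse_url($base, PHP_URL_SCHEME) . '://' . parse_url($base, PHP_URL_HOST) . $location;
}

// Follow redirects hop by hop. $fetchHeaders is any callable that takes a URL
// and returns the raw response headers as a string.
function expand_url($url, callable $fetchHeaders, $maxHops = 10)
{
    for ($i = 0; $i < $maxHops; $i++) {
        $location = parse_location($fetchHeaders($url));
        if ($location === null || $location === '') {
            break; // no further redirect
        }
        $url = absolutize($location, $url);
    }
    return $url;
}

// Demo with canned headers instead of real network calls (made-up responses)
$canned = array(
    'http://t.co/dd4b3kOz' => "HTTP/1.1 301 Moved Permanently\r\nlocation: http://bit.ly/IRnYVz\r\n\r\n",
    'http://bit.ly/IRnYVz' => "HTTP/1.1 301 Moved Permanently\r\nLocation: http://changeordie.therepublik.net/?p=371\r\n\r\n",
);
$fake = function ($u) use ($canned) {
    return isset($canned[$u]) ? $canned[$u] : "HTTP/1.1 200 OK\r\n\r\n";
};
echo expand_url('http://t.co/dd4b3kOz', $fake), "\n"; // prints http://changeordie.therepublik.net/?p=371
```

In real use, `$fetchHeaders` could wrap the cURL calls from the function above and return `$response` unchanged. Passing it in as a callable keeps the loop testable without network access.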

rubo77