THE CONTEXT

I have a piece of (jQuery) ajax code that worked happily for about 9 months, until the last couple of weeks or so.

This code uses Instagram's embedding endpoint, which allows me to get the media source (image or video) out of a normal Instagram link like http://instagram.com/p/BUG/ regardless of the user and without needing an access_token.

Simplified example :

var URL = "http://api.instagram.com/oembed?url=http://instagram.com/p/BUG/";
$(document).ready(function () {
    $.ajax({
        url: URL,
        dataType: "jsonp",
        cache: false,
        success: function (response) {
            console.log(response.url);
        },
        error: function () {
            console.log("couldn't process the instagram url");
        }
    });
});

In the code above, response.url would return the full media URL source like :

http://photos-a.ak.instagram.com/xxxx/1234_123456123_123456_n.jpg // image or
http://distilleryvesper3-15.ak.instagram.com/b0c957463548362858_101.mp4 // video

Then I could use the returned URL to embed the media file in my webpage.

NOTE :

Since the idea is to get the URL source of any Instagram link regardless of the user, using media endpoints is not an option.


THE ISSUE

Instagram's oembed endpoint allows you to GET a json response, which until the last couple of weeks had this structure :

{
    "provider_url" : "http:\/\/instagram.com\/",
    "media_id" : "123456789_123456789",
    "title" : "the title",
    "url" : "http:\/\/photos-a.ak.instagram.com\/hphotos-ak-xfp1\/12345678_123456789012345_1234567890_n.jpg",
    "author_name" : "{the user name}",
    "height" : 640,
    "width" : 640,
    "version" : "1.0",
    "author_url" : "http:\/\/instagram.com\/{the user name}",
    "author_id" : 123456789,
    "type" : "photo",
    "provider_name" : "Instagram"
}

As you may have noticed, my ajax code was particularly interested in the property named url, which contains the full media URL.

Notice that this json response (as of today) is still valid for Instagram images. However, if the original Instagram link is a video (let's use a real example: http://instagram.com/p/mOFsFhAp4f/, CocaCola(c)), the json response doesn't return any url key anymore.

It seems that after the introduction of web embeds, Instagram decided to replace the url key with an html property in their (oembed) json response for videos only. It contains the iframe to embed, like :

{
    ...

    "html" : "\u003ciframe src=\"http:\/\/instagram.com\/p\/BUG\/embed\" width=\"616\" height=\"716\" frameborder=\"0\" scrolling=\"no\" allowtransparency=\"true\"\u003e\u003c\/iframe\u003e",

    ...
}

... and of course, that breaks my code since response.url is undefined.
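
To make the difference concrete, here is a minimal sketch of a handler that accepts either response shape. The function name mediaFromOembed and the returned kind/src fields are made up for illustration, and the regex assumes the iframe markup shown above. Note that for videos it can only recover the embed page URL, not the mp4 file, which is exactly the problem:

```javascript
// Hypothetical helper: normalizes an oembed response into { kind, src }.
// "direct" means a media file URL, "embed" means only an iframe page URL.
function mediaFromOembed(response) {
    if (response && response.url) {
        return { kind: "direct", src: response.url };
    }
    if (response && response.html) {
        // pull the src attribute out of the iframe markup
        var match = /<iframe[^>]*\ssrc="([^"]+)"/i.exec(response.html);
        if (match) {
            return { kind: "embed", src: match[1] };
        }
    }
    return null; // neither a url key nor a usable iframe
}
```

With the old photo response this yields the image URL; with the new video response it yields http://instagram.com/p/BUG/embed.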


THE QUESTION

How do I get the full video's URL after the changes in the Instagram json response?

Unfortunately I couldn't find any proper documentation or a change log in Instagram's developers site (they have a great API but poor documentation.)

Please note that this question is about the Instagram API (v1) embedding endpoints rather than jQuery or ajax.

I am looking for a (perhaps undocumented) Instagram API option, endpoint, oembed parameter or anything else (that doesn't require an access_token) that allows me to retrieve the direct link to the video (preferably from a json response) out of a normal Instagram link regardless of the user. I am also willing to consider a not too hacky workaround.

JFK
  • As I see `BUG` key is a `shortcode` key. have you changed it? – mortymacs Jul 06 '14 at 04:02
  • @MortezaN.Alamdari : please check [Instagram documentation](http://instagram.com/developer/embedding/#) if you know what I mean. `BUG` is the shortcode of the media's ID (that can be any like `mOFsFhAp4f` is in http://instagram.com/p/mOFsFhAp4f/ ) – JFK Jul 06 '14 at 04:04
  • Why don't you check the type of media and then decide whether you need `response.url` or `response.html` ? – Jashwant Jul 06 '14 at 05:58
  • @Jashwant : sure thing, but `response.html` is an `iframe`. What I am looking for should look like `http://distilleryvesper3-15.ak.instagram.com/b0c957463548362858_101.mp4`. Can you get that out of the `iframe`? (without Firebug of course but from an app ;) If so, please post your answer. – JFK Jul 06 '14 at 06:06
  • You do not want to use `embed` link and want a url with `mp4` link ? [This code](http://jsfiddle.net/xAgPS/) will not work for you. Right ? – Jashwant Jul 06 '14 at 06:17
  • @Jashwant : at this point, I am not interested in using iframes (I may end up using them if I have no choice) but I would rather prefer to get the full URL of the mp4 file (so I can use my preferred player like MEJS). An additional issue with embed is: how do you `autoplay`? http://jsfiddle.net/SDRXV/ – JFK Jul 06 '14 at 06:26
  • @JFK hope my answer is not too hacky for you , i believe it works well for any provided link. – ProllyGeek Jul 06 '14 at 08:11

3 Answers

6

This may not be the best or optimal answer, but I believe it will solve your issue for now, so you may consider it a workaround:

Thanks to the whateverorigin.org service we are able to fetch cross-origin json, which has all the data you may request. All you have to do is convert the returned object to a string, then use regex to fetch whatever data you need.

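var myvideourl = "http://instagram.com/p/mOFsFhAp4f/";
$.ajaxSetup({
    scriptCharset: "utf-8", // maybe "ISO-8859-1"
    contentType: "application/json; charset=utf-8"
});

// whateverorigin.org returns the fetched page source as a string in data.contents
$.getJSON('http://whateverorigin.org/get?url=' +
    encodeURIComponent(myvideourl) + '&callback=?',
    function (data) {
        var html = data.contents;
        // locate the <meta property="og:video" ...> tag in the page source
        var start = html.indexOf('<meta property="og:video" content=');
        if (start === -1) {
            console.log("no og:video meta tag found");
            return;
        }
        var end = html.indexOf('/>', start);
        var metaTag = html.slice(start, end + 2);
        // parse the tag into a DOM node and read its content attribute
        var metaobject = $.parseHTML(metaTag);
        alert(metaobject[0].content);
        console.log(metaobject[0].content);
    });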
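
The extraction step can also be isolated as a pure function, which makes it easier to try against saved page source. This is only a sketch: the name extractOgVideo is hypothetical, and the regex assumes the property attribute comes before content, as in the string searched for above.

```javascript
// Hypothetical helper: returns the content of the og:video meta tag
// found in a page's HTML source, or null when there is none.
function extractOgVideo(html) {
    var pattern = /<meta[^>]+property=["']og:video["'][^>]*content=["']([^"']+)["']/i;
    var match = pattern.exec(html);
    return match ? match[1] : null;
}
```

Inside the getJSON callback it would be called as extractOgVideo(data.contents).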

Here is an example:

JS Fiddle Demo

It works well for me, but I only tried it on the CocaCola video; I haven't tried it on other links.

Jashwant
ProllyGeek
  • Beautiful. Ironically I have used `whateverorigin.org` for other solutions, including this answer http://stackoverflow.com/a/24559815/1055987 but it didn't cross my mind for this specific scenario. I am reluctant to use third-party services since they may stop working any time. The good thing is that `whateverorigin` is open source so I could host the service myself for my client's needs. Kudos for your answer, I just need to make some considerations and wait for other possible answers before granting any bounty ;) – JFK Jul 06 '14 at 08:35
  • ok no problem , just hope i could help , good luck finding the best answer :) – ProllyGeek Jul 06 '14 at 08:58
  • an alternative that may or may not be less likely to go down is `https://cors-anywhere.herokuapp.com/` – dw1 Jun 02 '20 at 06:07
4

UPDATE [March 2015] : For an extended and updated version of this solution, please visit http://www.picssel.com/build-a-simple-instagram-api-case-study/


@ProllyGeek's answer provided a good workaround to scrape the Instagram video page (well deserved bounty), however it relies on the whateverorigin.org third-party service, which will work fine unless the service eventually becomes unavailable.

Since the latter already happened to me in a production environment, I had to look for a more reliable alternative, so I decided to use PHP's file_get_contents to scrape the video link from a self-hosted PHP module.

I basically followed the same logic proposed by @ProllyGeek, but translated to PHP:

The getVideoLink.php module :

<?php
header('Content-Type: text/html; charset=utf-8');
// sanitize a string: trim, unescape, strip tags, encode HTML entities
function clean_input($data){
    $data = trim($data);
    $data = stripslashes($data);
    $data = strip_tags($data);
    $data = htmlspecialchars($data);
    return $data;
}
$instalink = isset($_GET['instalink']) ? clean_input( $_GET['instalink'] ) : '';
if (!empty($instalink)) {
    // double quotes in the page become &quot; after htmlspecialchars()
    $response = clean_input( @ file_get_contents( $instalink ) );
    $start_marker = 'video_url&quot;:&quot;';
    $start_position = strpos( $response, $start_marker ); // the start position
    $end_position = strpos( $response, '&quot;,&quot;usertags' ); // the end position
    if ($start_position !== false && $end_position !== false) {
        $start_position += strlen($start_marker); // skip past the marker itself
        $mp4_link = substr( $response, $start_position, $end_position - $start_position );
        echo $mp4_link;
    }
}
?>

Of course, you may need to analyze the response manually to know what you are looking for.
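
When analyzing a saved copy of the response, the same start/end marker logic can be tried quickly in JavaScript. This is a sketch only: sliceBetween is a hypothetical name, and the markers in the usage note are the unescaped equivalents of the ones used in the PHP module.

```javascript
// Hypothetical helper: returns the text between startMarker and the
// next endMarker, or null when either marker is missing.
function sliceBetween(text, startMarker, endMarker) {
    var start = text.indexOf(startMarker);
    if (start === -1) {
        return null;
    }
    var from = start + startMarker.length; // first character after the start marker
    var end = text.indexOf(endMarker, from); // only look after the start marker
    if (end === -1) {
        return null;
    }
    return text.slice(from, end);
}
```

For example, sliceBetween(pageSource, '"video_url":"', '","usertags') would return the video link if the page still uses those keys in that order.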

Then the AJAX call to the PHP module from my main page :

var instaLink = "http://instagram.com/p/mOFsFhAp4f/"; // the Coca Cola video link
jQuery(document).ready(function ($) {
    $.ajax({
        url: "getVideoLink.php?instalink=" + encodeURIComponent(instaLink), // encode the link so it survives as a query parameter
        dataType : "html",
        cache : false,
        success : function (data) {
            console.log(data); // returns http://distilleryvesper3-15.ak.instagram.com/b0ce80e6b91111e3a16a122b8b9af17f_101.mp4
        },
        error : function () {
            console.log("error in ajax");
        }
    });
}); // ready 

This method assumes your host supports PHP.


EDIT [November 19, 2014]

I have modified the getVideoLink.php module (now getInstaLinkJSON.php) to actually get the JSON information from a specific Instagram media link like http://instagram.com/p/mOFsFhAp4f/

This is much more useful than just scraping the video's URL and can be used for images too.

The new getInstaLinkJSON.php code :

<?php
function clean_input($data){
    $data = trim($data);
    $data = strip_tags($data);
    return $data;
};
// clean user input
function clean_input_all($data){
    $data = trim($data);
    $data = stripslashes($data);
    $data = strip_tags($data);
    $data = htmlspecialchars($data);
    return $data;
};
$instaLink = clean_input_all( $_GET['instaLink'] );

if( !empty($instaLink) ){
    header('Content-Type: application/json; charset=utf-8');
    $response = clean_input( @ file_get_contents($instaLink) );
    $response_length = strlen($response);
    $start_position = strpos( $response ,'window._sharedData = ' ); // the start position
    $start_positionlength = strlen('window._sharedData = '); // string length to trim before
    $trimmed = trim( substr($response, ( $start_position + $start_positionlength ) ) ); // trim extra spaces and carriage returns
    $jsondata = substr( $trimmed, 0, -1); // remove extra ";" added at the end of the javascript variable 
    echo $jsondata;
} elseif( empty($instaLink) ) {
    die(); //only accepts instaLink as parameter
}
?>

I am sanitizing both the user's input and the file_get_contents() response; however, I am not stripping slashes or encoding HTML characters in the latter, since I will be returning a JSON response.

Then the AJAX call:

var instaLink = "http://instagram.com/p/mOFsFhAp4f/"; // demo
jQuery.ajax({
    url: "getInstaLinkJSON.php?instaLink=" + encodeURIComponent(instaLink), // name must match $_GET['instaLink'] (case-sensitive)
    dataType : "json", // important!!!
    cache : false,
    success : function ( response ) {
        console.log( response ); // returns json
        var media = response.entry_data.DesktopPPage[0].media;

        // get the video URL
        // media.is_video : returns true/false

        if( media.is_video ){
            console.log( media.video_url ); // returns http://distilleryvesper3-15.ak.instagram.com/b0ce80e6b91111e3a16a122b8b9af17f_101.mp4
        }
    },
    error : function () {
        console.log("error in ajax");
    }
});
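
Since the structure of this JSON is undocumented and may change, it can help to read the video URL defensively. A minimal sketch, assuming the entry_data.DesktopPPage[0].media path used above (getVideoUrl is a hypothetical name):

```javascript
// Hypothetical helper: walks the shared-data object step by step and
// returns the video URL, or null if any level is missing or the media
// is not a video.
function getVideoUrl(sharedData) {
    var pages = sharedData && sharedData.entry_data && sharedData.entry_data.DesktopPPage;
    var media = pages && pages[0] && pages[0].media;
    if (!media || !media.is_video) {
        return null;
    }
    return media.video_url || null;
}
```

In the success callback, getVideoUrl(response) could then replace the direct property access, so an unexpected response yields null instead of a TypeError.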

EDIT [May 20, 2020]

Currently working PHP:

<?php
header("Access-Control-Allow-Origin: *");
header("Access-Control-Allow-Headers: *");
function clean_input($data){
    $data = trim($data);
    $data = strip_tags($data);
    return $data;
};
// clean user input
function clean_input_all($data){
    $data = trim($data);
    $data = stripslashes($data);
    $data = strip_tags($data);
    $data = htmlspecialchars($data);
    return $data;
};
$instaLink = clean_input_all( $_GET['instaLink'] );

if( !empty($instaLink) ){
    header('Content-Type: application/json; charset=utf-8');
    $response = clean_input( @ file_get_contents($instaLink) );
    $response_length = strlen($response);
    $start_position = strpos( $response ,'window._sharedData = ' ); // the start position
    $start_positionlength = strlen('window._sharedData = '); // string length to trim before
    $trimmed = trim( substr($response, ( $start_position + $start_positionlength ) ) ); // trim extra spaces and carriage returns
    $jsondata = substr( $trimmed, 0, -1); // remove extra ";" added at the end of the javascript variable 
    $jsondata = explode('window.__initialDataLoaded', $jsondata);
    echo substr(trim($jsondata[0]), 0, -1);
} elseif( empty($instaLink) ) {
    die(); //only accepts instaLink as parameter
}
?>
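
The trimming above depends on exactly what follows the JSON in the page. One alternative, sketched here in JavaScript rather than PHP, is to start at the marker and scan for the matching closing brace. The function name extractJsonAfter is hypothetical, and the sketch assumes the value after the marker is an object literal that is also valid JSON.

```javascript
// Hypothetical helper: finds marker in text, then returns the first
// balanced {...} object after it, parsed as JSON (or null on failure).
function extractJsonAfter(text, marker) {
    var markerAt = text.indexOf(marker);
    if (markerAt === -1) {
        return null;
    }
    var start = text.indexOf("{", markerAt + marker.length);
    if (start === -1) {
        return null;
    }
    var depth = 0, inString = false, escaped = false;
    for (var i = start; i < text.length; i++) {
        var ch = text.charAt(i);
        if (inString) {
            // braces inside string values must not affect the depth
            if (escaped) { escaped = false; }
            else if (ch === "\\") { escaped = true; }
            else if (ch === '"') { inString = false; }
        } else if (ch === '"') {
            inString = true;
        } else if (ch === "{") {
            depth++;
        } else if (ch === "}") {
            depth--;
            if (depth === 0) {
                try {
                    return JSON.parse(text.slice(start, i + 1));
                } catch (e) {
                    return null; // balanced braces, but not valid JSON
                }
            }
        }
    }
    return null; // ran out of text before the object closed
}
```

Called as extractJsonAfter(pageSource, "window._sharedData = "), it ignores whatever trails the object, such as a semicolon or a later script statement.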
Mert Aksoy
JFK
  • im using your answer now for my own issue :D kudos to this solution ;) – ProllyGeek Nov 17 '14 at 21:34
  • @ProllyGeek : I have improved the code since then, I guess I will post an update any time soon ;) – JFK Nov 17 '14 at 21:54
  • please do it asap ty :) – ProllyGeek Nov 17 '14 at 21:55
  • @ProllyGeek : I did further adjustments in the trimming method – JFK Nov 19 '14 at 21:06
  • I'm not sure if they've changed it since then, but this code didn't work for me - even the updated version. The `$jsondata` variable returned a bunch of html `"); $trim2 = trim(substr($trim1,0,$pos2-1)); $jsondata = json_decode($trim2);` – Chud37 Jul 24 '17 at 09:39
  • @Chud37 please refer to http://www.picssel.com/build-a-simple-instagram-api-case-study/ which has updated information (haven't had time to update this answer) – JFK Jul 25 '17 at 01:58
0

I am not a jQuery expert. Putting aside the syntax error(s), could this be any use?

// var publicUrl = "http://instagram.com/p/dAu7UPgvn0"; // photo
var publicUrl = "http://instagram.com/p/mOFsFhAp4f"; // video


var URL = "http://api.instagram.com/oembed?url=" + publicUrl;

$(document).ready(function () {
    $.ajax({
        url: URL,
        publicurl: publicUrl,
        dataType: "jsonp",
        cache: false,
        success: function (response) {
            var mediaSrc;
            if (response.type === 'photo') {
                mediaSrc = response.url;
            } else {
                // intended as a scrape of the public page; as written, $() does not
                // fetch a remote page and a jQuery object has no .src property
                mediaSrc = $(publicurl).find('div.Video vStatesHide Frame').src;
            }
            console.log(mediaSrc);
        },
        error: function () {
            console.log("couldn't process the instagram url");
        }
    });
});
J A
  • This is similar to my answer :) `$(publicurl).find('div.Video vStatesHide Frame').src` is incorrect. Either you can use `[0].src` like me, or you can use `attr('src')` – Jashwant Jul 06 '14 at 06:51
  • @Jashwant I am sure it is ;). But unlike your method, I am not suggesting to rely on api returns. It is more like web scrapping since the op is quite desperate to get the actual source. – J A Jul 06 '14 at 06:55
  • You cannot scrape it like this. See Prolly's answer. – Jashwant Jul 07 '14 at 16:43
  • Instagram has removed type as `photo` or `video`, now it returns `rich` for type field – Rishabh Agrahari Aug 25 '17 at 16:29
  • http://www.picssel.com/build-a-simple-instagram-api-case-study/ provided link is not working – Mert Aksoy May 29 '20 at 17:50