
I'm attempting to copy all of the actual content from my Tumblr blog using a script I wrote on a different web page, but I'm having a bit of trouble with gaining access to the content. My ajax call is as follows:

$.ajax({
     url: "http://solacingsavant.tumblr.com/",
     dataType: 'jsonp',
     success: function(data) {
          var elements = $("<div>").html(data)[0].getElementsByTagName("ul")[0].getElementsByTagName("li");
          for(var i = 0; i < elements.length; i++) {
               var theText = elements[i].firstChild.nodeValue;
               alert(theText); // Alert if I got something
              // This is where I'll strip the data for the items I want
          }
     }
});

but as it is, the console gives me the error "Resource interpreted as Script but transferred with MIME type text/html". I looked into this and changed the corresponding meta tag in the HTML of my blog to <meta http-equiv="Content-Type" content="application/javascript; charset=utf-8" />, with no success.

I also tried using dataType: 'html' (which makes more sense to me), but then I got the console error "Origin is not allowed by Access-Control-Allow-Origin". I looked into that as well and added a meta tag to my Tumblr blog, <meta Access-Control-Allow-Origin="*" />, but again without success.

Here is a jsFiddle to work with

Does my approach not work because Tumblr as a whole does not allow changes to Access-Control? If so, how might I work around the issue? If not, what am I doing wrong?

MAJOR EDIT (based on mikedidthis's helpful comments)

It seems that I am not able to do this without the Tumblr API, so I obtained an API key and now have access to the JSON results that the API sends out. Using the API key, I am able to get a JSONP object logged to the console. My JavaScript at the moment:

$.ajax({
    url: "http://api.tumblr.com/v2/blog/solacingsavant.tumblr.com/info?api_key=APIkeyGoesHeRe",
    dataType: 'jsonp',
    success: function(results){
        console.log(results); 
        // Get data from posts here
    }
});

This SO post was helpful in understanding how I can change data on my Tumblr page from the source and find out basic information about the site, but not how to obtain actual data from individual posts. I tried looking through the results object and was unable to find any data related to posts, nor was I able to append the results to the jsFiddle. So my questions now are: "Can I copy data (say, the written text in a post) from individual posts using this approach? If so, how? If not, what other approach should I use?"

Zach Saucier
  • CORS. You will need to use the official api to grab your content: http://www.tumblr.com/docs/en/api/v2 – mikedidthis Oct 15 '13 at 08:32
  • You could use the old API which doesn't need auth, but it could be pulled at any time. – mikedidthis Oct 15 '13 at 12:11
  • So even if it's my personal site I can't allow it for outside pages to access it? – Zach Saucier Oct 15 '13 at 12:12
  • AFAIK no. As the Tumblr site isn't yours? You're wanting to add content from Tumblr to your own site. Opening access to your site isn't going to help here? – mikedidthis Oct 15 '13 at 12:54
  • What I meant by that is, "Even if I have access to the Tumblr's code (because I am the author) there is no way for me to allow a non-Tumblr site to access it?" – Zach Saucier Oct 15 '13 at 13:22
  • Correct. You may be the author of the theme code, but you want to get at Tumblr's data for your posts. Even though you wrote them, you still need to use the API to access them outside of Tumblr. I hope that helps. – mikedidthis Oct 15 '13 at 14:08
  • @mikedidthis Thanks for your insight! It's incredibly helpful and put me in the right direction. I have now updated my question with the current situation, feel free to look and see if you have any more feedback – Zach Saucier Oct 15 '13 at 16:25

1 Answer


A really quick answer

The Tumblr API documentation covers using the API well. However, to give you a little start, let's grab all your text posts.

First you need to query the API for any of your posts that are of the type text.

The documentation states (http://www.tumblr.com/docs/en/api/v2#posts) that we should use the following URL, specifying the type, which we will set to text:

api.tumblr.com/v2/blog/solacingsavant.tumblr.com/posts[/type]

And below is an example based on the OP fiddle.

$.ajax({
    url: "http://api.tumblr.com/v2/blog/solacingsavant.tumblr.com/posts/text?api_key=XXXXXXX",
    dataType: 'jsonp',
    success: function(data){
        // Declare posts locally so it does not leak into the global scope
        var posts = data.response.posts;
        $.each(posts, function(i) {
            // Log the title and body of each text post
            console.log( posts[i].title, posts[i].body );
        });
    }
});

So for each query of the API, we will receive back an object. You will need to filter this object to get the data you want from it.

In the context of post queries, you can get directly at your posts through the data.response.posts object.

To find out what data is available for each post type, the documentation has it covered: http://www.tumblr.com/docs/en/api/v2#text-posts

To get the content of each text post, you need to loop through the posts object and grab the values for the keys named title and body.

Example here: http://jsfiddle.net/ZpFwL/
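As a rough sketch of that loop without jQuery or a network call, the same extraction can be written as a plain function. The helper name extractTextPosts and the sample data are made up for illustration. The only thing taken from the answer is the response shape (data.response.posts with title and body keys).

```javascript
// Hypothetical helper: pulls title and body out of a /posts/text response.
// Assumes the shape used above: { response: { posts: [ { title, body } ] } }.
function extractTextPosts(data) {
    var posts = (data && data.response && data.response.posts) || [];
    return posts.map(function (post) {
        return {
            title: post.title || "", // fall back to "" when a post has no title
            body: post.body || ""
        };
    });
}

// Made-up sample data, shaped like the API response
var sample = {
    response: {
        posts: [
            { type: "text", title: "First post", body: "<p>Hello</p>" },
            { type: "text", title: null, body: "<p>No title</p>" }
        ]
    }
};

var extracted = extractTextPosts(sample);
console.log(extracted);
```

In the success callback you would pass data straight in, for example extractTextPosts(data), instead of logging inside $.each.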

Bonus Time

It is possible to get posts of all types by dropping the type from the URL:

http://api.tumblr.com/v2/blog/solacingsavant.tumblr.com/posts/?api_key=XXXXXXX

Remember, this is a really quick example and not for the real world.
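When the type is dropped, the posts array can mix post types, so each post needs to be handled according to its type. Below is a minimal sketch of one way to do that, assuming each post carries a type field and that quote and link posts use the field names shown. Those names are my reading of the documentation and should be checked against it. The extractors table and summarizePost are hypothetical names.

```javascript
// Hypothetical per-type extractors. Field names other than title and body
// are assumptions and should be verified against the API documentation.
var extractors = {
    text: function (post) {
        return { title: post.title, body: post.body };
    },
    quote: function (post) {
        return { text: post.text, source: post.source };
    },
    link: function (post) {
        return { title: post.title, url: post.url, description: post.description };
    }
};

function summarizePost(post) {
    var known = Object.prototype.hasOwnProperty.call(extractors, post.type);
    // Types without an extractor keep only their type, so nothing is silently dropped
    var fields = known ? extractors[post.type](post) : {};
    fields.type = post.type;
    return fields;
}

// Made-up sample data mixing several post types
var mixedPosts = [
    { type: "text", title: "Hello", body: "<p>World</p>" },
    { type: "quote", text: "To be or not to be", source: "Hamlet" },
    { type: "photo", caption: "A picture" }
];

var summaries = mixedPosts.map(summarizePost);
console.log(summaries);
```

Adding support for another post type then only means adding one more entry to the extractors table.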

mikedidthis
  • Good answer, I believe this is correct. After going through more of the documentation it seems that I have to make a different ajax call for each type of post which kind of stinks. Please let me know if this isn't the case! – Zach Saucier Oct 15 '13 at 18:00
  • Nope, you can fudge it by removing the type: api.tumblr.com/v2/blog/solacingsavant.tumblr.com/posts?api_key=xxx However, you will need to build a filter function for each post type. – mikedidthis Oct 15 '13 at 18:05
  • No problem. Out of interest, why not just use Tumblr as your main site? My question is based on not knowing what content you're taking from Tumblr to your own site. – mikedidthis Oct 15 '13 at 18:30
  • My end goal is to copy all of the content of any Tumblr blog to an offline resource when the owner inputs the URL, filtering out the unnecessary parts. I'm simply not familiar with the Tumblr API (obviously), so I figured I'd create it using a technique I know, then perhaps build a real API afterwards – Zach Saucier Oct 15 '13 at 18:42
  • Ahh, makes sense, something like this: http://boutofcontext.com/tumblr_backup.php It's also worth mentioning that each Tumblr has an RSS feed: http://solacingsavant.tumblr.com/rss which could be used for scraping? – mikedidthis Oct 15 '13 at 18:48
  • Sure, but hopefully better formatted and such. I hope to have two optional end products (chosen by selection using the same application) using it, one straight HTML like the one you gave and the other more like a book (perhaps in PDF form or something similar) with just the text and photos from each post which can be printed. And yes, I know of the RSS feed, I am working on the book version at the moment so I need to strip it further and in a more organized way – Zach Saucier Oct 15 '13 at 18:57