I'm trying to scrape the first paragraph from a Wikipedia article using only JavaScript. Basically, what I want to do is

document.getElementsByTagName("P")[0]

except that the paragraph is not on my own web page: I want to fetch a given page from Wikipedia and apply that functionality to it. My current code gives:

Uncaught TypeError: undefined is not a function

My code:

function getWikiDescription(searchTerm)
{
    var theURL = "http://en.wikipedia.org/wiki/" + searchTerm.replace(" ", "_");
    var article = null;
    $.get(theURL, function(data){
        wikiHelper(data);
    }, "html");
}
function wikiHelper(data)
{
    alert(data);
    console.log(data.getElementByTagName("p")[0]);
}
getWikiDescription("godwin's law");

data basically becomes a giant string containing all of the HTML, but the getElementByTagName function doesn't work on it. Any help would be appreciated, thanks in advance.
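For reference, the TypeError itself comes from two things: data is a plain string, which has no DOM methods, and the DOM method is spelled getElementsByTagName (plural). A minimal sketch of turning such a string into a queryable document is below. It assumes a browser environment that provides DOMParser; the function name firstParagraph and its optional parser argument are hypothetical, the latter added only so the function can be exercised outside a browser. It does not address how the HTML is fetched.

```javascript
// Parse an HTML string into a detached document and return its first <p>
// element, or null if there is none.
// `parser` is a hypothetical injection point; in a browser the default
// DOMParser instance is used.
function firstParagraph(html, parser = new DOMParser()) {
  const doc = parser.parseFromString(html, "text/html");
  // Note the plural: getElementsByTagName, not getElementByTagName.
  const paragraphs = doc.getElementsByTagName("p");
  return paragraphs.length > 0 ? paragraphs[0] : null;
}
```

Inside wikiHelper, something like firstParagraph(data).textContent would then give the paragraph text, provided the request itself succeeds.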

Blackhole
gmaster

2 Answers

Browsers generally do not allow scripts to send Ajax requests to a domain different from the one the script originates from, so you can't just send an Ajax request to any page you like. Read about the same-origin policy and about the ways to circumvent it.
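One such way, when the remote server opts in, is CORS. As a hedged sketch: the MediaWiki API on Wikipedia accepts an origin=* parameter for anonymous requests and then sends the headers that let the browser expose the response to the page. The function names buildParseUrl and fetchArticleHtml below are hypothetical, and the code assumes an environment with fetch and URLSearchParams.

```javascript
const API_ENDPOINT = "https://en.wikipedia.org/w/api.php";

// Build the URL for a "parse" request that returns the rendered article HTML.
function buildParseUrl(pageTitle) {
  const params = new URLSearchParams({
    action: "parse",
    page: pageTitle,
    prop: "text",
    format: "json",
    origin: "*", // asks the API to allow anonymous cross-origin requests
  });
  return API_ENDPOINT + "?" + params.toString();
}

// Fetch the article and return its HTML as a string.
async function fetchArticleHtml(pageTitle) {
  const response = await fetch(buildParseUrl(pageTitle));
  if (!response.ok) {
    throw new Error("HTTP " + response.status);
  }
  const json = await response.json();
  if (json.error) {
    // The API reports problems such as a missing page in an "error" object.
    throw new Error(json.error.info);
  }
  // With the default JSON format, the HTML sits under the "*" key.
  return json.parse.text["*"];
}
```

URLSearchParams takes care of encoding the title, so a page name with spaces or an apostrophe can be passed as-is.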

Community
Slavic

You can use JSONP, which both jQuery and the MediaWiki API support (by honouring the ?callback=? query parameter):

"use strict";

var endpoint = 'http://en.wikipedia.org/w/api.php';

$.ajax({
    url: endpoint,
    crossDomain: true,
    dataType: 'jsonp',
    data: {
        format: "json",
        action: "parse",
        page: "Bay_View_Historical_Society"
    },
    error: function(xhr,status,error){
        alert( error );
    }
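}).done(function(response){
    // response.parse.text['*'] holds the rendered article HTML as a string;
    // wrapping it in a <div> gives jQuery a single root element to search.
    var dom_object = $( '<div>' + response.parse.text['*'] + '</div>' );
    var p = dom_object.find('p').first();
    p.appendTo('#output');
});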

working example:

http://jsfiddle.net/sean9999/0h4t0ybd/2/

jQuery is not strictly necessary, but it makes the code concise and readable.

The code does the following:

  1. Makes a JSONP request for the content
  2. Pulls the markup down as text
  3. Converts the text to a DOM structure
  4. Queries the DOM structure for the first <p> node (the equivalent of doc.getElementsByTagName("P")[0])
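To illustrate step 1 without jQuery, here is a rough sketch of a hand-rolled JSONP helper. The function name jsonp and its doc and scope parameters are hypothetical (the latter two exist only so the helper can be exercised outside a browser), and it assumes the given URL has no query string of its own.

```javascript
// Minimal JSONP: inject a <script> whose URL names a one-off global callback.
// The server wraps its JSON in a call to that callback.
function jsonp(url, params, doc = document, scope = globalThis) {
  return new Promise((resolve, reject) => {
    const name = "jsonp_cb_" + Date.now() + "_" + Math.floor(Math.random() * 1e6);
    const query = new URLSearchParams({ ...params, callback: name });
    const script = doc.createElement("script");

    // Remove the temporary callback and the script element once done.
    const cleanup = () => {
      delete scope[name];
      script.remove();
    };

    scope[name] = (data) => {
      cleanup();
      resolve(data);
    };
    script.onerror = () => {
      cleanup();
      reject(new Error("JSONP request failed"));
    };

    script.src = url + "?" + query.toString();
    doc.head.appendChild(script);
  });
}
```

In a browser it could be called as jsonp("https://en.wikipedia.org/w/api.php", { format: "json", action: "parse", page: "Bay_View_Historical_Society" }), and the resolved value would have the same shape as the response handled in the jQuery version.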
code_monk