Fetch data from web page

Question

I'm trying to scrape the first paragraph from Wikipedia using only JavaScript. Basically, what I want to do is

document.getElementsByTagName("P")[0]

except it's not on my web page; I want to fetch a given page from Wikipedia and apply that functionality to it. My current code gives:

Uncaught TypeError: undefined is not a function

My code:

function getWikiDescription(searchTerm)
{
    var theURL = "http://en.wikipedia.org/wiki/" + searchTerm.replace(" ", "_");
    var article = null;
    $.get(theURL, function(data){
        wikiHelper(data);
    }, "html");
}
function wikiHelper(data)
{
    alert(data);
    console.log(data.getElementByTagName("p")[0]);
}
getWikiDescription("godwin's law");

data basically becomes a giant string containing all of the HTML, but the getElementByTagName function doesn't work on it. Any help would be appreciated, thanks in advance.

It’s `getElementsByTagName`, with an `s`. (That might still not work, but it’s a start!) — Ry-, Sep 06 '14 at 22:51
That's a string, not a dom tree. `getElementsByTagName` works on a dom tree, not on a string (and string actually hasn't that method). — Axel Amthor, Sep 06 '14 at 23:01
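To illustrate the point made in the comments, here is a minimal sketch of turning an HTML string into a DOM tree before querying it. It assumes a browser environment, where `DOMParser` is available; the function name `firstParagraph` and the optional `parser` argument are hypothetical and exist only to keep the example self-contained.

```javascript
// Parse an HTML string into a separate document and return its first <p> element.
// `parser` defaults to the browser's DOMParser; passing one in is only for illustration.
function firstParagraph(html, parser = new DOMParser()) {
    // "text/html" makes the parser build a full HTML document from the string
    const doc = parser.parseFromString(html, "text/html");
    // querySelector returns null when the markup contains no <p> element
    return doc.querySelector("p");
}
```

Inside `wikiHelper(data)` this would be called as `firstParagraph(data)`. It only helps once the browser has actually handed the page's HTML to the script.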

score 0 · Answer 1 · edited May 23 '17 at 12:21

Browsers generally do not allow Ajax requests to domains other than the one the script originates from. You can't just send an Ajax request to any page you like. Read about the same-origin policy and about the ways to work around it, such as JSONP or CORS.
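As a rough illustration of what "same origin" means, the sketch below compares two URLs the way the policy does: scheme, host and port must all match. The helper name `isSameOrigin` and the `example.com` addresses are made up for this example; the browser performs the real check internally.

```javascript
// Two URLs share an origin only if scheme, host and port are all identical.
function isSameOrigin(a, b) {
    const originA = new URL(a).origin;
    const originB = new URL(b).origin;
    // Schemes such as file: or data: yield the opaque origin "null",
    // which is never treated as matching anything.
    if (originA === "null" || originB === "null") {
        return false;
    }
    return originA === originB;
}
```

For a script served from `http://example.com/`, `isSameOrigin("http://example.com/app.js", "http://en.wikipedia.org/wiki/Godwin's_law")` is false, which is why the request in the question is blocked.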

Community

answered Sep 06 '14 at 23:00

Slavic

score 0 · Answer 2 · answered Sep 07 '14 at 00:28

You can use JSONP, which both jQuery and the MediaWiki API support (by honouring the ?callback=? query param):

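"use strict";

// MediaWiki API endpoint; https avoids mixed-content blocking on secure pages
var endpoint = 'https://en.wikipedia.org/w/api.php';

$.ajax({
    url: endpoint,
    crossDomain: true,
    dataType: 'jsonp',    // jQuery adds the callback parameter itself
    data: {
        format: "json",
        action: "parse",  // returns the rendered HTML of the page
        page: "Bay_View_Historical_Society"
    },
    error: function(xhr, status, error){
        alert( error );
    }
}).done(function(rawhtml){
    // rawhtml is the parsed JSON response; rawhtml.parse.text['*'] holds the
    // article body as an HTML string. Wrapping it in a <div> gives jQuery a
    // single root element to search.
    var dom_object = $( '<div>' + rawhtml.parse.text['*'] + '</div>' );
    var p = dom_object.find('p').first();
    p.appendTo('#output');
});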

working example:

http://jsfiddle.net/sean9999/0h4t0ybd/2/

jQuery is not strictly necessary, but it makes the code concise and readable.

The code does the following:

Makes a JSONP request for the content
Pulls the markup down as text
Converts the text to a DOM structure
Queries the DOM structure for the first paragraph node (the equivalent of doc.getElementsByTagName("P")[0])
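The four steps above can also be sketched without jQuery. This is only an illustration under stated assumptions, not the code behind the fiddle: the helper names `buildJsonpUrl`, `jsonp` and `getFirstParagraph` are hypothetical, the `jsonp` and `getFirstParagraph` functions need a browser (they use `document` and `window`), and it assumes the API wraps its JSON in the function named by the `callback` parameter.

```javascript
"use strict";

var jsonpCounter = 0; // keeps generated callback names unique

// Hypothetical helper: add the query parameters and the JSONP callback name to the URL.
function buildJsonpUrl(endpoint, params, callbackName) {
    const url = new URL(endpoint);
    for (const [key, value] of Object.entries(params)) {
        url.searchParams.set(key, value);
    }
    url.searchParams.set("callback", callbackName);
    return url.toString();
}

// Steps 1 and 2: make the JSONP request by injecting a <script> tag (browser only).
function jsonp(endpoint, params) {
    return new Promise(function (resolve, reject) {
        const callbackName = "jsonpCallback" + (jsonpCounter++);
        const script = document.createElement("script");
        function cleanup() {
            delete window[callbackName];
            script.remove();
        }
        // The server's response is a script that calls this global function
        window[callbackName] = function (data) { cleanup(); resolve(data); };
        script.onerror = function () { cleanup(); reject(new Error("JSONP request failed")); };
        script.src = buildJsonpUrl(endpoint, params, callbackName);
        document.head.appendChild(script);
    });
}

// Steps 3 and 4: convert the markup to DOM nodes and pick the first <p>.
function getFirstParagraph(page) {
    return jsonp("https://en.wikipedia.org/w/api.php", {
        format: "json",
        action: "parse",
        page: page
    }).then(function (result) {
        const container = document.createElement("div");
        container.innerHTML = result.parse.text["*"];
        return container.getElementsByTagName("p")[0];
    });
}
```

A call such as `getFirstParagraph("Bay_View_Historical_Society").then(function (p) { document.getElementById("output").appendChild(p); });` would then mirror what the jQuery version does with `#output`.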
