I am using cheerio to grab stats information from https://www.nba.com/players/langston/galloway/204038 but I can't get the table data to show up

Question

[the information i want to access][1]


  [1]: https://i.stack.imgur.com/4SpCU.png

No matter what I do, I just can't access the table of stats. I suspect it has to do with there being multiple tables, but I am not sure.

    var cheerio = require("cheerio");
    var axios = require("axios");

    axios
      .get("https://www.nba.com/players/langston/galloway/204038")
      .then(function (response) {
        var $ = cheerio.load(response.data);

        console.log(
          $("player-detail")
            .find("section.nba-player-stats-traditional")
            .find("td:nth-child(3)")
            .text()
        );
      });

Have you inspected the HTML returned from your call? Looking at the page in chrome dev tools it looks like most of the data is loaded asynchronously and I don't see a table in the html returned. For example it looks like [this link](https://data.nba.net/prod/v1/2019/players/204038_profile.json) has the player profile returned as JSON which would be much easier to use anyway... — Jason Goemaat, Apr 30 '20 at 03:47
That looks perfect! Took me a sec to realize I didn't need cheerio anymore. I just grabbed response.data from my axios call. If you have a sec, could you explain why I couldn't grab the table even though I could see it on the page? Is it because it gets loaded afterwards? — Graham Thomas, Apr 30 '20 at 19:48

score 2 · Accepted Answer · answered Apr 30 '20 at 20:16

The actual HTML returned from your GET request doesn't contain the data or a table. When your browser loads the page, a script is executed that pulls the data using API calls and creates most of the elements on the page.

If you open the Chrome developer tools (CTRL+SHIFT+J), switch to the Network tab, and reload the page, you can see all of the requests taking place. The first one is the HTML document, which is what your axios GET request downloads. If you click on it, you can see that the HTML is very basic compared to what you see when you inspect the page.
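You can also confirm this from Node without opening the browser. The following is only a minimal sketch: `countTags` is a hypothetical helper (not part of axios or cheerio) that counts opening tags in a raw HTML string with a regular expression, which is enough for a quick sanity check but not a real HTML parser.

```javascript
// Hypothetical helper: count opening tags of a given name in a raw HTML string.
// A rough regex check for debugging only; it is not a substitute for an HTML parser.
function countTags(html, tagName) {
  // Match "<table>" or "<table ...>" but not "</table>" or "<tablet>".
  const pattern = new RegExp("<" + tagName + "[\\s>]", "gi");
  const matches = html.match(pattern);
  return matches ? matches.length : 0;
}

// Example with a made-up, script-driven page shell similar to what a
// client-rendered site might return before any JavaScript has run.
const sampleHtml = '<div id="root"></div><script src="app.js"></script>';
console.log("tables in sample:", countTags(sampleHtml, "table"));
```

In your own script you could call `countTags(response.data, "table")` inside the `.then` callback. If it reports zero, the table is being built in the browser after the initial HTML arrives, and no cheerio selector will find it.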

If you click on 'XHR', that will show most of the API calls that are made to get data. There's an interesting one for '204038_profile.json'. If you click on that, you can see the information I think you want in JSON format, which is much easier to use than parsing an HTML table. You can right-click on '204038_profile.json' and copy the full URL:

https://data.nba.net/prod/v1/2019/players/204038_profile.json
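As a rough sketch of how that URL could be used from Node: the helper names `profileUrl` and `fetchProfile` are made up for illustration, the URL pattern is inferred from this single example, and the code assumes Node 18+ where `fetch` is available globally. The shape of the returned JSON is not shown here, so inspect it before relying on any field.

```javascript
// Hypothetical helper: build the profile URL for a season year and player ID.
// The pattern is an assumption based on the one URL copied from the Network tab.
function profileUrl(year, playerId) {
  return "https://data.nba.net/prod/v1/" + year + "/players/" + playerId + "_profile.json";
}

// Hypothetical helper: request the profile and parse the body as JSON.
// Assumes the global fetch in Node 18+; the endpoint may change or disappear.
async function fetchProfile(year, playerId) {
  const response = await fetch(profileUrl(year, playerId));
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  return response.json();
}

// Usage (not run automatically):
// fetchProfile(2019, 204038)
//   .then((profile) => console.log(JSON.stringify(profile, null, 2)))
//   .catch((err) => console.error(err.message));
```

With axios the equivalent would be `axios.get(profileUrl(2019, 204038))` and reading `response.data`, since axios parses JSON responses for you. Either way, cheerio is no longer needed.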

NOTE: Most websites will not like you using their data like this, so you might want to check what their policy is. They could make it more difficult to access the data or change the URLs at any time.

You might want to check out this question or this one about how to load the page and run the JavaScript to simulate a browser.

The second one is particularly interesting and has an answer explaining how you can intercept and mutate requests from puppeteer.

Ah, that makes a lot of sense. I am only using this particular site to build something for a class. I would never consider scraping a site like this for nonacademic purposes. I'll check out those other ones for sure. Thank you again! — Graham Thomas, Apr 30 '20 at 22:09
