Missing Table Elements When Scraping

Question

URL: https://stats.nba.com/player/1628381/defense-dash/

Attempting to get:

 `<table>
  <tbody>
    <!----><tr data-ng-repeat="(i, row) in page" index="0">
      <td class="player">Overall</td>
      <td>45</td>
      <td>45</td>
      <td>5.7</td>
      <td>12.3</td>
      <td>46.6</td>
      <td>100%</td>
      <td>46.7</td>
      <td>-0.1</td>
    </tr><!---->
  </tbody>
</table> `

My code:

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public static void getData(String url, String Name, int ID) throws IOException
{
    String html = Jsoup.connect(url).execute().body();
    // Strip comment markers so commented-out markup is parsed as regular HTML
    html = html.replaceAll("<!---->", "");
    html = html.replaceAll("<!--", "");
    html = html.replaceAll("-->", "");
    Document doc = Jsoup.parse(html);
    Elements tableElements = doc.select("table");

    System.out.println("Elements " + tableElements);

    for (Element tableElement : tableElements)
    {
        // Only tables that have an id are exported
        String tableId = tableElement.id();
        if (tableId.isEmpty()) {
            continue;
        }
        String fileName = "table" + Name + tableId + ID + ".csv";
        System.out.println(fileName);

        // try-with-resources closes the writer so each CSV file is flushed to disk
        try (FileWriter writer = new FileWriter(new File("C:\\Users\\noman\\eclipse-workspace\\Senior Project\\src\\", fileName))) {
            // Select body rows (not individual cells) so each row becomes one CSV line
            Elements tableRowElements = tableElement.select("tbody tr");

            for (int i = 0; i < tableRowElements.size(); i++) {
                Element row = tableRowElements.get(i);
                Elements rowItems = row.select("td");
                for (int j = 0; j < rowItems.size(); j++) {
                    writer.append(rowItems.get(j).text());

                    if (j != rowItems.size() - 1) {
                        writer.append(',');
                    }
                }
                writer.append('\n');
            }
        }
    }
}

The problem is that no elements are being found. This same code works perfectly on another site, which (seemingly) stores its data no differently.

Is there something different about this website that prevents web scraping, or maybe a subtle difference I am missing?

Please note that the HTML provided is a shortened version.

Likely the content is added by JavaScript. Take a look at the HTML that is retrieved by Jsoup and see if those elements are there. — mavriksc, Feb 27 '19 at 21:37
Not seeing them, in fact. Though the doc is very, very long and it's possible I missed them. — Novabomb, Feb 27 '19 at 21:46
https://stackoverflow.com/questions/35586658/how-to-access-updated-html-source-after-the-javascript-on-the-page-has-been-exec — mavriksc, Feb 27 '19 at 21:56
Using my browser's debugger (Network tab), I checked that the data is dynamically loaded from this URL: https://stats.nba.com/stats/playerdashptshotdefend?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&Period=0&PlayerID=1628381&Season=2018-19&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision= Jsoup will not parse JSON, so you have to use a different library. — Krystian G, Feb 28 '19 at 00:08

TDG · Answer 1 · 2019-03-05T17:58:21.183

As said in the comments, the data you are looking for is loaded dynamically, but you can fetch it with a simple GET request to this link:
https://stats.nba.com/stats/playerdashptshotdefend?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&Period=0&PlayerID=1628381&Season=2018-19&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=
EDIT
To find this link, I used the browser's developer tools and checked the XHR requests.
You can see that the link includes several parameters, among them PlayerID, which is identical to the number that appears in your initial link. By changing its value you can get the stats of other players.
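
As a rough sketch of that idea, the snippet below rebuilds the link for any player ID and requests it with the JDK's built-in HttpClient (Java 11+) instead of Jsoup. The class name DefenseDashClient and the methods buildUrl and fetchJson are hypothetical names chosen for illustration. The query string is copied from the link above, so the season stays fixed at 2018-19, and the sketch assumes the server answers a plain GET without extra headers, which may not hold.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: names are illustrative, not part of any library.
final class DefenseDashClient {
    // Query string copied from the link above, split around the PlayerID value.
    private static final String PREFIX =
        "https://stats.nba.com/stats/playerdashptshotdefend"
        + "?DateFrom=&DateTo=&GameSegment=&LastNGames=0&LeagueID=00&Location=&Month=0"
        + "&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&Period=0&PlayerID=";
    private static final String SUFFIX =
        "&Season=2018-19&SeasonSegment=&SeasonType=Regular+Season&TeamID=0"
        + "&VsConference=&VsDivision=";

    // Builds the data URL for one player by swapping in the PlayerID parameter.
    static String buildUrl(int playerId) {
        return PREFIX + playerId + SUFFIX;
    }

    // Sends a GET request and returns the raw response body (expected to be JSON).
    // Assumption: no additional request headers are required by the server.
    static String fetchJson(int playerId) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(buildUrl(playerId)))
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling DefenseDashClient.fetchJson(1628381) would request the same link as above. The returned text still has to be handed to a JSON library of your choice before it can be written out as CSV.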

How did you get to this? The program I'm coding will be grabbing this data for all players on this site 1 by 1 — Novabomb, Mar 04 '19 at 18:47
