I would like to import data from an HTML page (specifically, https://www.warzone.com/MultiPlayer/Tournament?ID=34581) into a Google Sheets sheet. Normally this would be accomplished using IMPORTHTML
(or, in more hacky ways, IMPORTXML
or IMPORTDATA
).
I want to get the table contents from the Players tab on the URL above. In my sheet, this should look like (with |'s separating cells)
Accepted Player | Team | Rank | Wins | Losses | In-progress
JSA | | 1st place | 6 | 0 | 0
hexe1309 | | 2nd place | 5 | 1 | 0
Sebus Maximus | | 3rd place | 3 | 3 | 0
Master Jz | | 4th place | 2 | 4 | 0
Blortis | | 5th place | 2 | 4 | 0
Rex Paul of Lizardino | | 6th place | 2 | 4 | 0
Benjamin628 | | 7th place | 1 | 5 | 0
But when I run IMPORTHTML(<the URL>, "table", 2)
, I instead get just the top row:
Accepted Player | Team | Rank | Wins | Losses | In-progress
If I use IMPORTXML
or IMPORTDATA
, I still do not get any contents from the populated table- just the column headers.
If you view page source on the URL above, you'll notice that, not coincidentally, only the part of this table is populated. The issue is that the initial request made when querying the page of interest returns a template- a page with empty tables. So IMPORTHTML
, IMPORTXML
, and IMPORTDATA
return a page with tables that have just the headers populated- but no contents.
The empty tables don't yet have the data that I want; that gets filled in by some jQuery code. Presumably that jQuery code is fetching the data from somewhere before populating the table, but I don't see any requests in my browser's Network tab corresponding to this. I don't understand how web applications work or how to find out where this data is being fetched from.
From my perspective, my options are:
To somehow have Google Sheets (like a headless browser) fetch the data from after the jQuery logic has populated the table, rather than from the initial page returned by the GET request; OR
To somehow run/replicate the functionality of whatever the jQuery is doing, and fetch that data myself
I'm guessing #2 is more viable, since IMPORTHTML
/IMPORTXML
/IMPORTDATA
are not that sophisticated. For #2, I don't have the debugging know-how to trace this website's jQuery code (or to find the place where fetches the data to fill in the table). Are there any devtools (in Firefox or Chrome) I can use to find out what happens behind the scenes to fetch this data and replicate that logic in my sheet? (My main gap here is almost certainly my lack of understanding of how this data is being fetched and where to find the trace of this fetching.)
n.b.: This could probably be done very easily by just using a script to update the sheet. I'll probably end up doing this with Python and gspread. But I'd like to know if it's actually impossible/infeasible to accomplish this in Google Sheets alone.