
As part of learning web scraping, I'd like to browse and retrieve all of my Strava activities. I'll use the profile of Thibaut Pinot as an example. I'm using Python 3 and Requests.

On the user's page, one can see all of his activities, but not all at once. They are sorted chronologically, so you have to use a timeline. You can then choose to display activities weekly or monthly and choose the period of time: all of this is done by GET requests. More precisely, the fragment identifier matches the following regexp:

(interval_type|graph_date_range)\?chart_type=miles&interval_type=(week|month)&interval=[0-9]{6}&year_offset=[0-9]+

The first group doesn't seem to matter at all. Then, interval_type specifies whether to display weekly or monthly results. interval selects the date to display, in the format YYYYMM, where YYYY is the year and MM is the month or week to display. Finally, year_offset isn't really useful. Thus, the GET request is fairly straightforward to make: I just have to choose a monthly display and iterate over the months I want to monitor.
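To make the iteration concrete, here is a minimal sketch that only builds the URLs for a range of months, using the parameter layout of the example URL in this question. The helper names `month_intervals` and `timeline_url` are made up for illustration, and nothing is requested from Strava here.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.strava.com/pros/1603067"  # profile URL from the question


def month_intervals(start, end):
    """Yield YYYYMM strings from start to end inclusive; both are (year, month) tuples."""
    year, month = start
    while (year, month) <= end:
        yield f"{year:04d}{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def timeline_url(interval):
    """Build the profile URL with a monthly-timeline fragment (hypothetical helper)."""
    params = urlencode({
        "interval": interval,
        "interval_type": "month",
        "chart_type": "miles",
        "year_offset": 0,
    })
    return f"{BASE_URL}#interval_type?{params}"


# One URL per month from November 2017 through February 2018.
urls = [timeline_url(i) for i in month_intervals((2017, 11), (2018, 2))]
```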

However, while loading https://www.strava.com/pros/1603067#interval_type?interval=201802&interval_type=month&chart_type=miles&year_offset=0 (that is, the page that displays the runs of February 2018), you can notice that the results of the current month are displayed first, and only then the results of February 2018. Thus, requests.get always gives me the same page, no matter what fragment identifier I set.
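This behaviour is consistent with how URLs work in general: everything after `#` is a fragment, which the client keeps to itself and does not send to the server. The sketch below uses only the standard library to split the example URL; `request_target` is an illustrative name for the part an HTTP client would actually ask the server for.

```python
from urllib.parse import urlsplit

url = ("https://www.strava.com/pros/1603067#interval_type?interval=201802"
       "&interval_type=month&chart_type=miles&year_offset=0")

parts = urlsplit(url)

# The path (plus query string, if any) is what goes into the HTTP request line.
request_target = parts.path + (f"?{parts.query}" if parts.query else "")

print(request_target)   # /pros/1603067
print(parts.fragment)   # interval_type?interval=201802&interval_type=month&...
```

Since the `?` comes after the `#`, all of the parameters end up in the fragment rather than in the query string, so every month yields the same request to the server.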

My web browser must get a new web page after the first one (the one with the current month) is loaded, but how could I get it using Python?

Spirine
  • `requests` will only get the initial HTML page content, not any content subsequently loaded via JavaScript. There are endless duplicates here about web scraping sites with JS, see e.g. https://stackoverflow.com/q/8049520/3001761. – jonrsharpe Mar 10 '18 at 13:56
  • You have to find the URL where the page loads other stuff from, and make a request to it. – decadenza Mar 10 '18 at 14:00
  • Can you share your current code, current and desired output? – Andersson Mar 10 '18 at 14:37
  • @decadenza What is the best way to do so? – Spirine Mar 15 '18 at 20:11
