1

I am scraping a simple web page using Python. When I look at the web page using Chrome I get:

8/13 2:20 PM - this is the correct time for my time zone

in one of the date strings (it is returned as a string).

I am using requests.get (same results with urllib2) with the call:

thepage = requests.get('http://fakepage.com')

and when I decode the response I get 8/13 4:20 PM. Since it is off by exactly two hours, I assume the server side is detecting my time zone.

Is there any way to send my time zone in the requests.get call? Or perhaps I am looking at this incorrectly.

Martijn Pieters
AndyH

2 Answers

0

No, your browser doesn't send any time zone information when making a request, so the server would not be able to adjust the timezone in the HTML it sends.

The common way to get what you observed is JavaScript; see "get client time zone from browser" for an example. Code executed by your browser, not by the server, translated the datetime in the page to your local timezone.

This is far more efficient as well: the server only has to send one version of the HTML page to every client, which makes the page far more cacheable.
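
If you want the same result in a script, you can parse the string from the page and convert it yourself. Below is a minimal sketch, not a definitive implementation. The helper name `convert_page_time`, the year, and both UTC offsets are assumptions made for illustration; the page only gives a month, day and time, and nothing in the question says which zone the server uses.

```python
from datetime import datetime, timedelta, timezone


def convert_page_time(raw, year, source_utc_offset, target_utc_offset):
    """Parse a string like '8/13 4:20 PM' and shift it between two fixed offsets.

    Hypothetical helper: the offsets are whole hours relative to UTC and
    must be supplied by the caller, since the page does not state them.
    """
    source_tz = timezone(timedelta(hours=source_utc_offset))
    target_tz = timezone(timedelta(hours=target_utc_offset))

    # The page omits the year, so prepend one before parsing.
    naive = datetime.strptime(f"{year}/{raw}", "%Y/%m/%d %I:%M %p")

    # Attach the assumed source zone, then convert to the target zone.
    aware = naive.replace(tzinfo=source_tz)
    return aware.astimezone(target_tz)


# Assumed values: page time is UTC-4, local time is UTC-6, year is 2014.
local = convert_page_time("8/13 4:20 PM", 2014, -4, -6)
print(local.strftime("%m/%d %I:%M %p"))
```

Fixed offsets keep the sketch dependency-free but ignore daylight saving changes; a named zone from the standard `zoneinfo` module would handle those if you know which zones are involved.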

Community
Martijn Pieters
  • Thanks for the response - I'm a noob here so please be patient. This leads me back to a local environment setting. It's an ASP.NET site so the code is actually running on the server building the HTML for the client to render. (right?) The shared server where I run the python script has the correct time. Is there any way to set it up so the ASP.NET site has the right time zone? – AndyH Aug 14 '14 at 00:59
  • @AndyH - That's entirely up to that particular ASP.net code, which you didn't show us in your question. – Matt Johnson-Pint Aug 14 '14 at 05:40
  • @AndyH: sure you can, but why not do this after parsing the date on the client instead? – Martijn Pieters Aug 14 '14 at 08:35
  • The data is coming to me as plain text. It's a very simple page that is returned. I can convert the text to a time and then adjust for the timezone. I may not be asking the right question. Why does the page have the right time (for my time zone) when I get it with Chrome, but a different time zone when I use requests.get? – AndyH Aug 14 '14 at 17:04
  • @AndyH: then *parse the datetime and convert it*. – Martijn Pieters Aug 14 '14 at 17:06
  • @MartijnPieters - My edit crossed your response. I can do that, it just doesn't seem like I should have to. Why is the time displayed by Chrome different from what is returned by requests.get? – AndyH Aug 14 '14 at 17:17
  • @AndyH: I already stated that; Chrome is probably instructed to run JavaScript code to do the conversion. – Martijn Pieters Aug 14 '14 at 17:18
-2

SOLVED - I had a one-character typo in the request, which returned exactly the same HTML payload as Chrome EXCEPT for the time.

Thanks everyone for your time.

AndyH