WebScraping a date value that appears as January 1970 instead of true value with Python

Question

I'm stuck on a web scraping project. I would like to scrape the date of each review from the following website, but I get 'January 1970' for every date: https://fairygodboss.com/company-reviews/ebay-inc

Here is my code:

import requests
from bs4 import BeautifulSoup

page_link = 'https://fairygodboss.com/company-reviews/ebay-inc' # for work/life balance for EBAY
# randomUserAgents() is my own helper (not shown); it returns a User-Agent string
page_response = requests.get(page_link, verify=False, headers={'User-Agent': randomUserAgents()})
soup = BeautifulSoup(page_response.content, 'html.parser')
soup.find_all(class_='textColor6 w-700 p-b-10')

Many thanks!

When I looked at the site, all of the dates DID show as January 1970 before it kicked me for not having a login. Looks like some sort of site behavior. — G. Anderson, Nov 16 '18 at 17:46
That was exactly what happened to me as well. I believe he has to log in from Python first. — Caleb H., Nov 16 '18 at 17:49

score 1 · Accepted Answer · answered Nov 16 '18 at 17:48

I believe the problem is that you are not logged in when you make the request. For a user who is not logged in, all the dates appear as January 1970 until the site redirects to a login page, so you will first have to log in.
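As a side note on why the placeholder is "January 1970" in particular: that month is where the Unix epoch (timestamp 0) falls, so a missing or zeroed timestamp formatted as month and year would come out that way. Whether this site actually does that is an assumption on my part; the helper name `format_month_year` below is just for illustration.

```python
from datetime import datetime, timezone


def format_month_year(unix_seconds):
    """Format a Unix timestamp (seconds) as 'Month YYYY' in UTC."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime("%B %Y")


# A zeroed timestamp lands on the Unix epoch.
print(format_month_year(0))           # January 1970
# A real timestamp (Nov 16 2018, 17:48 UTC) formats as expected.
print(format_month_year(1542390480))  # November 2018
```

If that is what is happening here, the real dates are simply not in the HTML you received, and no change to the `find_all` selector will recover them.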

This can be a tricky problem, but there is a library for Python called twill that may work for you: http://twill.idyll.org

Alternatively, you could use something like the Mechanize library, which twill is based on.

This Stack Overflow question should help you out: How to scrape a website that requires login first with Python
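For reference, here is a minimal sketch of the log-in-first idea using a `requests` session instead of twill or Mechanize. The login URL and the form field names (`email`, `password`) are hypothetical; you would have to read the real ones from the site's login form, for example in the browser's Network tab. The function name `fetch_logged_in` is also just mine.

```python
import requests

# Hypothetical values: the real login endpoint and form field names are
# assumptions and must be taken from the site's actual login form.
LOGIN_URL = "https://fairygodboss.com/login"
PAGE_URL = "https://fairygodboss.com/company-reviews/ebay-inc"


def fetch_logged_in(session, login_url, page_url, credentials):
    """Log in with `session`, then fetch `page_url` reusing its cookies."""
    login_response = session.post(login_url, data=credentials)
    login_response.raise_for_status()  # stop early on an HTTP error
    page_response = session.get(page_url)
    page_response.raise_for_status()
    return page_response.text


def main():
    # Field names here are placeholders, not the site's confirmed form fields.
    credentials = {"email": "you@example.com", "password": "your-password"}
    with requests.Session() as session:
        session.headers.update({"User-Agent": "Mozilla/5.0"})
        html = fetch_logged_in(session, LOGIN_URL, PAGE_URL, credentials)
    print(len(html))  # pass `html` to BeautifulSoup as in the question
```

Calling `main()` performs the real requests. The session keeps the cookies set by the login response, so the second request is made as the logged-in user. Note that a 200 status does not prove the login succeeded, and if the page fills in the dates with JavaScript after loading, the raw HTML may still contain the placeholder even when logged in.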

Caleb H.

I've found that requests + sessions is the right tool for this job. Python mechanize is abandoned and I've never heard of twill. – pguardiario Nov 16 '18 at 23:45
I've logged in using requests + session but it still only shows me January 1970 – sammtt Nov 17 '18 at 22:28
To help with that I'd have to see your code again, as well as what the website looks like while logged in. – Caleb H. Nov 19 '18 at 15:28
