I am using Mechanize to sign in to LinkedIn and get all the employees of a certain company. However, when I download the page with the employee search results, the whole middle of the page is missing, and I have no idea why.

Here is my code (with my LinkedIn sign-in info removed):

from mechanize import Browser
from bs4 import BeautifulSoup

br = Browser()
br.set_handle_robots(False)  # do not honor robots.txt
br.open('https://www.linkedin.com/')
br.select_form('login')
br['session_key'] = 'YOUR_EMAIL_HERE'
br['session_password'] = 'YOUR_PASSWORD_HERE'
response = br.submit()

# Fetch the employee search results for the company
page = br.open('https://www.linkedin.com/vsearch/p?f_CC=10667')
html = page.read()
soup = BeautifulSoup(html, 'html.parser')
# Drop non-ASCII characters and save the prettified HTML
text = soup.prettify().encode('ascii', 'ignore')
with open('website.html', 'wb') as fo:
    fo.write(text)

The response is this (I recommend downloading the HTML and just looking at it with a browser): http://pastebin.com/7z1dPiTd

I am not sure whether I used the open function correctly; that may be the problem.

vvvvv
jped

1 Answer

After doing some research, it turns out that Mechanize does not execute JavaScript, so the search results, which the page fills in with JavaScript after it loads, were never part of the HTML I downloaded. Mechanize provides no way to wait for JavaScript to run, so I have to use a browser-automation tool such as Windmill or Selenium instead.
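As a rough illustration of the Selenium route, here is a minimal sketch rather than a tested LinkedIn scraper. It assumes `driver` is a Selenium WebDriver that is already signed in, and the helper name `fetch_rendered_html` and its `selector` argument are hypothetical: `selector` must be a CSS selector for an element that only exists once the page's JavaScript has run, which you would need to find by inspecting the page. The helper polls by hand instead of using Selenium's own wait utilities, so it relies only on the driver's `get`, `find_elements`, and `page_source`.

```python
import time


def fetch_rendered_html(driver, url, selector, timeout=10.0, poll=0.5):
    """Load url in a real browser and return the HTML once selector matches.

    driver   -- assumed to be a Selenium WebDriver, e.g. webdriver.Firefox()
    selector -- hypothetical CSS selector for JavaScript-generated content
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR
        if driver.find_elements("css selector", selector):
            # The element exists, so the scripts have filled in the page
            return driver.page_source
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "%r did not appear within %s seconds" % (selector, timeout)
            )
        time.sleep(poll)
```

With Selenium and a browser driver installed, this might be called as `fetch_rendered_html(webdriver.Firefox(), 'https://www.linkedin.com/vsearch/p?f_CC=10667', '#results')`, where `'#results'` is only a placeholder. The returned HTML can then be passed to BeautifulSoup exactly as in the question.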

jped