I want to scrape data from https://angel.co/companies?locations[]=1688-United+States. Can anyone guide me on what I should do?

I know I should use BeautifulSoup or Selenium, but I found out that this web page is not static: its data changes over time. Can anyone guide me through it?

I think the AngelList API web page is not working anymore.

dspencer
vish

1 Answer

You need to wait a few seconds until the table on the page has been generated:

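from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import os
import time

# chromedriver is expected to sit next to this script
chrome_driver = os.path.join(os.path.abspath(os.path.dirname(__file__)), 'chromedriver')
browser = webdriver.Chrome(service=Service(chrome_driver))
browser.get("https://angel.co/companies?locations[]=1688-United+States")
time.sleep(3)  # give the JavaScript-rendered table time to load

# Selenium 4 removed find_elements_by_class_name; a CSS selector matches the same compound class
data_row = browser.find_elements(By.CSS_SELECTOR, '.base.startup')

for item in data_row:
    print('-'*100)
    company = item.find_element(By.CSS_SELECTOR, '.name').text
    location = item.find_element(By.CSS_SELECTOR, '.column.location').text
    print(company)
    print(location)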

Output:

----------------------------------------------------------------------------------------------------
WP Engine
Austin
----------------------------------------------------------------------------------------------------
Kissmetrics
San Francisco
----------------------------------------------------------------------------------------------------
Bluesmart
San Francisco
----------------------------------------------------------------------------------------------------
Star.me
Los Angeles
...
...
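
A fixed sleep can be too short on a slow connection and wasteful on a fast one. As a rough sketch (not part of the original answer), the helper below polls until the rows show up or a timeout expires. The name wait_for_rows and its parameters are hypothetical; fetch_rows stands for whatever call you use to look up the table rows. Selenium also ships its own WebDriverWait class for the same purpose.

```python
import time


def wait_for_rows(fetch_rows, timeout=10.0, interval=0.5):
    """Call fetch_rows() repeatedly until it returns a non-empty list.

    fetch_rows is any zero-argument callable (hypothetical here), for
    example a lambda wrapping the element lookup on the browser object.
    Raises TimeoutError if nothing appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        rows = fetch_rows()
        if rows:
            return rows
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no rows appeared within {timeout} seconds")
        time.sleep(interval)
```

In the script, this would replace the sleep: pass a lambda that performs the row lookup and loop over the returned list as before.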
Zaraki Kenpachi
  • @vish read this: https://stackoverflow.com/questions/2953834/windows-path-in-python – Zaraki Kenpachi Feb 06 '20 at 06:52
  • Hello, when I use this path "\Users\Dell User\Downloads\Compressed\chromedriver_win32" it shows an error: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \UXXXXXXXX escape. Please help. – vish Feb 06 '20 at 06:54
  • @Zaraki Kenpachi Can you please write the correct path for C:\Users\Dell User\Downloads\Compressed\chromedriver_win32? How should I write it? I tried chrome_driver = "C:\\Users\\Dell User\\Downloads\\Compressed\\chromedriver_win32" but it still does not work; it says the path was not found. – vish Feb 06 '20 at 07:00
  • @vish Sorry, I can't test that. I'm a Linux user. – Zaraki Kenpachi Feb 06 '20 at 07:02
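
On the Windows path from the comments: in a normal Python string, \U starts an eight-digit Unicode escape, which is why "\Users..." raises the unicodeescape SyntaxError. A raw string (or doubled backslashes) avoids that. The "path not found" message may come from pointing at the folder rather than the executable; the sketch below assumes, without having tested it on that machine, that the folder contains a file named chromedriver.exe.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept literally, so \U is not treated as an escape.
driver_dir = PureWindowsPath(r"C:\Users\Dell User\Downloads\Compressed\chromedriver_win32")

# Assumption: the executable inside that folder is called chromedriver.exe.
chrome_driver = str(driver_dir / "chromedriver.exe")

# A raw string and the doubled-backslash form spell the same text.
same_text = r"C:\Users\Dell User" == "C:\\Users\\Dell User"

print(chrome_driver)
```

The resulting chrome_driver string is what you would hand to Selenium in place of the folder path.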