While trying to learn web scraping, I thought it would be a good goal to work with a data-driven page that has lots of data to gather, like clutch.co.
As a first step I am running a tiny scraper like this:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import pandas as pd
options = Options()
options.add_argument("--headless")    # run Chrome without a visible window
options.add_argument("--no-sandbox")  # needed when Chrome runs as root, e.g. in Colab
driver = webdriver.Chrome(options=options)
driver.get("https://clutch.co/it-services/msp")
# Grab the rendered HTML and close the browser
page_source = driver.page_source
driver.quit()
# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(page_source, "html.parser")
# Extract the data using some BeautifulSoup selectors
# For example, let's extract the names and locations of the companies
company_names = [name.text for name in soup.select(".company-name")]
company_locations = [location.text for location in soup.select(".locality")]
# Store the data in a Pandas DataFrame
data = {
"Company Name": company_names,
"Location": company_locations
}
df = pd.DataFrame(data)
# Save the DataFrame to a CSV file
df.to_csv("clutch_data.csv", index=False)
But at the moment this produces an empty result.
Note that I am working in Google Colab.
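My guess is that the listing is rendered by JavaScript or sits behind a bot check, so the HTML I parse never contains .company-name or .locality. Below is a minimal diagnostic sketch I intend to try: it dumps the raw HTML to a file and waits explicitly for the selector. The .company-name class and the 20-second timeout are just assumptions on my part, not something I have verified against clutch.co.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(options=options)
driver.get("https://clutch.co/it-services/msp")
# Dump the raw HTML so I can inspect what the headless browser actually received
# (e.g. a challenge page instead of the directory listing).
with open("page_dump.html", "w", encoding="utf-8") as f:
    f.write(driver.page_source)
# Wait up to 20 seconds for at least one matching element to appear;
# ".company-name" is an assumed selector and may not be the real class name.
try:
    WebDriverWait(driver, 20).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ".company-name"))
    )
    print("Selector matched:", len(driver.find_elements(By.CSS_SELECTOR, ".company-name")))
except Exception as exc:
    print("Selector never appeared:", exc)
driver.quit()
If the dumped page turns out to be a "checking your browser" screen instead of the directory, the empty lists would come from the site blocking headless Chrome rather than from wrong selectors.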