A total newbie here in search of your wisdom (first post/question, too)! Thank you in advance for your time and patience.
I am hoping to automate scientific literature searches in Google Scholar using Selenium (via Chrome) with Python. I envision entering a topic, searching for it on Google Scholar, then opening the link to each article/book in the results, extracting the abstract/summary, and printing it to the console (or saving it to a text file). This would be an easy way to judge how relevant each result is to what I'm writing.
Thus far, I am able to visit Google Scholar, enter text in the search bar, sort by date (newest to oldest), and extract each of the links in the results. I have not been able to write a loop that opens each article link and extracts the abstract (or other relevant text), as each result's page may be coded differently.
Kind regards,
- JP (Aotus_californicus)
This is my code so far:
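from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By


def get_results(search_term):
    """Search Google Scholar and return the result links, sorted by date."""
    url = 'https://scholar.google.com'
    # Selenium 4 style: the chromedriver path is passed through a Service object.
    service = Service(r'C:\Users\Aotuscalifornicus\Downloads\chromedriver_win32\chromedriver.exe')
    browser = webdriver.Chrome(service=service)
    try:
        browser.get(url)
        # Type the query into the search box and submit the form.
        search_bar = browser.find_element(By.ID, 'gs_hdr_tsi')
        search_bar.send_keys(search_term)
        search_bar.submit()
        # "Trier par date" is the "Sort by date" link in the French interface.
        browser.find_element(By.LINK_TEXT, 'Trier par date').click()
        # Each result title is an <a> inside an <h3>.
        results = []
        for link in browser.find_elements(By.XPATH, '//h3/a'):
            href = link.get_attribute('href')
            print(href)
            results.append(href)
        return results
    finally:
        # Close the browser even if one of the lookups above fails.
        browser.quit()


get_results('Primate thermoregulation')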
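
For the missing loop, here is a rough sketch of one possible approach, not a tested solution for every publisher. It assumes that many article pages expose their abstract in a `<meta>` tag (for example `citation_abstract` or `description`); that list of tag names is my assumption and will not cover every site. The names `ABSTRACT_META_NAMES`, `MetaAbstractParser`, `extract_abstract`, and `collect_abstracts` are made up for illustration, and `fetch_html` stands for any function that turns a URL into an HTML string.

```python
from html.parser import HTMLParser

# Meta tag names that some publishers use for an abstract or summary.
# This list is an assumption, not an exhaustive or guaranteed set.
ABSTRACT_META_NAMES = (
    "citation_abstract",
    "dc.description",
    "description",
    "og:description",
)


class MetaAbstractParser(HTMLParser):
    """Collect the content of <meta> tags whose name looks like an abstract."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Some sites use name="...", others property="..." (Open Graph).
        key = (attrs.get("name") or attrs.get("property") or "").lower()
        content = attrs.get("content")
        if key in ABSTRACT_META_NAMES and content and key not in self.found:
            self.found[key] = content.strip()


def extract_abstract(html):
    """Return the best available abstract-like text from a page, or None."""
    parser = MetaAbstractParser()
    parser.feed(html)
    for name in ABSTRACT_META_NAMES:  # earlier names are preferred
        if name in parser.found:
            return parser.found[name]
    return None


def collect_abstracts(links, fetch_html):
    """Map each link to its abstract; fetch_html is any callable url -> HTML."""
    abstracts = {}
    for link in links:
        try:
            abstracts[link] = extract_abstract(fetch_html(link))
        except Exception as exc:  # one bad page should not stop the loop
            abstracts[link] = None
            print(f"Skipping {link}: {exc}")
    return abstracts
```

With Selenium, `fetch_html` could be a small function that calls `browser.get(url)` and returns `browser.page_source`, provided the browser is still open while the loop runs. Pages that return `None` would need a site-specific fallback, such as looking for an element whose id or class contains "abstract".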