I would like to scrape titles, abstracts, claims, and inventor names from Google Patents and add them to an existing CSV file. Could you please help me with this? A sample of my code is as follows:
import requests
from bs4 import BeautifulSoup

# Create empty lists to store extracted information
claim_list = []

# Define a function to extract application number and claims from a URL and add them to the lists
def add_info_to_lists(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Extract claims
    claims = [claim.get_text(strip=True) for claim in soup.select("li.claim, li.claim-dependent")]
    if claims:
        claim_text = " ".join(claims)
        claim_list.append(claim_text)
    else:
        claim_list.append("N/A")
A similar snippet seems to work for plain strings (e.g. application numbers), but it does not work for the other JSON elements.
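For the "add this to an existing CSV file" step, a minimal sketch using only the standard `csv` module might look like the following. The helper name `add_column_to_csv`, its parameters, and the file paths are hypothetical, made up for illustration; it assumes the list holds one entry per CSV row, in the same order as the rows, and pads any missing entries with "N/A".

```python
import csv

def add_column_to_csv(in_path, out_path, column_name, values, fill="N/A"):
    """Copy in_path to out_path with one extra column appended to each row.

    Hypothetical helper: assumes values[i] belongs to the i-th data row.
    """
    # Read the existing rows and header
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + [column_name]
        rows = list(reader)

    # Write every row back out with the new column filled in
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for i, row in enumerate(rows):
            row[column_name] = values[i] if i < len(values) else fill
            writer.writerow(row)
```

With the snippet above, this would be called as something like `add_column_to_csv("patents.csv", "patents_with_claims.csv", "claims", claim_list)` once every URL has been processed; the same call could be repeated for titles, abstracts, and inventor names if those end up in their own lists.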
Thank you in advance!