I am looking to scrape some data from a chemical database using R, mainly name, CAS Number, and molecular weight for now. However, I am having trouble getting rvest to extract the information I'm looking for. This is the code I have so far:
library(rvest)
library(magrittr)
# Read HTML code from website
# I am using this format because I ultimately hope to pull specific items from several different websites
webpage <- read_html(paste0("https://pubchem.ncbi.nlm.nih.gov/compound/", 1))
# Use CSS selectors to scrape the chemical name
chem_name_html <- webpage %>%
  html_nodes(".short .breakword") %>%
  html_text()
# html_text() above already returns a character vector,
# so no second conversion is needed
chem_name_data <- chem_name_html
However, when I try to create chem_name_html, R only returns character (empty). I am using SelectorGadget to get the HTML node, but I noticed that SelectorGadget gives me a different node than the Inspector in Google Chrome does. I have tried both ".short .breakword" and ".summary-title short .breakword" in that line of code, but neither gives me what I am looking for.
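To narrow this down, here is a minimal offline sketch of how these selectors behave. The HTML snippet is made up for illustration: its class names mirror the selectors above but are not taken from the real PubChem markup, and the variable names (static_html, sel_descendant, and so on) are hypothetical. It also shows what rvest returns when the node is simply absent from the downloaded HTML, which is what would happen if the page builds its content with JavaScript after loading (an assumption here, not something verified in this post).

```r
library(rvest)

# Hypothetical static HTML standing in for a page; NOT the real PubChem markup.
static_html <- '<html><body>
  <div class="summary-title short"><span class="breakword">Example name</span></div>
</body></html>'
page <- read_html(static_html)

# Descendant selector: .breakword inside any element with class "short"
sel_descendant <- page %>%
  html_nodes(".short .breakword") %>%
  html_text()

# "short" without a leading dot is read as a <short> tag name,
# so this selector matches nothing in the snippet
sel_missing_dot <- page %>%
  html_nodes(".summary-title short .breakword") %>%
  html_text()

# Two classes on the same element are chained without a space
sel_compound <- page %>%
  html_nodes(".summary-title.short .breakword") %>%
  html_text()

# A page whose body is filled in later by JavaScript has no such node in the
# downloaded HTML, so the same selector yields an empty character vector
empty_page <- read_html("<html><body><div id='app'></div></body></html>")
sel_empty <- empty_page %>%
  html_nodes(".short .breakword") %>%
  html_text()

length(sel_empty)
```

If the real page behaves like empty_page, no CSS selector will help, because rvest only sees the HTML that the server sends and never runs the page's scripts.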