I am working in R to analyze a web page with a complex structure, and I want to extract the information contained in font tags. The problem is that the data in tables is also wrapped in font tags.
XPath examples:
text/div/font
table/tbody/tr/td/div/font
Since the structure is very complex, I cannot predict the exact XPath, so I am using //font to extract the relevant data. However, because the information in tables is also wrapped in font tags, I am getting results that are not relevant for my analysis.
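library(XML)  # provides xpathSApply() and xmlValue()
xpathCodefont <- "//font"
# htmlCode is the already-parsed HTML document
htmlCodeFonts <- xpathSApply(htmlCode, xpathCodefont, xmlValue)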
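To make the problem reproducible, here is a minimal sketch. The HTML string is made up for illustration (it is not the real page, which is far more complex), and it assumes the XML package is installed. It only shows that //font picks up the font inside the table as well as the one outside it.

```r
library(XML)

# Hypothetical HTML, invented for illustration: one font outside a table,
# one font nested several levels below a table.
html <- paste0(
  "<html><body>",
  "<div><font>Relevant text</font></div>",
  "<table><tbody><tr><td><div><font>Table cell</font></div></td></tr></tbody></table>",
  "</body></html>"
)

htmlCode <- htmlParse(html, asText = TRUE)

xpathCodefont <- "//font"
htmlCodeFonts <- xpathSApply(htmlCode, xpathCodefont, xmlValue)

# Both values come back, including the unwanted one from the table
print(htmlCodeFonts)
```

In this toy case I would want only "Relevant text" to be returned.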
Is there any syntax that lets me skip the font elements that come from a path containing a table? In other words, how can I exclude font elements that have a table as an ancestor (though not as the direct parent)?
Thanks in advance.