I have searched around for the past few days and I see that in XPath 2.0 you can use the 'except' operator, but I haven't been able to figure out how xml2 can handle this.
This link is sort of what I want to do, but this is specific to XPath, and I'm trying to do a blanket exclusion of a node like in this SO answer.
For example, my test document is a `.docx`, which I unzip and read. It has body text and a table. I want to read all the body text, except anything in a table. I can read both, but I can't figure out how to exclude all the `w:tbl`. Any `not` or `except` operators don't seem to work.
With `xml_find_all`, it scrapes anything within those nodes, without exception.
bodytext <- xml2::xml_find_all(doc, "//w:p")
tabletext <- xml2::xml_find_all(doc, "//w:tbl")
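For reference, here is a minimal sketch of the kind of exclusion I am after. It assumes that xml2 only evaluates XPath 1.0 (it wraps libxml2), so instead of `except` it uses a `not(ancestor::...)` predicate. The inline document is a made-up stand-in for the unzipped `word/document.xml`, not my real file, and I am not sure this is the best way to do it.

```r
library(xml2)

# Hypothetical stand-in for word/document.xml: one body paragraph
# and one table whose cell contains its own paragraph.
doc <- read_xml('<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>Body paragraph</w:t></w:r></w:p>
    <w:tbl><w:tr><w:tc><w:p><w:r><w:t>Cell text</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
  </w:body>
</w:document>')

# Select every w:p, including the one nested inside the table.
allparas <- xml_find_all(doc, "//w:p")

# XPath 1.0 predicate: keep only paragraphs with no w:tbl ancestor.
bodytext <- xml_find_all(doc, "//w:p[not(ancestor::w:tbl)]")

xml_text(bodytext)
```

On this toy input the predicate should drop the paragraph that sits inside the table cell while keeping the plain body paragraph.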