I would like to identify nodes via text()
that contain text with umlauts.
library(xml2)
library(rvest)
doc <- "<p>Über uns </p>" %>% xml2::read_html()
grepl(pattern = "Über uns", x = as.character(doc))
grepl(pattern = "Über uns", x = doc)
Question:
How can I extract the node containing the text "Über uns"?
What I tried:
https://forum.fhem.de/index.php?topic=96254.0
Java XPath umlaut/vowel parsing
# does not work
xp <- paste0("//*[contains(text(), 'Über uns')]")
html_nodes(x = doc, xpath = xp)
# does not work
xp <- paste0("//*[translate(text(), 'Ü', 'U') = 'Uber uns']")
html_nodes(x = doc, xpath = xp)
# This works, but I wonder whether there is a pure XPath solution
doc2 <- doc %>%
  as.character() %>%
  gsub(pattern = "Ü", replacement = "Ue") %>%
  xml2::read_html()
xp <- paste0("//*[contains(text(), 'Ueber uns')]")
html_nodes(x = doc2, xpath = xp)