I have a character vector chars:
chars <- c("check24 smavey dr klein", "smava", "check24, interhyp",
"verivox check24 dr. klein", "dr. klein", NA, "dr. weber",
"dr. klein,", NA, "check24 verivox")
The goal is to replace the whitespace between two words with "_" when all of the following conditions hold:
- There is no comma between the words (e.g. "Name1, Name2 Name3" should become "Name1, Name2_Name3").
- There is no period between them (e.g. "Dr. Name1 Name2 Name3" should become "Dr. Name1_Name2_Name3").
- The character sequences on both sides of the whitespace are at least 4 characters long (e.g. "AAA AAAA AAAA AAAA" should become "AAA AAAA_AAAA_AAAA").
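One possible reading of these three conditions is a check on each individual whitespace rather than on the whole string. The sketch below expresses that reading with lookarounds in base R. The name `join_long_words` is hypothetical, and the sketch assumes that "at least 4 characters" means four characters that are not whitespace, commas, or periods directly on each side of the gap.

```r
# Hypothetical helper (not from the original post): replace a whitespace run
# with "_" only when the four characters directly before it and the four
# directly after it contain no whitespace, comma, or period.
join_long_words <- function(x) {
  gsub("(?<=[^\\s,.]{4})\\s+(?=[^\\s,.]{4})", "_", x, perl = TRUE)
}

examples <- c("Name1, Name2 Name3",
              "Dr. Name1 Name2 Name3",
              "AAA AAAA AAAA AAAA")
join_long_words(examples)
#> [1] "Name1, Name2_Name3"    "Dr. Name1_Name2_Name3" "AAA AAAA_AAAA_AAAA"
```

Since `gsub()` propagates missing values, `NA` entries should stay `NA` under this approach.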
I tried using this function:
library(stringr)
f = function(x) {
  ifelse(grepl(".{4} .{4}", x) & !grepl(",|[A-z]{2}/. ", x),
         str_replace_all(x, "\\s+", "_"),
         x)
}
f(chars)
#> [1] "check24_smavey_dr_klein" "smava" "check24, interhyp" "verivox_check24_dr._klein"
#> [5] "dr. klein" NA "dr. weber" "dr. klein,"
#> [9] NA "check24_verivox"
The problem is that I can't apply the conditions to each whitespace separately within a sequence of words (e.g. elements [1] and [4], where every whitespace gets replaced).
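To illustrate the behaviour, the condition inside `ifelse()` yields a single TRUE/FALSE per string, so a string is either rewritten entirely or left alone. This is a minimal sketch that uses base R's `gsub()` in place of `str_replace_all()`, a swap made here only to keep the snippet free of dependencies. The names `x` and `per_string` are illustrative.

```r
x <- c("check24 smavey dr klein", "verivox check24 dr. klein")

# The test produces one logical value per string, not one per whitespace
per_string <- grepl(".{4} .{4}", x) & !grepl(",|[A-z]{2}/. ", x)
per_string
#> [1] TRUE TRUE

# Because the decision is made per string, every whitespace in a selected
# string is replaced, including the ones next to "dr" and "dr."
ifelse(per_string, gsub("\\s+", "_", x), x)
#> [1] "check24_smavey_dr_klein"   "verivox_check24_dr._klein"
```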
Any idea how to do this?