I have the following regex pattern:
pattern <- "\\d+(?:[.,]\\d+)*\\s*mu\\b|\\b(?:kinetin|zeatin)\\b"
I also have a function that finds matches in a data frame column. The function was given to me, so it may not be well suited to my pattern:
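library(stringr)

find.all.matches <- function(search.col, pattern) {
  # One matrix of matches per element of search.col
  captured <- str_match_all(search.col, pattern = pattern)
  t <- lapply(captured, str_trim)
  # Removes every character that is not a lowercase letter
  t2 <- lapply(t, function(x) gsub("[^a-z]", "", x))
  t3 <- sapply(t2, unique)
  # Collapse each set of matches into one comma-separated string
  t4 <- lapply(t3, toString)
  found.col <- unlist(t4)
  return(found.col)
}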
To find matches, I run the code below and then add the result to the data frame as a new column:
testing <- find.all.matches(search.col = all_data_1gen$abstract_l,
                            pattern = pattern)
all_data_1gen$testing_matches <- testing
The result contains essentially only "mu", "kinetin" and "zeatin", but on regex101 (https://regex101.com/r/77oi0A/1) the same pattern also captures the amounts, such as "4.5 mu", which is what I need. Is the problem in the function or in the pattern, and how would I fix it?
Here's my string:
high-frequency somatic embryogenesis was achieved from an embryogenic cell suspension culture of acanthopanax koreanum nakai. stem segments were cultured on murashige and skoog (ms) medium containing auxins and cytokinins. opaque and friable embryogenic callus formed on ms medium with 4.5 mu m 2,4-dichlorophenoxyacetic acid (2,4-d) and 2.0 mu m kinetin or zeatin, but was highest on medium containing 4.5 mu m 2,4-d alone.
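To narrow this down, here is a minimal sketch that runs the pattern on a shortened piece of the string above, once with the letter-only cleaning step from find.all.matches() and once without it. It uses base R (regmatches and gregexpr with perl = TRUE) in place of str_match_all, and the names x, raw_matches, stripped and kept are made up for this illustration. If it behaves the same way on the full data, the gsub("[^a-z]", "", x) step would appear to be what discards the amounts, not the pattern.

```r
# Shortened input taken from the abstract above (an assumption for illustration)
x <- "ms medium with 4.5 mu m 2,4-dichlorophenoxyacetic acid (2,4-d) and 2.0 mu m kinetin or zeatin"
pattern <- "\\d+(?:[.,]\\d+)*\\s*mu\\b|\\b(?:kinetin|zeatin)\\b"

# Extract every match of the pattern; perl = TRUE enables PCRE syntax
raw_matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]

# Cleaning step as in find.all.matches(): keep only the characters a-z
stripped <- unique(gsub("[^a-z]", "", raw_matches))

# Same pipeline without that step: trim whitespace, keep digits and dots
kept <- unique(trimws(raw_matches))

print(raw_matches)
print(toString(stripped))
print(toString(kept))
```

In this sketch the raw matches already include "4.5 mu" and "2.0 mu", and only the version without the gsub call keeps them, so dropping that line (or replacing it with plain whitespace trimming) in find.all.matches() is one possible fix to try.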