I would like to split one column into two within at data frame based on a delimiter. For example,
METHAMPHETAMINE | MORPHINE
THC
to became
METHAMPHETAMINE MORPHINE
THC
within a data frame.
Thanks!
I would like to split one column into two within at data frame based on a delimiter. For example,
METHAMPHETAMINE | MORPHINE
THC
to became
METHAMPHETAMINE MORPHINE
THC
within a data frame.
Thanks!
One option is to use gsub()
to identify and extract either side of the vertical pipe in your single column:
df <- data.frame(col=c('METHAMPHETAMINE | MORPHINE', 'THC'))
df$col1 <- gsub("(.*)\\s*\\|.*", "\\1", df$col)
df$col2 <- ifelse(grepl(".*\\|\\s*(.*)", df$col),
gsub(".*\\|\\s*(.*)", "\\1", df$col),
"")
df
col col1 col2
1 METHAMPHETAMINE | MORPHINE METHAMPHETAMINE MORPHINE
2 THC THC
Your question is something of a curve ball, because from your sample data we can see that not every row in the data frame may have a pipe with a second column. It is a feature of gsub()
to return the input string when no match is found, but in our case, we just want empty string if no pipe is present. One option here is to use grepl()
to first check if the pattern matches, and if not, then return empty string.
Possibly a duplicate of this question, though a new application of the answers given there would be required to answer your question.
> a1=read.csv("test.csv",header=F)
> a1
V1
1 METHAMPHETAMINE | MORPHINE
2 THC
> within(a1, FOO<-data.frame(do.call('rbind', strsplit(as.character(V1), '|', fixed=TRUE))))
V1 FOO.X1 FOO.X2
1 METHAMPHETAMINE | MORPHINE METHAMPHETAMINE MORPHINE
2 THC THC THC
from