1

I would like to split one column into two within at data frame based on a delimiter. For example,

METHAMPHETAMINE | MORPHINE
THC

to became

METHAMPHETAMINE  MORPHINE
THC

within a data frame.

Thanks!

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360

2 Answers2

1

One option is to use gsub() to identify and extract either side of the vertical pipe in your single column:

df <- data.frame(col=c('METHAMPHETAMINE | MORPHINE', 'THC'))
df$col1 <- gsub("(.*)\\s*\\|.*", "\\1", df$col)
df$col2 <- ifelse(grepl(".*\\|\\s*(.*)", df$col),
                  gsub(".*\\|\\s*(.*)", "\\1", df$col), 
                  "")

df
                         col             col1      col2
1 METHAMPHETAMINE | MORPHINE  METHAMPHETAMINE  MORPHINE
2                        THC              THC

Demo

Your question is something of a curve ball, because from your sample data we can see that not every row in the data frame may have a pipe with a second column. It is a feature of gsub() to return the input string when no match is found, but in our case, we just want empty string if no pipe is present. One option here is to use grepl() to first check if the pattern matches, and if not, then return empty string.

Possibly a duplicate of this question, though a new application of the answers given there would be required to answer your question.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0
> a1=read.csv("test.csv",header=F)
> a1
                          V1
1 METHAMPHETAMINE | MORPHINE
2                        THC


> within(a1, FOO<-data.frame(do.call('rbind', strsplit(as.character(V1), '|', fixed=TRUE))))
                          V1           FOO.X1    FOO.X2
1 METHAMPHETAMINE | MORPHINE METHAMPHETAMINE   MORPHINE
2                        THC              THC       THC

from

Split column at delimiter in data frame

Ajay Ohri
  • 3,382
  • 3
  • 30
  • 60