
A column in my dataset looks like this:

Factor w/ 163305 levels "['032']","['A10', 'A11', 'B31']",..: 1 76209 134581 134581 75649 134581 84340 134871 74475 87044 ...

Is there a way to separate ['A10', 'A11', 'B31'] into three columns, each holding one of the letter-prefixed values?

Frank
R.L.
    Hi, welcome to SO. Please consider reading up on How to Ask and how to produce a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). It makes it easier for others to help you. – Heroka Feb 29 '16 at 20:36

1 Answer


Try:

# Data (I assume the values are separated by one comma, plus quotes, brackets and spaces)
x <- c("['032']", "['A10', 'A11', 'B31']")

library(stringr)

# Maximum number of values in one string: count the commas in each string and add 1.
# str_count() returns 0 for a string without commas, whereas the length of a
# gregexpr() result is 1 even when nothing matched.
mx <- max(str_count(x, ",")) + 1

# Drop everything except letters, digits and commas, then split into a matrix
# with one value per column; str_split_fixed takes the number of columns (mx here)
str_split_fixed(gsub("[^[:alnum:],]", "", x), ",", mx)
#      [,1]  [,2]  [,3] 
# [1,] "032" ""    ""   
# [2,] "A10" "A11" "B31"

Strings with fewer values than the maximum are padded with empty strings. If every string has only one value, mx is 1 and you get a one-column matrix.
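Since the question asks for columns in the dataset, here is a minimal sketch of the same idea in base R that binds the pieces back onto a data frame. The data frame `df`, the column name `codes`, and the new column names `code1`, `code2`, ... are hypothetical stand-ins for the real dataset, and missing values are filled with `NA` rather than empty strings.

```r
# Hypothetical data frame standing in for the real dataset; `codes` is a factor, as in the question
df <- data.frame(codes = factor(c("['032']", "['A10', 'A11', 'B31']")))

# Keep only letters, digits and commas (drops brackets, quotes and spaces)
cleaned <- gsub("[^[:alnum:],]", "", as.character(df$codes))

# Split on commas; each list element holds the values of one row
parts <- strsplit(cleaned, ",", fixed = TRUE)

# Pad every row with NA up to the longest row, then stack the rows into a matrix
n <- max(lengths(parts))
mat <- do.call(rbind, lapply(parts, function(p) p[seq_len(n)]))
colnames(mat) <- paste0("code", seq_len(n))

# Attach the new character columns next to the original column
df <- cbind(df, as.data.frame(mat, stringsAsFactors = FALSE))
```

With 163305 factor levels and many repeated values, it may be faster to run the cleaning and splitting on `levels(df$codes)` and index the result by `as.integer(df$codes)`, but that is an optimisation rather than a requirement.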

slamballais