
Sorry for this surely basic question, but I really couldn't find a clear answer:

I have a data frame with a column I'm trying to split at a fixed number of characters.

I'd previously been using:

data = cSplit(data, 'variable', sep="what to separate on", type.convert=FALSE)

...to split, but that causes issues if the separator occurs more than once in a value.

I want to split this names column after the first four characters (splitting on "." causes issues for names that contain more than one "."):

names
Mr. Joe Smith
Ms. Sarah A. Jones
Mr. Ryan P. Young
Ms. Chelsea White

Data

names <- c('Mr. Joe Smith', 'Ms. Sarah A. Jones', 'Mr. Ryan P. Young', 'Ms. Chelsea White')
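
To make the problem concrete, here is a minimal base R sketch. It uses `strsplit` rather than `cSplit` (a substitution made only so the example runs without extra packages), and `pieces` and `n_pieces` are illustrative names, not part of the original code. Splitting on every "." gives a different number of pieces per row:

```r
names <- c('Mr. Joe Smith', 'Ms. Sarah A. Jones', 'Mr. Ryan P. Young', 'Ms. Chelsea White')

# Split on every literal "." (fixed = TRUE turns off regex interpretation)
pieces <- strsplit(names, ".", fixed = TRUE)

# Count the pieces per name: names with a middle initial produce an extra piece
n_pieces <- lengths(pieces)
print(n_pieces)
```

Rows with a middle initial end up with one more piece than the others, which is why a separator-based split does not line up into two clean columns.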
  • If it's just the first four characters, maybe you can use `substr(x, 1, 4)` and `substr(x, 5, nchar(x))`? – Frank Apr 26 '16 at 18:26
  • @Frank Perfect. Works like a charm. Thanks for the simple tip for this R newbie! – Jim Apr 26 '16 at 18:30
  • Np, linking it to another question with a lot more alternatives. – Frank Apr 26 '16 at 18:32
  • this sounds like an xy problem. you can sub out the first `.` and split based on whatever you sub in `strsplit(sub('\\.\\s+', ';', names), ';')` – rawr Apr 26 '16 at 21:01
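
The two suggestions in the comments can be sketched side by side in base R. This is only an illustration under a few assumptions: every title is exactly three characters plus a space, no name contains ";", and the object names (`split_fixed`, `split_first`, and so on) are made up for this example.

```r
names <- c('Mr. Joe Smith', 'Ms. Sarah A. Jones', 'Mr. Ryan P. Young', 'Ms. Chelsea White')

# Fixed-width split (Frank's comment): characters 1-4 hold the title plus a space
title <- substr(names, 1, 4)
rest  <- substr(names, 5, nchar(names))
split_fixed <- data.frame(
  title = trimws(title),   # drop the trailing space from "Mr. "
  name  = rest,
  stringsAsFactors = FALSE
)

# First-separator split (rawr's comment): sub() replaces only the first match,
# so later "." characters (middle initials) are left alone
parts <- strsplit(sub('\\.\\s+', ';', names), ';')
split_first <- do.call(rbind, parts)   # character matrix: title, name

print(split_fixed)
print(split_first)
```

The fixed-width version keeps the "." in the title but depends on every title having the same length. The `sub` version drops the "." and handles titles of any length, provided the placeholder character never appears in the data.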
