Split a column in two in R

Question

My df looks like this:

Time
Week End 07-01-10
Week End 07-02-10

I want it as

Column        Time
Week End   07-01-10
Week End   07-02-10

From searching, the stringr package looks like it would be useful, but I am unable to use it correctly because each value contains two spaces.

Your pasted df looks like it only has one space. What code have you tried? — Nate, Jul 19 '16 at 15:02
There are two spaces; one after **Week** and the other after **End**. I did not try any code as of now — Vinay billa, Jul 19 '16 at 15:08

Psidom · Accepted Answer · 2016-07-19T15:31:26.660

2

You can use extract from the tidyr package, which splits a column using a regular expression with capture groups:

library(tidyr)
extract(df, Time, into = c("Column", "Time"), "(.*)\\s(\\S+)")
#     Column     Time
# 1 Week End 07-01-10
# 2 Week End 07-02-10

The pattern (.*)\\s(\\S+) captures two groups. The greedy (.*) takes everything up to the last whitespace character, and (\\S+) takes the trailing run of non-space characters, so the split happens at the last space.
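To see what each capture group picks up, here is a minimal base-R sketch that applies the same pattern with regexec and regmatches instead of tidyr. The vector x is just the sample data from the question, and the variable names are chosen for illustration.

```r
# Sample values from the question
x <- c("Week End 07-01-10", "Week End 07-02-10")

# regexec returns the full match plus one entry per capture group
m <- regmatches(x, regexec("(.*)\\s(\\S+)", x, perl = TRUE))

# One row per input string: full match, group 1, group 2
parts <- do.call(rbind, m)

group1 <- parts[, 2]  # everything before the last space
group2 <- parts[, 3]  # trailing run of non-space characters
```

Because (.*) is greedy, the first space (between Week and End) stays inside group 1.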

If you want to use the stringr package, the str_match function does the same thing with the same pattern:

stringr::str_match(df$Time, "(.*)\\s(\\S+)")[, 2:3]
#      [,1]       [,2]      
# [1,] "Week End" "07-01-10"
# [2,] "Week End" "07-02-10"

strsplit also works if you split on the space that comes right before a digit. Here (?=...) is a lookahead, which requires perl = TRUE, and \\d is an abbreviation for a digit, equivalent to [0-9]:

do.call(rbind, strsplit(df$Time, "\\s(?=\\d)", perl = TRUE))
#      [,1]       [,2]      
# [1,] "Week End" "07-01-10"
# [2,] "Week End" "07-02-10"
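If the data frame has other columns, the matrix returned above can be assigned back column by column. The following is a sketch under assumptions: the Sales column is hypothetical and only stands in for "other columns", and Time is assumed to be a character column (not a factor).

```r
# Hypothetical data frame: Sales stands in for any additional columns
df <- data.frame(Time = c("Week End 07-01-10", "Week End 07-02-10"),
                 Sales = c(10, 20),
                 stringsAsFactors = FALSE)

# Split on the space that is immediately followed by a digit
parts <- do.call(rbind, strsplit(df$Time, "\\s(?=\\d)", perl = TRUE))

df$Column <- parts[, 1]  # label part, e.g. "Week End"
df$Time   <- parts[, 2]  # date part, e.g. "07-01-10"

# Reorder so Column comes first; Sales is left unchanged
df <- df[, c("Column", "Time", "Sales")]
```

This only touches Time and the new Column; every other column keeps its values and type.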

edited Jul 19 '16 at 15:31

answered Jul 19 '16 at 15:08

Thanks a lot Psidom, but what do I need to do if there are more columns in the dataframe and I want to create a dataframe with just a change to this column ? – Vinay billa Jul 19 '16 at 15:24
Use the first version, `extract` from tidyr package, it should leave other columns untouched. – Psidom Jul 19 '16 at 15:26
It throws the following error **Error: could not find function "extract"** – Vinay billa Jul 19 '16 at 15:29
You need to load the `tidyr` package. Updated the answer to make it more obvious. And also try to reinstall your tidyr package, it might be your package is too old. – Psidom Jul 19 '16 at 15:31

score 1 · Answer 2 · answered Jul 19 '16 at 17:43

We can use read.table from base R. No packages are needed.

read.table(text = sub("\\s+(\\S+)$", ",\\1", df$Time), header = FALSE,
           col.names = c("Column", "Time"), stringsAsFactors = FALSE, sep = ",")
#    Column     Time
#1 Week End 07-01-10
#2 Week End 07-02-10
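The one-liner above works in two steps, which this sketch separates: sub replaces the last run of whitespace with a comma, and read.table then parses the result as comma-separated text. The names x, csv_lines and out are illustrative. The approach assumes the values contain no commas of their own.

```r
# Sample values from the question
x <- c("Week End 07-01-10", "Week End 07-02-10")

# Step 1: turn the last whitespace run into a comma
csv_lines <- sub("\\s+(\\S+)$", ",\\1", x)

# Step 2: parse the comma-separated lines into two columns
out <- read.table(text = csv_lines, header = FALSE, sep = ",",
                  col.names = c("Column", "Time"),
                  stringsAsFactors = FALSE)
```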

score 0 · Answer 3 · answered Jul 19 '16 at 15:25

Here is a base-R solution.

df <- data.frame(Time = c("Week End 07-01-10", "Week End 07-02-10"),
                 stringsAsFactors = FALSE)

# Assumes every value ends with a date in the same 8-character format (e.g. 07-01-10).
n <- nchar(df$Time)
df$Column <- substring(df$Time, 1, n - 9)  # text before the last space
df$Time <- substring(df$Time, n - 7, n)    # last 8 characters, without the leading space
df <- df[, c("Column", "Time")]; df        # put Column first
