How to separate a column

Question

The first column of my dataframe is a factor that contains two sets of information: the type of activation works (A1-4) and the month when it was carried out (about 50 observations in YYMM format). A simplified version could look like this:

A = data.frame(type.month=c("A1.1605", "A2.1605", "A1.1604", "A2.1604"), value=sample(1:4))

> A
  type.month value
1    A1.1605     2
2    A2.1605     4
3    A1.1604     1
4    A2.1604     3

I would like to get the types into one column and the months into another. I read that this can normally be done with the reshape2 package when the variables are neatly separated (e.g. the first half is only A1 and the second half is only A2). However, mine alternate (A1, A2, A1, ...) and each value contains two pieces of information (type and month). Is reshape2 still a good tool in this case, or should I think about something else?

My goal is to keep the four types of activation works and the months in one dataframe so that I do not have to store them in four different files.

You're going to want stringr. Check this out! http://stackoverflow.com/questions/4350440/split-a-column-of-a-data-frame-to-multiple-columns — Joy, Oct 14 '16 at 18:42
Joy, thanks, I'll check it out! @Frank, hope the title is better now. — babesz, Oct 14 '16 at 18:46

Wietze314 · Accepted Answer · 2016-10-14T20:31:30.777

2

This splits the column using the separate function from tidyr:

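A = data.frame(type.month=c("A1.1605", "A2.1605", "A1.1604", "A2.1604"), value=sample(1:4))

library(dplyr)  # provides %>%, arrange() and desc()
library(tidyr)  # provides separate()
# split type.month at the period, then sort by type and by month (newest first)
A %>% separate(type.month, c('type','month')) %>% arrange(type, desc(month))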

gives (the value column differs from the question because sample() is random):

type  month      value
A1    1605       4
A1    1604       2
A2    1605       1
A2    1604       3
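As a small variation on the call above, here is a sketch that names the separator explicitly instead of relying on the default of separate(), which splits on any run of non-alphanumeric characters. The fixed value vector and the name B are assumptions made for reproducibility; they are not part of the original answer.

```r
library(tidyr)

# Fixed values instead of sample(1:4), so the result is reproducible (assumption)
A <- data.frame(type.month = c("A1.1605", "A2.1605", "A1.1604", "A2.1604"),
                value = c(2, 4, 1, 3))

# sep is treated as a regular expression, so the literal period is escaped
B <- separate(A, type.month, into = c("type", "month"), sep = "\\.")

# type and month are character columns; type.month is dropped by default
str(B)
```

Keeping month as character preserves the YYMM text exactly; convert it with as.integer() only if numeric sorting or arithmetic is needed.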

edited Oct 14 '16 at 20:31

answered Oct 14 '16 at 18:47

Wietze314


This works, thank you! Just wondering if there is any way to reorder the dataframe like this: – babesz Oct 14 '16 at 19:38
type month value A1 1605 2 A1 1604 1 A2 1605 4 A2 1604 3 – babesz Oct 14 '16 at 19:39
sorry, my above comment looks ugly. Basically I would like to order the df by type, not month. Is it possible to set that in `tidyr`? – babesz Oct 14 '16 at 19:43
found it: `Ord1=order(A$`, `A[Ord1,]` #I love this forum, thank you! – babesz Oct 14 '16 at 19:59
very elegant, thanks! – babesz Oct 14 '16 at 20:47
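For the ordering asked about in these comments, one possible base R approach is sketched below. It is not necessarily the call the commenter ended up with, since that comment is cut off. The dataframe B and its fixed values are assumptions that mirror the already separated data.

```r
# Already separated data with fixed values (assumption, for reproducibility)
B <- data.frame(type = c("A1", "A2", "A1", "A2"),
                month = c("1605", "1605", "1604", "1604"),
                value = c(2, 4, 1, 3))

# order() sorts by type first; ties are broken by month, newest first.
# month is character, so it is converted to integer before negating it.
ord <- order(B$type, -as.integer(B$month))
B_sorted <- B[ord, ]

print(B_sorted)
```

This is the base R counterpart of arrange(type, desc(month)) in the answer above.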

score 0 · Answer 2 · edited Oct 14 '16 at 18:52


Drat, I forgot to mention regular expressions. The split pattern is a regular expression, so you'll have to escape the period like this:

library(stringr)
str_split_fixed(A$type.month, "\\.", 2)
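str_split_fixed() returns a character matrix rather than new dataframe columns, so the pieces still have to be attached by hand. A minimal sketch of that step follows; the fixed value vector and the name parts are assumptions, not part of the original answer.

```r
library(stringr)

# Fixed values instead of sample(1:4), so the result is reproducible (assumption)
A <- data.frame(type.month = c("A1.1605", "A2.1605", "A1.1604", "A2.1604"),
                value = c(2, 4, 1, 3))

# as.character() guards against type.month being stored as a factor
parts <- str_split_fixed(as.character(A$type.month), "\\.", 2)

# parts is a character matrix with one column per piece
A$type <- parts[, 1]
A$month <- parts[, 2]
A$type.month <- NULL  # drop the combined column

print(A)
```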

edited Oct 14 '16 at 18:52

Jaap


answered Oct 14 '16 at 18:48

Joy

