-1

Heyho, I want to split my dataframe which looks like:

ID  name1_attr1  name1_attr2  name2_attr2  ...
1   2            3            1
2   1            3            4
3   3            4            2
4   6            7            5

into:

ID  name  attr1  attr2
1   1     2      3 
2   1     1      3
3   1     3      4
4   1     6      7
1   2            1
2   2            4
3   2            2
4   2            5

I am really not sure how to do that? Do you have any hint or start for me? Thanks in advance :)

waterline
  • 67
  • 1
  • 6
  • I think one of the column names is not correct. The last column name would be `name2_attr2` in the input dataset to match with the output in your post – akrun Jan 10 '18 at 10:44

2 Answers2

3

We can use melt from data.table which can take multiple patterns

library(data.table)
melt(setDT(df),measure = patterns("attr1", "attr2"), 
         value.name = c("attr1", "attr2"), variable.name = "name")
#   ID name attr1 attr2
#1:  1    1     2     3
#2:  2    1     1     3
#3:  3    1     3     4
#4:  4    1     6     7
#5:  1    2    NA     1
#6:  2    2    NA     4
#7:  3    2    NA     2
#8:  4    2    NA     5

data

df <- structure(list(ID = 1:4, name1_attr1 = c(2L, 1L, 3L, 6L), name1_attr2 = c(3L, 
 3L, 4L, 7L), name2_attr2 = c(1L, 4L, 2L, 5L)), .Names = c("ID", 
 "name1_attr1", "name1_attr2", "name2_attr2"), class = "data.frame", row.names = c(NA, 
 -4L))
akrun
  • 874,273
  • 37
  • 540
  • 662
2

You can try a tidyverse solution and using a combination of gather and spread.

d <- read.table(text="ID  name1_attr1  name1_attr2  name2_attr2
1   2            3            1
                2   1            3            4
                3   3            4            2
                4   6            7            5", header=T)
library(tidyverse)
d %>% 
  gather(k, v, -ID) %>% 
  separate(k, c("name","b"), sep =  "_") %>% 
  spread(b, v, fill = "") %>% 
  arrange(name)
  ID  name attr1 attr2
1  1 name1     2     3
2  2 name1     1     3
3  3 name1     3     4
4  4 name1     6     7
5  1 name2           1
6  2 name2           4
7  3 name2           2
8  4 name2           5
Roman
  • 17,008
  • 3
  • 36
  • 49