Split data frame elements with semicolon in R

Question

I've tried to write a function, in base R, that replaces semicolon-containing elements in a data frame column with the split entries, which are placed at the bottom of the column. The idea is to use this function with apply so that the new entries are added whenever an element containing a semicolon is detected.

The main problem with my code is that it returns the exact same data frame without any additional values.

> df
rs2480711
rs74832092
rs4648658
rs4648659
rs61763535
rs28733941;rs67677371

>x
"rs28733941;rs67677371"

function(x){
  semiCols = length(unlist(strsplit(x, ";")))
  elementsRs = unlist(strsplit(x, ";"))
  if(semiCols>1){
    for(i in 1:semiCols){
      df = rbind(df, elementsRs[i])
    }
  }
}

I would also like to know how I can extend the code to split rows based on one column while leaving the values in all the other columns unchanged. For example, this

>df
0  rs61763535             T1
1  rs28733941;rs67677371  T2

will look like this

>df2
0  rs61763535             T1
1  rs28733941             T2
1  rs67677371             T2

@Sotos. Tried it but for some reason it deletes all entries with blank values, something I really don't wish for. — civy, Jul 19 '16 at 08:41
`splitstackshape::cSplit(df, 'V2', ';', 'long')` works for me — Sotos, Jul 19 '16 at 08:45
For some reason it doesn't work when put within function brackets. — civy, Jul 19 '16 at 10:51

user2100721 · Answer 1 · 2016-07-15T12:37:05.487

If I understood correctly, this will work:

unlist(strsplit(as.character(df$V1),split = ";"))
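As a minimal sketch of how that line behaves on the data from the question (the column name `V1` and the result name `df_split` are assumptions, since the question does not show them):

```r
# Sample data from the question; the column name V1 is an assumption.
df <- data.frame(
  V1 = c("rs2480711", "rs74832092", "rs4648658", "rs4648659",
         "rs61763535", "rs28733941;rs67677371"),
  stringsAsFactors = FALSE
)

# Split every entry on ";" and flatten the pieces into one character vector.
ids <- unlist(strsplit(as.character(df$V1), split = ";"))

# Store the result in a new data frame (hypothetical name df_split);
# the original df is not modified.
df_split <- data.frame(V1 = ids, stringsAsFactors = FALSE)
print(df_split)
```

Note that the pieces appear where the original entry was rather than at the bottom of the column. In this sample the semicolon entry is the last one, so the two placements coincide.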

If I have misunderstood the question, maybe you are looking for this instead:

apply(df,2,function(t) unlist(strsplit(as.character(t),split = ";")))
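For the multi-column case raised in the question, one possible base R approach is to repeat each row once per split piece and then overwrite the split column. The following is only a sketch: the helper name `split_rows` and the column names `V1`, `V2`, `V3` are hypothetical, and it is not part of the original answer.

```r
# Hypothetical helper: expand the rows of `data` so that each piece of
# column `col`, separated by `sep`, gets its own row. Other columns are copied.
split_rows <- function(data, col, sep = ";") {
  pieces <- strsplit(as.character(data[[col]]), split = sep, fixed = TRUE)
  n <- lengths(pieces)

  # strsplit() returns character(0) for an empty string; keep such rows
  # as a single blank entry instead of dropping them.
  empty <- n == 0
  pieces[empty] <- ""
  n[empty] <- 1L

  # Repeat row i n[i] times, then replace the split column with the pieces.
  out <- data[rep(seq_len(nrow(data)), times = n), , drop = FALSE]
  out[[col]] <- unlist(pieces)
  rownames(out) <- NULL
  out
}

# Example data modelled on the question; the column names are assumptions.
df <- data.frame(
  V1 = c(0, 1),
  V2 = c("rs61763535", "rs28733941;rs67677371"),
  V3 = c("T1", "T2"),
  stringsAsFactors = FALSE
)

df2 <- split_rows(df, "V2")
print(df2)
```

The helper returns a new data frame instead of modifying `df`. An assignment to `df` inside a function only changes a local copy, so the result has to be assigned by the caller, as in `df2 <- split_rows(df, "V2")`.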

edited Jul 15 '16 at 12:37

answered Jul 15 '16 at 08:52

user2100721


Thanks, saved my day! Do you know how I can extend this to a data frame with multiple columns? I want to keep all the other values the same, but create one new row for each semicolon-separated value. – civy Jul 15 '16 at 09:48
I get the error "dim(X) must have a positive length" when I apply it to a multi-column data frame. I'm interested in splitting only one column and copying the values of the other columns, so that the rows are identical apart from the split entries and no semicolon entries remain. – civy Jul 18 '16 at 12:37
I've edited my original post, sorry for the inconvenience. – civy Jul 19 '16 at 08:23
