I've tried to write a function in base R that replaces semicolon-containing elements in a data frame column with their split entries, appended to the bottom of the column. The idea is to use this function with apply so that the new entries are added whenever an entry containing a semicolon is detected.

The main problem with my code is that it returns the exact same data frame without any additional values.

> df
rs2480711
rs74832092
rs4648658
rs4648659
rs61763535
rs28733941;rs67677371

>x
"rs28733941;rs67677371"

function(x){
  semiCols = length(unlist(strsplit(x, ";")))
  elementsRs = unlist(strsplit(x, ";"))
  if(semiCols>1){
    for(i in 1:semiCols){
      df = rbind(df, elementsRs[i])
    }
  }
}

I would also like to know how I can extend the code to split rows based on one column's value while leaving all the other columns unchanged. For example, this

>df
0  rs61763535             T1
1  rs28733941;rs67677371  T2

will look like this

>df2
0  rs61763535             T1
1  rs28733941             T2
1  rs67677371             T2
civy

1 Answer

If I understood correctly, this will work:

unlist(strsplit(as.character(df$V1),split = ";"))
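As a rough illustration of what that line does, here is a minimal sketch on a small made-up data frame. The column name V1 and the sample values are assumptions taken from the question's printout, not from the asker's real data.

```r
# Hypothetical sample data mirroring the question; the column name V1 is assumed.
df <- data.frame(V1 = c("rs2480711", "rs74832092", "rs28733941;rs67677371"),
                 stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per entry;
# unlist() flattens that list into a single character vector.
ids <- unlist(strsplit(as.character(df$V1), split = ";"))

# Rebuild a one-column data frame from the flattened vector.
df_split <- data.frame(V1 = ids, stringsAsFactors = FALSE)
```

Note that the split pieces end up in the position of the original entry rather than at the bottom of the column; for a plain list of IDs that probably does not matter.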

I may not have understood you fully, but maybe you are looking for this:

apply(df,2,function(t) unlist(strsplit(as.character(t),split = ";")))
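For the multi-column case described in the question, one possible base R approach is to repeat each row once per split piece and then overwrite the split column. This is only a sketch under assumptions: the column names id and tag are invented for the example, and lengths() needs R 3.2.0 or later.

```r
# Hypothetical two-column data; the column names id and tag are assumptions.
df <- data.frame(id  = c("rs61763535", "rs28733941;rs67677371"),
                 tag = c("T1", "T2"),
                 stringsAsFactors = FALSE)

# One character vector of pieces per row.
pieces <- strsplit(as.character(df$id), split = ";")

# Repeat each row index once per piece, so the other columns are copied along.
row_idx <- rep(seq_len(nrow(df)), times = lengths(pieces))

# Expand the rows, then replace the split column with the individual pieces.
df2 <- df[row_idx, , drop = FALSE]
df2$id <- unlist(pieces)

# Reset row names, which otherwise carry suffixes for the duplicated rows.
rownames(df2) <- NULL
```

Here the row with two IDs becomes two rows that both keep T2, while the single-ID row is left as it was.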
user2100721
  • Thanks, saved my day! Do you know how can I possibly extend this assuming that I have multiple columns? I want to keep all the other values the same, but create n new rows where n is the number of semicolons. – civy Jul 15 '16 at 09:48
  • I get error "dim(X) must have a positive length" when I apply it to a multi-column data frame. I'm interested in splitting only one column and copy the values of the rest of them in order to have identical rows but without semicolon entries. – civy Jul 18 '16 at 12:37
  • I've edited my original post, sorry for the inconvenience. – civy Jul 19 '16 at 08:23