
I have a data frame (control.sub) containing multiple columns (t1, t2, t3, t4, t5, t6). I want to merge all of these columns into a single column and remove the NA values.

> control.sub
                             t1                         t2                         t3                         t4
29   5500024017802120306174.H01 5500024017802120306174.G02 5500024017802120306174.E03 5500024017802120306174.D04
810  5500024030401071707292.H01 5500024030401071707292.G02 5500024030401071707292.E03 5500024030401071707292.D04
4693 5500024035736031208612.G08 5500024035736031208612.E09 5500024035736031208612.D10 5500024035736031208612.B11
                             t5                         t6
29   5500024017802120306174.B05 5500024017802120306174.A06
810  5500024030401071707292.B05 5500024030401071707292.A06
4693 5500024035736031208612.A12                       <NA>

I want the final outcome as:

> control.sub
                                 t1
    29   5500024017802120306174.H01 5500024017802120306174.G02 5500024017802120306174.E03 5500024017802120306174.D04
    810  5500024030401071707292.H01 5500024030401071707292.G02 5500024030401071707292.E03 5500024030401071707292.D04
    4693 5500024035736031208612.G08 5500024035736031208612.E09 5500024035736031208612.D10 5500024035736031208612.B11

       5500024017802120306174.B05 5500024017802120306174.A06
      5500024030401071707292.B05 5500024030401071707292.A06
     5500024035736031208612.A12

One column (t1) containing all values.

user3253470
  • See `?paste` if you are trying to concatenate strings – C8H10N4O2 Aug 19 '15 at 14:42
  • paste concatenates the columns I give it. I don't want to concatenate them; I want each value in a separate row: `control.sub$Mix <- paste(control.sub[,1], control.sub[,2], collapse=NULL, sep=" ")` gives `[1] "5500024017802120306174.H01 5500024017802120306174.G02" "5500024030401071707292.H01 5500024030401071707292.G02" [3] "5500024035736031208612.G08 5500024035736031208612.E09"` – user3253470 Aug 19 '15 at 14:51
  • you need to use `collapse=' '`, I will show you – C8H10N4O2 Aug 19 '15 at 14:59

2 Answers


A slightly more reproducible example:

df <- data.frame(t1 = c(letters[1:5],NA), t2 = c(NA, LETTERS[6:10]), 
                 t3 = c(11:12,NA,13:15), stringsAsFactors=FALSE)
df
#     t1   t2 t3
# 1    a <NA> 11
# 2    b    F 12
# 3    c    G NA
# 4    d    H 13
# 5    e    I 14
# 6 <NA>    J 15


df2 <- data.frame(t1 = apply(df, 1, function(s) paste(s[!is.na(s)], collapse=" ")) )

df2
#       t1
# 1   a 11
# 2 b F 12
# 3    c G
# 4 d H 13
# 5 e I 14
# 6   J 15

EDIT

I think the OP is looking for this, but his/her example is wrong:

unlist_not_na <- function(df){
  v <- unlist(df)
  v[!is.na(v)]
}
df3 <- data.frame(t1 = unlist_not_na(df))

df3
#     t1
# t11  a
# t12  b
# t13  c
# t14  d
# t15  e
# t22  F
# t23  G
# t24  H
# t25  I
# t26  J
# t31 11
# t32 12
# t34 13
# t35 14
# t36 15
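
If some columns are factors and others are not, `unlist` can return the factors' internal integer codes instead of their labels. A minimal sketch of a variant that converts every column to character first is below; `flatten_not_na`, `df_f` and `df4` are hypothetical names made up for illustration, and the data is invented:

```r
# Hypothetical helper (not part of the answer above): flatten a data frame
# column by column into one character vector and drop the NA values.
flatten_not_na <- function(df) {
  # as.character() yields factor labels rather than integer codes
  v <- unlist(lapply(df, as.character), use.names = FALSE)
  v[!is.na(v)]
}

# Made-up data: one factor column, one character column with a missing value
df_f <- data.frame(t1 = factor(c("EC2004030514AA", "EC2004041408AA")),
                   t2 = c("x", NA),
                   stringsAsFactors = FALSE)

df4 <- data.frame(t1 = flatten_not_na(df_f), stringsAsFactors = FALSE)
df4
```

This should give a three-row data frame with default row names, because `use.names = FALSE` drops the `t11`, `t12`, ... labels that `unlist` would otherwise attach.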
C8H10N4O2
  • The problem is that this concatenates the values of all columns. For example, for row 1 it generates `1 a 11`, while I want one value per row (row1 = 1, row2 = a, row3 = 11, and so on...) – user3253470 Aug 19 '15 at 15:06
  • @user3253470 Your example is wrong then because the row numbers stay the same: 29, 810, 4693. – C8H10N4O2 Aug 19 '15 at 15:08
  • You are right, there are only 3 rows, but I want as many rows as there are values. – user3253470 Aug 19 '15 at 15:14
  • @user3253470 see my edit -- but you should really work on asking the question well, and providing a minimally reproducible (and correct) example. – C8H10N4O2 Aug 19 '15 at 15:19
  • @user3253470 happy to help, please click the check mark next to the answer if you wish to accept it – C8H10N4O2 Aug 19 '15 at 15:22
  • If this is my control.sub: > control.sub t1 t2 t3 t4 t5 t6 507 EC2004030514AA NA 511 EC2004041408AA NA 569 EC2004070108AA NA Then after applying your proposed method, result I am getting is: > df3 t1 t11 37 t12 41 t13 55 whereas I want the result as : ) > df3 > t1 t11 EC2004030514AA t12 EC2004041408AA t13 EC2004070108AA Please guide me how I can get that ?? – user3253470 Aug 21 '15 at 10:02
  • Can you help me with this question: http://stackoverflow.com/questions/35484595/data-frame-merge-and-selection-of-values-which-are-common-in-2-data-frames – user3253470 Feb 18 '16 at 16:21

The following code works, but I don't know if anyone would consider it "optimal":

var <- as.vector(do.call('c',control.sub))

If possible, I would suggest going back to the point in your code where control.sub is generated and producing the desired output format there.

If your variables are factors (you can check this by running the following):

sapply(control.sub,class)

then you should first run:

control.sub <- lapply(control.sub, as.character)

EDIT: this is better:

 var <- unlist(control.sub)
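
To illustrate why the factor check matters, here is a small sketch on made-up data. `demo.sub` is a hypothetical stand-in for control.sub, and the values are invented; it only shows the difference between a factor's integer codes and its labels, then converts the factor columns before flattening:

```r
# Made-up stand-in for control.sub with factor columns (an assumption)
demo.sub <- data.frame(t1 = factor(c("A01", "B02")),
                       t2 = factor(c("C03", NA)))

# Which columns are factors?
is_fac <- vapply(demo.sub, is.factor, logical(1))

# A factor stores integer codes that index into its levels
codes  <- as.integer(demo.sub$t1)    # positions in levels(demo.sub$t1)
labels <- as.character(demo.sub$t1)  # the values you actually see printed

# Convert only the factor columns; demo.sub stays a data frame
demo.sub[is_fac] <- lapply(demo.sub[is_fac], as.character)

# Flatten column by column and drop the NA values
var <- unlist(demo.sub, use.names = FALSE)
var <- var[!is.na(var)]
```

Assigning into `demo.sub[is_fac]` keeps the object a data frame, so later code that expects a data frame or a list should still work.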
Biebiep
  • probably it is returning me the indexes of the values. I want the values not the indexes. > var [1] 133 123 123 123 123 91 161 151 151 151 151 113 249 225 225 225 225 NA – user3253470 Aug 19 '15 at 15:02
  • make sure your columns aren't factors and try my new solution too. – Biebiep Aug 19 '15 at 15:07
  • Still the same problem: > var <- as.vector(do.call('c',control.sub)) > var [1] 133 161 249 123 151 225 123 151 225 123 151 225 123 151 225 91 113 NA – user3253470 Aug 19 '15 at 15:08
  • I said to make sure your variables weren't factors first. – Biebiep Aug 19 '15 at 15:09
  • I am sorry I don't know how to check this thing...! – user3253470 Aug 19 '15 at 15:11
  • sapply(control.sub,class). If they are factors, just make them all character first: control.sub<-sapply(control.sub,as.character) – Biebiep Aug 19 '15 at 15:13
  • after converting them into characters, I am getting the following error:> var <- as.vector(do.call('c',control.sub)) Error in do.call("c", control.sub) : second argument must be a list – user3253470 Aug 19 '15 at 15:17
  • just wrap control.sub in as.list() – Biebiep Aug 19 '15 at 15:20