11

I have a matrix, data.mat, that looks like:

A B C D E  
45 43 45 65 23   
12 45 56 NA NA   
13 4  34 12 NA  

I am trying to turn this into a list of lists, where each row is one list within a bigger list. I do the following:

list <- tapply(data.mat,rep(1:nrow(data.mat),ncol(data.mat)),function(i)i)

which gives me a list of lists, with NAs included, such as:

$`1`  
 [1]  45 43 45 65 23  
$`2`  
 [1]  12 45 56 NA NA  
$`3`  
 [1]  13 4 34 12 NA  

But what I want is:

$`1`  
 [1]  45 43 45 65 23  
$`2`  
 [1]  12 45 56   
$`3`  
 [1]  13 4 34 12   

Is there a good way to remove the NAs either during the tapply call or after the fact?

MrFlick
Amberopolis
  • Don't use variable names like `list`, since `list` is already the name of a base R function. – Brouwer Sep 11 '14 at 00:43
  • Good point. I wouldn't normally--I was just trying to make it generic for the example. But that is good to remember, since I'm sloppy about naming conventions sometimes. – Amberopolis Sep 11 '14 at 02:13

3 Answers

26

Sure, you can use lapply like this:

> lapply(list, function(x) x[!is.na(x)])
$`1`
[1] 45 43 45 65 23

$`2`
[1] 12 45 56

$`3`
[1] 13  4 34 12
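
A self-contained sketch of the same idea, for anyone who wants to run it without first building the `tapply` result. The names `rows` and `cleaned` are illustrative and do not appear in the question; the values are copied from the example matrix.

```r
# Rebuild the question's per-row list by hand (values copied from the example).
rows <- list(
  `1` = c(45, 43, 45, 65, 23),
  `2` = c(12, 45, 56, NA, NA),
  `3` = c(13, 4, 34, 12, NA)
)

# Keep only the non-missing entries of each vector.
cleaned <- lapply(rows, function(x) x[!is.na(x)])

# Number of values kept per row.
lengths(cleaned)
```

Because `lapply` always returns a list, the result keeps one element per row even when a row has no NAs to drop.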
rsoren
9

Your sample data:

data.mat <- data.matrix(read.table(text = "A B C D E  
45 43 45 65 23   
12 45 56 NA NA   
13 4  34 12 NA ", header = TRUE))

To split by row:

row.list <- split(data.mat, row(data.mat))

To remove NAs:

Map(Filter, list(Negate(is.na)), row.list)

or

lapply(row.list, Filter, f = Negate(is.na))

Everything in one shot:

Map(Filter, list(Negate(is.na)), split(data.mat, row(data.mat)))
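
To see why `split(data.mat, row(data.mat))` groups the values by row, it may help to look at what `row()` returns. The sketch below rebuilds the example matrix with `matrix()` rather than `read.table()`; the names `m` and `result` are assumptions made for illustration.

```r
# The example matrix, entered column by column (matrix() fills column-major).
m <- matrix(c(45, 12, 13,
              43, 45, 4,
              45, 56, 34,
              65, NA, 12,
              23, NA, NA),
            nrow = 3, dimnames = list(NULL, c("A", "B", "C", "D", "E")))

# row(m) has the same shape as m; each cell holds its own row index.
row(m)

# split() treats m as a plain vector and groups the values by those indices.
row.list <- split(m, row(m))

# Filter(f, x) keeps the elements of x for which f returns TRUE.
result <- lapply(row.list, Filter, f = Negate(is.na))
```

Since `split()` works on the underlying vector, the column names are not carried over, which matches the unnamed vectors in the desired output.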
flodel
  • The benefit of this method is it outputs the same `list` format even if there are no `NA`s in the data. – tospig Feb 21 '15 at 03:36
4

You could do this:

apply(data.mat, 1, function(x) x[!is.na(x)])

Output:

[[1]]
 A  B  C  D  E 
45 43 45 65 23 

[[2]]
 A  B  C 
12 45 56 

[[3]]
 A  B  C  D 
13  4 34 12

If you don't want names:

apply(data.mat, 1, function(x) unname(x[!is.na(x)]))

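Note that `apply` simplifies its result to a matrix whenever every row keeps the same number of values (for example, when there are no NAs at all), so it does not always return a list. Splitting by row first and then filtering with `lapply` always returns a list:

lapply(split(data.mat, row(data.mat)), function(x) x[!is.na(x)])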
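
A minimal sketch of the case where `apply` does not return a list. The matrix `full` and the helper `drop_na` are hypothetical and only serve to show the behaviour.

```r
# Hypothetical matrix with no missing values at all.
full <- matrix(1:6, nrow = 2)

drop_na <- function(x) x[!is.na(x)]

# Every row keeps 3 values, so apply() simplifies to a 3 x 2 matrix
# (one column per original row) instead of returning a list.
simplified <- apply(full, 1, drop_na)
is.list(simplified)

# Splitting by row first and filtering with lapply() always gives a list.
as_list <- lapply(split(full, row(full)), drop_na)
is.list(as_list)
```

Code that later indexes the result with `[[` will behave differently on the matrix than on the list, so the list-returning form is the safer default when the number of NAs per row is not known in advance.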
Jota
  • Just beat me to it :) – rsoren Sep 11 '14 at 00:40
  • `lapply(split(data.mat, 1:nrow(data.mat)), function(x) unname(x[!is.na(x)]))` matches the "desired" output a bit more closely, but both contain the same information. – MrFlick Sep 11 '14 at 00:40
  • @MrFlick True. I feel no shame in taking the `unname` part to match that aspect better, but you may want to post yours as a separate answer. Or I could add it to mine if you aren't inclined. I have no shame. ;) – Jota Sep 11 '14 at 00:51
  • Personally, I think your specific shape is probably more useful and representative of the real data. I'm fine leaving it as a comment. – MrFlick Sep 11 '14 at 00:53
  • A problem with this approach is that it won't consistently return a list. If the number of NA was the same on each row... – flodel Sep 11 '14 at 01:07
  • @flodel Thanks for pointing this out. I added a fix for that case. – Jota Sep 11 '14 at 01:34
  • Without the `x` before `[!is.na(x)]` I got an error (unexpected `[`...). Should it in fact be there, so that you have `x[!is.na(x)]`? (I also see you deleted it in one of your edits.) – tospig Feb 19 '15 at 08:19
  • @tospig Yes, you need the `x` in front there. I'm not entirely sure what happened with the edits. I must have made a mistake. – Jota Feb 20 '15 at 03:08