I have the following piece of code, and it is too slow right now. How can I rewrite it to improve speed (in vectorized form, using apply functions, or in any other way)?

My data frame is called urban.

# columns whose names start with "b" followed by two digits
bcolumn.pattern <- '^b[0-9][0-9]'
bcolumn.index <- grep(bcolumn.pattern, names(urban))
bcolumn.nrow <- nrow(urban)

# replace every NaN in those columns with 0, one cell at a time
for (k in bcolumn.index) {
  for (l in 1:bcolumn.nrow) {
    if (is.nan(urban[l, ][, k])) {
      urban[l, ][, k] <- 0
    }
  }
}
Hamideh
  • If you want us to help optimize, we'll need a reproducible example. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Mar 28 '14 at 10:13
  • Your subsetting is confusing. Why don't you simply use `urban[l, k]`? That would be easier to read and faster, too. But see droopy's answer for better solutions. – Roland Mar 28 '14 at 10:32
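To make the two comments above concrete, here is a minimal reproducible sketch. The toy `urban` data frame and its column names (`id`, `b01`, `b02`, `other`) are assumptions made up for illustration, since the real data is not shown in the question. It keeps the original loop but uses the direct `urban[l, k]` indexing suggested in the comment.

```r
# Hypothetical toy data standing in for the real `urban` data frame.
urban <- data.frame(
  id    = 1:4,
  b01   = c(1, NaN, 3, 4),
  b02   = c(NaN, 2, NaN, 4),
  other = c(NaN, 1, 2, 3)   # not a b-column, so it should stay untouched
)

bcolumn.index <- grep("^b[0-9][0-9]", names(urban))

# Same loop as in the question, but with direct [row, column] indexing.
for (k in bcolumn.index) {
  for (l in seq_len(nrow(urban))) {
    if (is.nan(urban[l, k])) {
      urban[l, k] <- 0
    }
  }
}

print(urban)
```

This is still a cell-by-cell loop, so it is only a readability fix; the column-wise approaches in the answer avoid the inner loop altogether.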

1 Answer

One possibility:

# if urban is a data.frame
urban[, bcolumn.index] <- lapply(urban[, bcolumn.index], function(x) {x[is.nan(x)] <- 0; x})
# if urban is a matrix
urban[, bcolumn.index][is.nan(urban[, bcolumn.index])] <- 0
droopy
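A small reproducible sketch of the two variants in this answer may help when trying them out. The toy `urban` below and the names `urban.df` and `urban.mat` are hypothetical, chosen only for illustration. The data frame variant goes column by column because `is.nan()` operates on vectors and matrices rather than on a whole data frame.

```r
# Hypothetical toy data; the real `urban` is not shown in the question.
urban <- data.frame(
  id  = 1:3,
  b01 = c(NaN, 2, 3),
  b02 = c(1, NaN, NaN)
)
bcolumn.index <- grep("^b[0-9][0-9]", names(urban))

# Data frame version: replace NaN one whole column at a time.
urban.df <- urban
urban.df[, bcolumn.index] <- lapply(urban.df[, bcolumn.index],
                                    function(x) { x[is.nan(x)] <- 0; x })

# Matrix version: is.nan() handles the whole numeric block at once.
urban.mat <- as.matrix(urban)
urban.mat[, bcolumn.index][is.nan(urban.mat[, bcolumn.index])] <- 0

print(urban.df)
print(urban.mat)
```

Both variants touch each selected column once instead of visiting every cell, which is where the speed-up over the nested loop is expected to come from.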
  • Thanks @droopy, but it did not work. Are you sure about lapply? urban[,bcolumn.index] is a data frame, not a list. – Hamideh Mar 28 '14 at 11:25
  • It works with this example: x <- as.matrix(matrix(1:12,4)); x[1,2] <- NaN; x <- as.data.frame(x); x[,1:2] <- lapply(x[,1:2], function(u) {u[is.nan(u)] <- 0; u}). It could depend on bcolumn.index. If it contains only one column, then you should add drop=FALSE everywhere: urban[,bcolumn.index,drop=F] <- lapply(urban[,bcolumn.index,drop=F], ...). A data.frame is a kind of list. Try is.list(urban). – droopy Mar 28 '14 at 13:07
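The single-column case mentioned in the last comment can be sketched as follows. The toy `urban` and the helper names `single.dropped` and `single.kept` are assumptions for illustration only. As a simplification, this sketch passes `drop = FALSE` only on the right-hand side, where the column is extracted.

```r
# Hypothetical toy data with only one matching b-column.
urban <- data.frame(id = 1:3, b01 = c(NaN, 2, NaN))
bcolumn.index <- grep("^b[0-9][0-9]", names(urban))  # a single index

# A data frame is a list of columns, which is why lapply() can walk over it.
print(is.list(urban))

# With one column, the default extraction drops to a plain numeric vector,
# so lapply() would iterate over individual values instead of columns.
single.dropped <- urban[, bcolumn.index]

# drop = FALSE keeps a one-column data frame.
single.kept <- urban[, bcolumn.index, drop = FALSE]

urban[, bcolumn.index] <- lapply(single.kept,
                                 function(x) { x[is.nan(x)] <- 0; x })

print(urban)
```

List-style indexing, `urban[bcolumn.index]`, is another way to keep the data frame structure regardless of how many columns match.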