I am fairly new to R and can't get my head around the "apply" family of functions. I ran the following code:

attach(airquality)
mydata<-airquality
col_names1<-names(mydata)
mydata[,col_names1]<-lapply(mydata[,col_names1],factor)
str(mydata)
col_names2<-names(mydata)
mydata[,col_names2]<-sapply(mydata[,col_names2],factor)
str(mydata)

I see that `lapply` converts all the numeric variables to factors, but `sapply` does not. Why is that?

  • `sapply` returns a matrix, which can't contain values of mode factor, so the values are coerced to character. If you do `class(sapply(mydata[,col_names2],factor))` you'll see that `sapply` is returning a matrix. However, by assigning it back to the columns of your data frame, you end up with a data frame of character values. – eipi10 May 13 '16 at 16:13
  • A few other things: (1) don't use `attach`. It will come back to bite you if you work with multiple data frames. (2) you attach the `airquality` data frame, but you're working with `mydata`, which isn't attached. (3) if you're running `sapply` or `lapply` on the whole data frame, you don't need to refer to specific columns. You can just do `mydata <- sapply(mydata, ...)`. (4) after running `lapply` on `mydata`, you've converted all the columns of `mydata` to factor class. If you want to run `sapply` on `mydata` in its original form, you need to re-run `mydata<-airquality`. – eipi10 May 13 '16 at 16:18
  • `sapply` is `lapply` with some preset defaults. You can _always_ use `sapply`. – rawr May 13 '16 at 16:47
  • A great set of tutorials is available inside RStudio through the `swirl` package. Run `install.packages("swirl")`, then `require(swirl)`, then `swirl()`. It will ask a few questions and then show a list of tutorials. Go through the basic R tutorials; there are about 14 of them, and right in the middle is a series of interactive lessons on `lapply`, `tapply`, `sapply`, `mapply` and `apply`. – sconfluentus May 13 '16 at 20:32
  • Thank you very much, eipi10 and bethanyP. I have got it now. @eipi10: Could you elaborate on why we should not use `attach` when working with multiple data frames? – Venugopal Bukkala May 15 '16 at 19:08
  • Regarding `attach`, see [here](http://sas-and-r.blogspot.com/2011/05/to-attach-or-not-attach-that-is.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+SASandR+%28SAS+and+R%29), [here](http://stackoverflow.com/questions/10067680/why-is-it-not-advisable-to-use-attach-in-r-and-what-should-i-use-instead), and [here](https://books.google.com/books?id=S04BBAAAQBAJ&pg=PA76&lpg=PA76&dq=R+don%27t+use+attach&source=bl&ots=8v-kqeVLG6&sig=9EJjRQZyhSy4LgKNYe1-vWEsrnA&hl=en&sa=X&ved=0ahUKEwi0mrvl5dzMAhVC9GMKHTrWCRs4ChDoAQghMAE#v=onepage&q=R%20don%27t%20use%20attach&f=false). – eipi10 May 15 '16 at 19:18
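
For anyone who wants to check the behaviour described in the comments, here is a minimal sketch in base R. It assumes only the built-in `airquality` data set. The names `as_list`, `as_matrix` and `same_as_lapply` are made up for illustration and do not appear in the question. It is one way to see the difference, not the only way to do the conversion.

```r
# Work on a copy of the built-in airquality data; attach() is not needed.
mydata <- airquality

# lapply() always returns a list, so every element keeps its factor class.
as_list <- lapply(mydata, factor)
is.list(as_list)
is.factor(as_list$Month)

# sapply() simplifies equal-length results into a matrix.
# A matrix cannot hold factors, so the values end up as character strings.
as_matrix <- sapply(mydata, factor)
is.matrix(as_matrix)
typeof(as_matrix)

# Assigning the list from lapply() back with [] keeps the data frame
# structure and turns every column into a factor.
mydata[] <- lapply(mydata, factor)
str(mydata)

# With simplification switched off, sapply() returns the same list as lapply().
same_as_lapply <- sapply(airquality, factor, simplify = FALSE)
identical(same_as_lapply, lapply(airquality, factor))
```

The `mydata[] <- lapply(...)` form also covers the point about not needing column names when the whole data frame is converted.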
