Thanks for all your advice. My remaining question is this: can I replace the column name 'sulfate' in the statement `dataclean <- datatable$sulfate[!datanas]` with a reference to a parameter `pollutant`, which may or may not have the value 'sulfate'?

Jon M
  • Try `datanas <- is.na(datatable[,pollutant])` for the third line. – Adam Quek May 23 '17 at 07:17
  • And what do you mean trying to retrieve parameter values? – Adam Quek May 23 '17 at 07:18
  • You need to use strings. Define your function arguments as strings (in " ") and then use the string directly in `setwd()` and in the data table use `datatable[, pollutant]`. – LAP May 23 '17 at 07:18
  • Where do you get your `datatable` from? You have to read it, e.g. with `read.table(...)` (or similar function) or you have to load it from a RData file. Similar question (at least with the same parameters): https://stackoverflow.com/questions/29018579/r-code-pollutant-mean-is-producing-nans – jogo May 23 '17 at 07:18

1 Answer

When you pass values to arguments, they behave as if they were objects in your workspace. The environment they live in, however, is not the workspace but that of the function.

So in your case, directory would be a character string and it would work, but only the first time. Your working directory is now changed, and you need to revert to the previous one for the function to work again. This can get pretty messy, so what I like to do is refer to raw files by their full path. See ?list.files for more info.
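As a minimal sketch of the full-path approach, the snippet below reads every CSV file in a directory without calling `setwd()`. The temporary directory, the file names, and the column values are made up for illustration; only `list.files()` with `full.names = TRUE` is the point.

```r
# Hypothetical setup: a temporary directory with two small CSV files,
# standing in for the real data directory.
directory <- file.path(tempdir(), "specdata_demo")
dir.create(directory, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, NA), nitrate = c(0.5, 0.7)),
          file.path(directory, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(3, 4), nitrate = c(NA, 0.9)),
          file.path(directory, "002.csv"), row.names = FALSE)

# Remember the working directory so we can confirm it is never changed.
wd_before <- getwd()

# full.names = TRUE returns paths that already include the directory,
# so read.csv() can open them from anywhere.
files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
datatable <- do.call(rbind, lapply(files, read.csv))
```

Because the working directory is never touched, the same code can be run repeatedly without having to restore anything afterwards.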

For your second question, the best way to refer to a column whose name is stored in a variable is

x[, pollutant]

It is convenient to add the `drop = FALSE` argument there, to keep the result as a data.frame (which is what I'm assuming `x` is).
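A small sketch of the difference, using a made-up data.frame (the values are illustrative only). Note that `$` does not evaluate its argument, so `datatable$pollutant` looks for a column literally named "pollutant".

```r
# Hypothetical data standing in for the asker's datatable.
datatable <- data.frame(sulfate = c(1.2, NA, 3.4),
                        nitrate = c(NA, 0.5, 0.7))
pollutant <- "nitrate"

# `$` takes the name literally: there is no column called "pollutant".
literal <- datatable$pollutant        # NULL

# Indexing with the string held in `pollutant` selects the right column.
values    <- datatable[, pollutant]   # plain numeric vector
datanas   <- is.na(values)
dataclean <- values[!datanas]

# With drop = FALSE the result stays a one-column data.frame.
one_col <- datatable[, pollutant, drop = FALSE]
```

`datatable[[pollutant]]` is an equivalent way to get the vector form.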

You could improve your function by also implementing the datatable argument. That way you have all the objects bundled together nicely.
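One way this could look, as a sketch: the function name `pollutant_mean` and the demo data are assumptions for illustration, not the asker's actual code.

```r
# Hypothetical function: both the data and the column name are arguments,
# so nothing is looked up in the workspace.
pollutant_mean <- function(datatable, pollutant) {
  stopifnot(is.data.frame(datatable), pollutant %in% names(datatable))
  mean(datatable[[pollutant]], na.rm = TRUE)
}

# Made-up data for illustration.
demo <- data.frame(sulfate = c(1, NA, 3),
                   nitrate = c(NA, 0.5, 1.5))

sulfate_mean <- pollutant_mean(demo, "sulfate")  # 2
nitrate_mean <- pollutant_mean(demo, "nitrate")  # 1
```

The same function works for any column name, so no if statement per pollutant is needed.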

The most important thing to note here would be debugging. You should learn to use at least browser(). This function stops the execution of your function at the very step where it is called. This lets you inspect objects inside the function from the R console and run code to see what's going on. That way you can speed up the development of code, at least initially, when you usually haven't internalized all the data structures and paradigms yet.
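A sketch of where a `browser()` call might go. The function `clean_column`, its `debug` flag, and the demo data are hypothetical; the `interactive()` guard is only there so the example does not pause when run as a script.

```r
# Hypothetical function with an optional debugging stop.
clean_column <- function(datatable, pollutant, debug = FALSE) {
  # Pause here only when asked, and only in an interactive R session.
  if (debug && interactive()) browser()
  values <- datatable[[pollutant]]
  values[!is.na(values)]
}

# Made-up data for illustration.
demo <- data.frame(sulfate = c(1, NA, 3),
                   nitrate = c(NA, 0.5, 1.5))

cleaned <- clean_column(demo, "sulfate")
```

Calling `clean_column(demo, "sulfate", debug = TRUE)` in the console drops you at a `Browse[1]>` prompt inside the function, where you can type `pollutant` or `ls()` to inspect objects, `n` to run the next line, `c` to continue, or `Q` to quit.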

Roman Luštrik
  • Thanks Leo P - you were right, I had simply omitted to include the quotes in the function call. Sorry! – Jon M May 23 '17 at 09:03
  • And Jogo - thanks for the link - that is indeed the same problem, but it has been solved by using an if statement, which only works because there are only two possibilities for the variable pollutant. However, I can't believe there isn't a more generic solution. In essence, I just need to replace all instances of nitrate with a reference to the parameter pollutant. Is it possible? - I will update the post with full code. – Jon M May 23 '17 at 09:06