
I'm new to R and programming and am taking a Coursera course. I've asked in the course forums, but nobody there has been able to provide an answer. To be clear, I'm trying to determine why this code produces no output.

When I first wrote the program, I was getting accurate outputs, but after I tried to upload it, something went wrong. When I run the program from RStudio, instead of output lines starting with [1], [2], etc., I only get the blue +++, with no errors, and nothing I change produces any output.

I tried a previous version of R, and reinstalled the most recent version, 3.2.1 for Windows.

What I've done: Set the correct working directory through RStudio

    pol <- function(directory, pol, id = 1:332) {
        files <- list.files("specdata", full.names = TRUE);
        data <- data.frame();

    for (i in ID) {
        data <- rbind(data, read.csv(files_list[i]))
    }

        subset <- subset(data, ID %in% id);
        polmean <- mean(subset[pol], na.rm = TRUE);
        polmean("specdata", "sulfate", 1:10)
        polmean("specdata", "nitrate", 70:72)
        polmean("specdata", "nitrate", 23)

    }

Can someone please provide some direction - debug help?

When I adjust the code, the following errors tend to appear:

  • ID not found
  • Missing or unexpected } (although I've matched them all).

The updated code is as follows, if I'm understanding correctly:

    data <- data.frame();
    files <- files[grepl(".csv",files)]

    pollutantmean <- function(directory, pollutant, id = 1:332) {
        pollutantmean <- mean(subset1[[pollutant]], na.rm = TRUE);
    }
romod13
  • It appears that `directory` (the first argument of your function) isn't named in the body of the function at all. – Chris Watson Jul 19 '15 at 22:03
  • Additionally, in your `for` loop, the object `files_list` doesn't seem to exist. I think you want to change that to `files`. – Chris Watson Jul 19 '15 at 22:06
  • Thanks - but if I change "specdata" to "directory", then I get an error of "directory" object not found. You're right about the files_list. I updated that and moved the data.frame() to before the files line. However, I still don't get anything but +++'s – romod13 Jul 19 '15 at 23:38
  • There's a website for the Coursera courses that you are supposed to be using. – IRTFM Jul 20 '15 at 05:01
  • Thanks @BondedDust - I've used/ am using those forums, but they aren't really providing a path to the solution either. – romod13 Jul 20 '15 at 11:38

2 Answers


Looks like you haven't declared what ID is (I assume: a vector of numbers)?

Also, using 'subset' as a variable name while it's also a function, and pol as both a function name and the name of one of the arguments of that same function is just asking for trouble...

And I think there is a missing ")" in your for-loop.
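
To illustrate the point about ID: R is case-sensitive, so `ID` and the argument `id` are different names. The sketch below uses two hypothetical toy functions (`f_wrong` and `f_right` are not from the question) to show that looping over an undefined `ID` inside a function fails, while looping over the argument `id` works.

```r
# Hypothetical toy functions, only to show the id / ID difference.
f_wrong <- function(id = 1:3) {
  total <- 0
  for (i in ID) total <- total + i   # ID is not defined anywhere
  total
}

f_right <- function(id = 1:3) {
  total <- 0
  for (i in id) total <- total + i   # uses the function argument
  total
}

# Capture the error message instead of stopping the script.
msg <- tryCatch(f_wrong(), error = function(e) conditionMessage(e))
print(msg)        # an error message that mentions ID
print(f_right())  # 6
```

Note that neither function prints anything when it is only defined; output appears once the function is called.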

EDIT

So the way I understand it now, you want to do a couple of things.

  1. Read in a bunch of files, which you'll use multiple times without changing them.
  2. Get some mean value out of those files, under different conditions.

Here's how I would do it.

  1. Since you only want to read in the data once, you don't really need a function to do this (you can have one, but I think it's overkill for now). You correctly have code that makes a vector with the file names and then loops over them, rbinding them to each other. The problem is that this can become very slow. Check here. Make sure your directory only contains files that you want to read in, so no R scripts or other stuff. A way (not 100% foolproof) to do this is using files <- files[grepl(".csv",files)], which makes sure you only have the csv's (grepl checks whether a pattern matches each string and returns a logical vector; the [] then keeps only the elements for which TRUE was returned).

  2. Next, there is 'a thing you want to do multiple times', namely getting out mean values. This is where you'd use a function. Apparently you want to get the mean for different types of pollution, and you want this in restricted IDs. Let's assume that 1. has given you a dataframe df with a column named Type for the type of pollution and a column called Id that somehow represents a sort of ID (substitute with the actual names in your script - if you don't have a column for ID, I'll edit the answer later on). Now you want a function

    polmean <- function(type, id) { 
       # some code that returns the mean of a restricted version of df 
    }
    

This is all you need. You write the code that generates df, you then write a function that will get you what you want from that dataframe, and then you call it for the circumstances you want to use it in (the three polmean calls at the end of your original code, but now without the first argument as you no longer need this).
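
The two steps above can be sketched as follows. This is only an illustration under assumptions: the column names `sulfate`, `nitrate` and `ID` are taken from the question's code, the demo directory and its two tiny CSV files are made up so the example runs on its own, and `pattern = "\\.csv$"` in `list.files` is used as a stricter alternative to the `grepl(".csv", files)` filter.

```r
# Hypothetical demo data so the sketch is self-contained; in the real task
# the CSV files already exist in your data directory.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(sulfate = c(1, NA, 3), nitrate = c(0.5, 0.7, NA), ID = 1),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(sulfate = c(5, 7, NA), nitrate = c(NA, 0.9, 1.1), ID = 2),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

# Step 1: read every CSV once and combine the pieces in a single rbind call,
# which avoids growing a data frame inside a loop.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
df <- do.call(rbind, lapply(files, read.csv))

# Step 2: a function that only computes a mean from the already-loaded data.
polmean <- function(type, id, data = df) {
  rows <- data$ID %in% id               # keep only the requested IDs
  mean(data[rows, type], na.rm = TRUE)  # ignore missing values
}

# Calls happen outside the function definition.
print(polmean("sulfate", 1:2))  # 4
print(polmean("nitrate", 2))
```

Reading the data once and passing it in keeps the function small; whether that matches what the course grader expects is a separate question.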

liesb
  • Yes, it's a vector of numbers. We're supposed to read a folder of 300+ .csv's. How do you declare the ID? – romod13 Jul 19 '15 at 23:13
  • I guess something like `ID <- 1:length(files)`, after the line starting with `files <-` – liesb Jul 19 '15 at 23:47
  • It's strange. Even if I try that, the code is accepted and doesn't throw any errors. I continue to get just +++ whatever I do, unless it's outside of the first {} – romod13 Jul 20 '15 at 00:06
  • I run RStudio on Windows as well (v0.98, no idea what the newest is) and on Linux (which should be a newer version), and I've never seen the '+++' that you're talking about. Is this just the indicator that you can start a new command? (In my version of RStudio, that's a right caret: >). Do you actually call the function? Just defining it won't do anything. – liesb Jul 20 '15 at 00:16
  • Now that I'm looking at it again, I think you're trying to define a function that's called polmean, and that the last three lines are you calling the function, which should happen outside the function. Please confirm? – liesb Jul 20 '15 at 00:33
  • Perhaps, but I'm not sure. The last three lines are supposed to supply the outputs. Here's what I get when I put them outside the function. – romod13 Jul 20 '15 at 00:51
  • > pollutantmean("specdata", "sulfate", 1:10) Error in file(file, "rt") : cannot open the connection In addition: Warning message: In file(file, "rt") : cannot open file 'NA': No such file or directory Called from: file(file, "rt") Browse[1]> pollutantmean("specdata", "nitrate", 70:72) Error during wrapup: cannot open the connection Browse[1]> pollutantmean("specdata", "nitrate", 23) Error during wrapup: cannot open the connection Browse[1]> Q – romod13 Jul 20 '15 at 00:53
  • To clarify, I found out and understand that the last three lines should not be included. This should be the function only. – romod13 Jul 20 '15 at 11:18
  • Did you try to implement what I laid out in the edit? Are you still getting errors? – liesb Jul 20 '15 at 11:30
  • Ok - I tried to follow your instructions, but it may have confused me more as I'm just learning functions. I did the following. 1. Created files <- using the grepl you suggested. 2. Created an empty data.frame() 3. Used the original first line pollutantmean <- function(directory, pollutant, id = 1:332) { #there are 2 types and an ID column. 4. Created the mean. Output of that is still blue code ++++ – romod13 Jul 20 '15 at 12:04
  • Can you maybe add an edit with your new code to the original post? – liesb Jul 20 '15 at 12:22
  • Hi @romod13, if this answer has solved your question please consider [accepting it](http://meta.stackexchange.com/q/5234/179419) by clicking the check-mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. There is no obligation to do this. – liesb Jul 21 '15 at 07:21

Ok - I finally solved this. Thanks for the help.

  1. I didn't need to call "specdata" in line 2. The directory argument in line 1 already refers to the correct directory.
  2. My for/in statement needed to refer to the id in the first line, not the ID in the dataset. The for/in statement doesn't appear to need to be indented (but it looks cleaner).
  3. I did not need a subset
  4. The last 3 lines for pollutantmean did not need to be a part of the program. These are used in the R console to call the results one by one.
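
Put together, the four points might look like the sketch below. It is a reconstruction under assumptions rather than the exact code that was submitted: it assumes the files in the directory sort so that `files[i]` is the file for monitor `i`, and that the pollutant columns are named as in the calls from the question.

```r
pollutantmean <- function(directory, pollutant, id = 1:332) {
  # Point 1: use the directory argument rather than a hard-coded "specdata".
  files <- list.files(directory, full.names = TRUE)
  data <- data.frame()

  # Point 2: loop over the id argument (lowercase), not the ID column.
  # Assumption: files[i] is the file that belongs to monitor i.
  for (i in id) {
    data <- rbind(data, read.csv(files[i]))
  }

  # Point 3: no subset step, because only the requested files were read.
  # The last expression is the return value; ending on an assignment
  # instead would return the value invisibly and print nothing.
  mean(data[[pollutant]], na.rm = TRUE)
}

# Point 4: calls are typed at the console, outside the function, e.g.
# pollutantmean("specdata", "sulfate", 1:10)
```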
romod13