I'm finding it helpful to tackle problems posed by others as a means of learning more about R. A question was recently posed on StackOverflow, and I've run into a problem while trying to implement my own solution.

The first thing I did was re-create the data frame, naming it overflow:

> Person <- c("Sally", "Bill", "Rob", "Sue", "Alex", "Bob")
> Movie <- c("Titanic", "Titanic", "Titanic", "Cars", "Cars", "Cars")
> Rating <- c(4, 4, 4, 8, 9, 8)
> overflow <- data.frame(Person=Person, Movie=Movie, Rating=Rating)

Which gave me a reproducible example from which to work:

> overflow
  Person   Movie Rating
1  Sally Titanic      4
2   Bill Titanic      4
3    Rob Titanic      4
4    Sue    Cars      8
5   Alex    Cars      9
6    Bob    Cars      8

Next I wanted to design a function that would flag the rows in which the ratings were inconsistent. So in the question posed, each movie should have a consistent rating, meaning the row with Alex is problematic.

As a caveat, there are refined ways of doing so in the answers to the question, but before delving into those I'd like to solve it on my own so I can learn more about writing functions.

Here's what I did:

check <- function(x) {
  baddies <- numeric()
  for (i in 1:nrow(x)) {
    if (x$Movie[i] == x$Movie[i + 1] & x$Rating[i] != x$Rating[i + 1]) {
        append(baddies, i)
    }
  }
}

My goal is to create a function named check() that iterates through all the rows in a specified data frame, checking for instances in which the movies are the same but the ratings are different. When it finds one, it should flag the row number by appending it to a vector named baddies.

However, when I run my function:

> check(overflow)

I receive an error message:

Error in if (x$Movie[i] == x$Movie[i + 1] & x$Rating[i] != x$Rating[i + : missing value where TRUE/FALSE needed

I've thought through the error message and revisited the logical operators in the function, but I can't seem to figure out where I'm going wrong.

Can anyone give me a bump in the right direction here? Thanks in advance.

  • Typical "aw-bleep I should have noticed" here: your loop is `1:nrow` but you then call for `x$Rating[i+1]`, so you've gone past the end of the array. – Carl Witthoft Dec 13 '14 at 21:28
  • Some general advice in addition to Carl's comment. Read the [R inferno](http://www.burns-stat.com/pages/Tutor/R_inferno.pdf) and then remove `append` from the set of functions you use (at least for now) and learn to pre-allocate instead (or even better yet, learn to use `*apply` functions instead of `for` loops if you need a return value). Also learn about debugging functions such as `browser` (which is used by the breakpoints feature in RStudio). – Roland Dec 13 '14 at 21:32
  • @CarlWitthoft That's a great catch, thank you! –  Dec 13 '14 at 21:37
  • @Roland Thanks for the advice. I bought a book that was highly recommended called The Art of R Programming as well. I'll plan to read both, but do you have a recommendation for which I should tackle first? –  Dec 13 '14 at 21:39
  • Before you read any books, you should read the official introduction ("R-intro") and the R inferno. – Roland Dec 13 '14 at 21:56
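
The out-of-bounds read that Carl describes can be reproduced in isolation. The following is only a minimal sketch, not code from the question: the variable names `ratings`, `past_end`, `comparison`, and `msg` are chosen here for illustration. It shows that indexing one past the end of a vector yields NA instead of an error, and that `if` then fails because it needs a single TRUE or FALSE.

```r
# Same ratings as in the question's data frame
ratings <- c(4, 4, 4, 8, 9, 8)
n <- length(ratings)

# Reading one element past the end does not fail; it returns NA
past_end <- ratings[n + 1]

# Any comparison involving NA is itself NA
comparison <- ratings[n] != past_end

# if() cannot decide on NA, so it raises the error seen in the question.
# tryCatch() captures the message so the script keeps running.
msg <- tryCatch(
  {
    if (comparison) "different" else "same"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Calling `browser()` inside the loop, as Roland suggests, is another way to inspect `i` and `x$Rating[i + 1]` on the failing iteration.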

1 Answer

You were close; there were just a couple of minor errors. First, as @Carl Witthoft pointed out, your index range was 1 too large: on the last iteration, x$Rating[i + 1] reads past the final row and returns NA, resulting in the error. Additionally, instead of append(baddies, i), you want to use baddies <- append(baddies, i); otherwise the object is not being modified. Finally, make sure to return baddies at the end of your function, done implicitly below:

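check <- function(x) {
  baddies <- numeric()
  # Stop one row early so that i + 1 never runs past the last row
  for (i in 1:(nrow(x)-1)) {
    if (x$Movie[i] == x$Movie[i + 1] & x$Rating[i] != x$Rating[i + 1]) {
      # append() returns a new vector, so assign the result back
      baddies <- append(baddies, i)
    }
  }
  # The last evaluated expression is the function's return value
  baddies
}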
##
> check(overflow)
[1] 4 5
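
For comparison, the same adjacent-row check can be written without a loop or `append`. This is only a sketch under a few assumptions, not part of the answer above: the name `check_vectorized` is hypothetical, the data frame is rebuilt with `stringsAsFactors = FALSE` so that Movie is a plain character column, and rows are assumed to be grouped by movie as in the question.

```r
# Rebuild the example data frame from the question
overflow <- data.frame(
  Person = c("Sally", "Bill", "Rob", "Sue", "Alex", "Bob"),
  Movie  = c("Titanic", "Titanic", "Titanic", "Cars", "Cars", "Cars"),
  Rating = c(4, 4, 4, 8, 9, 8),
  stringsAsFactors = FALSE
)

# Hypothetical loop-free variant of check()
check_vectorized <- function(x) {
  n <- nrow(x)
  # With fewer than two rows there are no adjacent pairs to compare
  if (n < 2) return(integer())
  # Compare rows 1..(n-1) against rows 2..n in one step
  same_movie  <- x$Movie[-n] == x$Movie[-1]
  diff_rating <- x$Rating[-n] != x$Rating[-1]
  # which() returns the positions where both conditions hold
  which(same_movie & diff_rating)
}

result <- check_vectorized(overflow)
print(result)
```

Dropping the last element with `[-n]` and the first with `[-1]` lines each row up with its successor, so the comparison never reads past the end of the data frame.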
nrussell
  • Thanks for the answer and walking me through which parts of my function needed attention. It was really helpful and I appreciate your time. –  Dec 13 '14 at 21:40