I'm finding it helpful to tackle problems posed by others as a means of learning more about R. A question was recently posed on Stack Overflow, and I've run into a problem while trying to implement my own solution.
The first thing I did was re-create the data frame, naming it overflow:
> Person <- c("Sally", "Bill", "Rob", "Sue", "Alex", "Bob")
> Movie <- c("Titanic", "Titanic", "Titanic", "Cars", "Cars", "Cars")
> Rating <- c(4, 4, 4, 8, 9, 8)
> overflow <- data.frame(Person=Person, Movie=Movie, Rating=Rating)
Which gave me a reproducible example from which to work:
> overflow
  Person   Movie Rating
1  Sally Titanic      4
2   Bill Titanic      4
3    Rob Titanic      4
4    Sue    Cars      8
5   Alex    Cars      9
6    Bob    Cars      8
Next I wanted to design a function that would flag the rows in which the ratings were inconsistent. So in the question posed, each movie should have a consistent rating, meaning the row with Alex is problematic.
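To make "inconsistent" concrete, here is a quick per-movie check. It is only a minimal sketch: it rebuilds the same example data so it runs on its own, and inconsistent is a name I picked for illustration. It says which movies have mixed ratings, not which row is the odd one out.

```r
# Re-create the example data so this snippet is self-contained
overflow <- data.frame(
  Person = c("Sally", "Bill", "Rob", "Sue", "Alex", "Bob"),
  Movie  = c("Titanic", "Titanic", "Titanic", "Cars", "Cars", "Cars"),
  Rating = c(4, 4, 4, 8, 9, 8)
)

# For each movie, TRUE if more than one distinct rating appears
inconsistent <- tapply(overflow$Rating, overflow$Movie,
                       function(r) length(unique(r)) > 1)
inconsistent
```

With this data, Cars comes out TRUE and Titanic FALSE, which matches the expectation that only the Cars ratings disagree.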
As a caveat, there are refined ways of doing so in the answers to the question, but before delving into those I'd like to solve it on my own so I can learn more about writing functions.
Here's what I did:
check <- function(x) {
  baddies <- numeric()
  for (i in 1:nrow(x)) {
    if (x$Movie[i] == x$Movie[i + 1] & x$Rating[i] != x$Rating[i + 1]) {
      append(baddies, i)
    }
  }
}
My goal is to create a function named check() that will iterate through all the rows in a specified data frame, checking for instances in which the movies are the same but the ratings are different. Once it finds these, it will flag the row number by appending it to a vector named baddies.
However, when I run my function:
> check(overflow)
I receive an error message:
Error in if (x$Movie[i] == x$Movie[i + 1] & x$Rating[i] != x$Rating[i + : missing value where TRUE/FALSE needed
I've thought through the error message and revisited the logical operators in the function, but I can't seem to figure out where I'm going wrong.
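One thing I have not ruled out is the index i + 1 on the last pass through the loop, which points one row past the end of the data frame. The sketch below is my guess at the cause rather than a confirmed fix. check_sketch is a name made up for illustration; it stops one row early, and it also keeps the result of append() and returns the vector, neither of which my original function does.

```r
# Re-create the example data so this snippet is self-contained
overflow <- data.frame(
  Person = c("Sally", "Bill", "Rob", "Sue", "Alex", "Bob"),
  Movie  = c("Titanic", "Titanic", "Titanic", "Cars", "Cars", "Cars"),
  Rating = c(4, 4, 4, 8, 9, 8)
)

# Indexing one past the last row yields NA, and if (NA) raises
# "missing value where TRUE/FALSE needed"
past_end <- overflow$Rating[nrow(overflow) + 1]
is.na(past_end)

# Hypothetical variant of check(): compare each row with the next one,
# stopping at the second-to-last row
check_sketch <- function(x) {
  baddies <- numeric()
  if (nrow(x) < 2) return(baddies)  # nothing to compare
  for (i in seq_len(nrow(x) - 1)) {
    same_movie  <- x$Movie[i] == x$Movie[i + 1]
    diff_rating <- x$Rating[i] != x$Rating[i + 1]
    if (same_movie && diff_rating) {
      baddies <- append(baddies, i)  # append() returns a new vector
    }
  }
  baddies
}

flagged <- check_sketch(overflow)
flagged
```

On this data the sketch returns rows 4 and 5, because both the Sue/Alex and the Alex/Bob comparisons differ. So comparing neighbours avoids the error but does not single out Alex on its own.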
Can anyone give me a bump in the right direction here? Thanks in advance.