Scenario.

In the DataCamp course "Cleaning Data with R: Case Studies", there is an exercise at the very end of the course where the dataset "att5" has 5 columns (say 1, 2, 3, 4, 5). Only column 1 actually contains characters; columns 2:5 contain numbers but are stored as type character. The exercise asks me to define a vector cols holding the indices of those columns (2, 3, 4, 5) and to use sapply to apply the as.numeric function to them.

My solution is not working, although it seems to make sense. I'm sharing their solution first and then mine. Please help me understand what is going on.

DataCamp solution (working)

# Define vector containing numerical columns: cols
cols <- -1

# Use sapply to coerce cols to numeric
att5[, cols] <- sapply(att5[, cols], as.numeric)

My solution (not working)

# Define vector containing numerical columns: cols
cols <- c(2:5)

# Use sapply to coerce cols to numeric
att5[, cols] <- sapply(att5[, cols], as.numeric)

I'm getting this error: invalid subscript type list

Please help me understand. I'm new to R.

1 Answer

Your solution works perfectly on my machine. The only difference I can see is that cols <- -1 is of class "numeric", whereas cols <- c(2:5) is of class "integer". If you want to know the difference between the two, have a look at "What's the difference between integer class and numeric class in R".
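To make the class difference concrete, here is a minimal sketch. The small att5 built here is a hypothetical stand-in for the exercise data (character columns, as described in the question), not the real DataCamp dataset. It suggests that both index styles select the same columns and that the integer index works for the coercion too.

```r
# Hypothetical stand-in for the exercise data: one label column and
# four columns of numbers stored as character strings
att5 <- data.frame(
  att1 = c("A", "B", "C"),
  att2 = c("1", "2", "3"),
  att3 = c("4", "5", "6"),
  att4 = c("7", "8", "9"),
  att5 = c("10", "11", "12"),
  stringsAsFactors = FALSE
)

class(-1)            # "numeric"
class(c(2:5))        # "integer"
class(seq(2, 5, 1))  # "numeric"

# Dropping column 1 and keeping columns 2 to 5 select the same data
same_columns <- identical(att5[, -1], att5[, c(2:5)])

# Coerce with the integer index from the question
att5[, c(2:5)] <- sapply(att5[, c(2:5)], as.numeric)
col_classes <- sapply(att5, class)
```

On a five-column data frame, -1 and c(2:5) are interchangeable as column indices, so the class of cols by itself should not change the result.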

So, one way to reverse-engineer their solution is to generate cols as class "numeric", and seq can help with that.

cols <- seq(2,5,1)
#class(cols)
#[1] "numeric"
att5[, cols] <- sapply(att5[, cols], as.numeric)
# str(att5)
# 'data.frame': 5 obs. of  5 variables:
# $ att1: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
# $ att2: num  1 2 3 4 5
# $ att3: num  1 2 3 4 5
# $ att4: num  1 2 3 4 5
# $ att5: num  1 2 3 4 5
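One caveat about the sample data used here: its columns are factors, while the question describes character columns. For a factor, as.numeric returns the internal level codes rather than the labels. The two happen to coincide in this sample because the labels are "1" to "5". A small sketch with made-up values shows the difference:

```r
# Made-up values for illustration; labels differ from the level codes
f <- factor(c("10", "20", "30"))

codes  <- as.numeric(f)                # level codes: 1 2 3
values <- as.numeric(as.character(f))  # label values: 10 20 30

# A character vector is parsed directly, as in the question's setup
chr_values <- as.numeric(c("10", "20", "30"))  # 10 20 30
```

If the columns in the real data are factors, converting through as.character first is the safer route.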

Data

dput(att5)
att5 <- structure(list(att1 = structure(1:5, .Label = c("A", "B", "C", 
"D", "E"), class = "factor"), att2 = structure(1:5, .Label = c("1", 
"2", "3", "4", "5"), class = "factor"), att3 = structure(1:5, .Label = c("1", 
"2", "3", "4", "5"), class = "factor"), att4 = structure(1:5, .Label = c("1", 
"2", "3", "4", "5"), class = "factor"), att5 = structure(1:5, .Label = c("1", 
"2", "3", "4", "5"), class = "factor")), class = "data.frame", row.names = c(NA, 
-5L))

Hope it works on your end.
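If the error persists, it may be worth checking class(cols) in your session. As a guess only, since I cannot reproduce your error: this message can appear when the column index is a list rather than an atomic vector. The sketch below uses a hypothetical data frame df and a hypothetical cols_list to trigger that situation and then fixes it with unlist.

```r
# Hypothetical data frame with numbers stored as character strings
df <- data.frame(
  a = c("1", "2"),
  b = c("3", "4"),
  c = c("5", "6"),
  stringsAsFactors = FALSE
)

# Assumption for illustration: the index ended up as a list
cols_list <- list(2, 3)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    df[, cols_list]
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Flattening the list to a plain vector makes the subscript valid
cols_vec <- unlist(cols_list)
df[, cols_vec] <- sapply(df[, cols_vec], as.numeric)
```

After the unlist step, cols_vec is an ordinary numeric vector and the assignment goes through.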

deepseefan