R trouble subsetting a dataset

Question

I have the following code:

file = "http://dd.weather.gc.ca/hydrometric/csv/SK/hourly/SK_hourly_hydrometric.csv"
skdat2 <- read.csv(file, head=T, sep=",", dec=".")
colnames(skdat2) <- c("ID", "Date", "Water.Level", "Grade.1", "Symbol.1", 
                      "QA/QC-1", "Discharge/Debit", "Grade.2", "Symbol.2", 
                      "QA/QC-2")

skdat22 <- subset(skdat2, skdat2$ID=='05AH050')

plot(skdat22$ID ~ skdat22$Date, skdat22, xaxt = "n")

What I am trying to do is plot just the data with ID 05AH050. When I display the dataset skdat22, it shows only ID 05AH050 data.

However, when I plot it I get all the other IDs as well. Am I misunderstanding the subset command?

Try adding `stringsAsFactors=FALSE` to your read.csv call. Also, you don't need to use `subset()`, just do `skdat22<-skdat2[skdat2$ID=="05AH050",]`. — iod, Aug 30 '18 at 03:40
The issue, as @doviod points out, is that when the CSV was read into R, `ID` became a factor, not a character. So the factor _levels_ are still present in the subset. Change your read.csv to `read.csv(file, header = TRUE, stringsAsFactors = FALSE)`. You don't need the `sep` or `dec` arguments if the defaults are satisfied. It's also better to use TRUE rather than T, and to avoid variable names that are also function names (such as `file`). — neilfws, Aug 30 '18 at 03:46
`ID` should really remain a factor if it's a grouping (categorical) variable. You can use `skdat22 <- droplevels(skdat22)` to drop unused factor levels. — Rich Scriven, Aug 30 '18 at 04:11
Looks like I have some options to try. Thanks all, that is all good info! — R.Merritt, Aug 30 '18 at 04:59
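
The points made in the comments can be seen on a small example. This is only a sketch with a toy data frame: the extra station IDs, the `Water.Level` values, and the names `toy`, `toy_sub`, and `toy_chr` are made up for illustration and do not come from the real feed. `ID` is built explicitly with `factor()` because, as far as I know, `read.csv` only converts strings to factors by default in R versions before 4.0.0.

```r
# Toy stand-in for the real CSV (IDs other than 05AH050 and all values are made up)
toy <- data.frame(
  ID = factor(c("05AH050", "05AH050", "05AJ001", "05AK001")),
  Water.Level = c(1.2, 1.3, 2.1, 0.8)
)

# Subsetting keeps only the matching rows ...
toy_sub <- toy[toy$ID == "05AH050", ]
n_rows <- nrow(toy_sub)

# ... but the factor still remembers every level from the full data,
# which is why the other IDs show up on the plot axis
levels_before <- nlevels(toy_sub$ID)

# Option 1: keep ID as a factor and drop the unused levels
toy_sub <- droplevels(toy_sub)
levels_after <- nlevels(toy_sub$ID)

# Option 2: keep IDs as plain character strings so no levels exist at all
toy_chr <- data.frame(
  ID = c("05AH050", "05AH050", "05AJ001", "05AK001"),
  Water.Level = c(1.2, 1.3, 2.1, 0.8),
  stringsAsFactors = FALSE
)
chr_sub <- toy_chr[toy_chr$ID == "05AH050", ]
id_is_character <- is.character(chr_sub$ID)
```

Which option fits depends on whether `ID` is later used as a grouping variable: `droplevels()` keeps it categorical, while `stringsAsFactors = FALSE` avoids the leftover levels from the start.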
