Apologies if this question is very basic; I am still learning R.

I have a pooled dataset of the summer months from 2000 to 2010. I want to exclude 9 days with temperatures below 9 degrees Celsius from the data.

Update

The temperature values in my dataset look like this:

 [1]  9.4 10.2 11.2 12.4 12.6 13.1 13.8 14.3 12.1 10.3 11.0 10.6  9.6 10.5 13.2 14.8
[17] 14.4 15.3 15.9 14.8 14.1 15.0 18.0 19.8 19.9 18.2 16.2 16.2 17.9 19.3 19.4 18.7
[33] 18.5 21.1 23.2 22.7 22.4 22.5 22.6 21.3 19.9 19.5 18.4 17.7 18.3 20.2 21.6 22.0

I want to delete all 9 values previous to 9 degrees Celsius at each point in my dataset.

It was suggested that I use this code:

cleandata<-workdata[-sample(which(workdata$tempd0d1 < 9), 9), ]

I tried it, but it is not working: I get the same values for tempd0d1 (the temperature variable).

cleandata$tempd0d1
 [1]  9.4 10.2 11.2 12.4 12.6 13.1 13.8 14.3 12.1 10.3 11.0 10.6  9.6 10.5 13.2 14.8
[17] 14.4 15.3 15.9 14.8 14.1 15.0 18.0 19.8 19.9 18.2 16.2 16.2 17.9 19.3 19.4 18.7
[33] 18.5 21.1 23.2 22.7 22.4 22.5 22.6 21.3 19.9 19.5 18.4 17.7 18.3 20.2 21.6 22.0

Any quick help would be appreciated.

  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. I think that the answer here is likely to use `group_by` & `mutate` to find the maximum/average value per day and then `filter`. But it's hard to say without data. – JBGruber Apr 27 '20 at 09:59

2 Answers


In case you just want to delete them, use:

datasetClean <- dataset[dataset$temperature >= 9,]

It is crucial that you first convert your temperature column to a numeric type, like this:

dataset$temperature <- as.numeric(dataset$temperature)
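One caveat about the conversion, sketched under the assumption that the column was read in as a factor (the question does not say how it is stored): `as.numeric()` on a factor returns the internal level codes rather than the printed values, so going through `as.character()` first is safer. The toy `dataset` below is made up for illustration.

```r
# Hypothetical data: temperatures that were imported as a factor.
dataset <- data.frame(temperature = factor(c("9.4", "8.1", "12.6", "7.9")))

# as.numeric() on a factor would give the level codes, not the temperatures,
# so convert to character first and then to numeric.
dataset$temperature <- as.numeric(as.character(dataset$temperature))

# Keep rows at or above 9 degrees (i.e. drop only the days below 9).
# drop = FALSE keeps the result a data frame even with a single column.
datasetClean <- dataset[dataset$temperature >= 9, , drop = FALSE]

print(datasetClean)
```

The comparison uses `>= 9` so that a reading of exactly 9 is kept, since the question asks only about days below 9.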

Hope that helps with your problem.

Helge
  • I want to exclude only 10 days with temperature below 9, not all the days below 9. How can I delete only 10 of the days with temperature below 9? – hasan sohail Apr 27 '20 at 10:28

This might help. You can do it without any additional packages.

You can sample 9 days from a subset where the temp < 9, and then remove those rows:

set.seed(123)

df[-sample(which(df$temp < 9), 9), ]

Output

 [1] 15 15 14 10  2  6 11 14  9 10 11  3 11  9 12  9  9 13  3  8 10 10  9 14  4 14  1 11  7  5 12
[32] 15 10 13  9  9 10  7 11 12  5

Data

df <- structure(list(temp = c(15L, 15L, 3L, 14L, 3L, 10L, 2L, 6L, 11L, 
5L, 4L, 14L, 6L, 9L, 10L, 11L, 5L, 3L, 11L, 9L, 12L, 9L, 9L, 
13L, 3L, 8L, 10L, 7L, 10L, 9L, 14L, 3L, 4L, 14L, 1L, 11L, 7L, 
5L, 12L, 15L, 10L, 13L, 7L, 9L, 9L, 10L, 7L, 11L, 12L, 5L)), class = "data.frame", row.names = c(NA, 
-50L))

df$temp
[1] 15 15  3 14  3 10  2  6 11  5  4 14  6  9 10 11  5  3 11  9 12  9  9 13  3  8 10  7 10  9 14
[32] 3  4 14  1 11  7  5 12 15 10 13  7  9  9 10  7 11 12  5
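A more defensive variant of the same idea, as a rough sketch: `drop_cold_rows` is a hypothetical helper name, and its toy data are invented for illustration. It guards against three things the one-liner does not handle: no rows below the threshold, fewer than 9 such rows, and the fact that `sample(x, n)` treats a single number `x` as `1:x`. Note that the values printed in the question are all 9.4 or higher, so for those rows there would be nothing below 9 to remove.

```r
set.seed(123)  # make the random draw reproducible

# Hypothetical helper: remove up to `n` randomly chosen rows whose value in
# column `col` is below `threshold`.
drop_cold_rows <- function(data, col, threshold = 9, n = 9) {
  cold <- which(data[[col]] < threshold)        # row positions below threshold
  if (length(cold) == 0 || n <= 0) return(data) # nothing to remove
  n <- min(n, length(cold))                     # cannot remove more than exist
  # Sample positions within `cold` instead of calling sample(cold, n),
  # because sample() expands a single number x into 1:x.
  picked <- cold[sample.int(length(cold), n)]
  data[-picked, , drop = FALSE]                 # drop = FALSE keeps a data frame
}

# Invented example data: 16 rows, 10 of them below 9.
df <- data.frame(temp = c(15, 3, 14, 10, 2, 6, 11, 5, 4, 9, 8, 7, 3, 1, 12, 6))
cleaned <- drop_cold_rows(df, "temp")

sum(df$temp < 9)       # rows below 9 before removal
sum(cleaned$temp < 9)  # rows below 9 after removal
nrow(cleaned)          # rows remaining

# Data with nothing below 9 come back unchanged.
warm <- data.frame(temp = c(9.4, 10.2, 11.2))
identical(drop_cold_rows(warm, "temp"), warm)
```

Comparing `sum(df$temp < 9)` before and after is a quick way to confirm that rows were actually removed.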
Ben
  • Thanks, but this is not helping in my case. I tried df[-sample(which(df$temp < 9), 9), ], but it gives me the whole dataset as output. It does not show me the rows that have a temperature below 9. Can you please elaborate on what the problem could be? – hasan sohail Apr 27 '20 at 13:29
  • Please clarify what you want in the end. Did you want two data frames - one with the 9 temps excluded, and the second data frame of the 9 temps that were excluded? Did you want a single data frame, but with a column indicating which rows would be excluded? Did you just want a data frame of the excluded rows? More clarification is needed. – Ben Apr 27 '20 at 14:31
  • Ben, I want to delete 9 days previous to 9-degree celsius from my original dataset. Will df[-sample(which(df$temp < 9), 9), ] do the job? Can you please tell me where I can read more about this sort of subsetting? – hasan sohail Apr 28 '20 at 10:02
  • @HasanSohail I'm very, very confused. I highly recommend you read about [how to make a great reproducible example in R](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). I don't understand what your original dataset looks like. I also don't understand what "delete 9 days previous to 9-degree celsius" means. And I don't know what your desired final outcome should look like. You will get much more help if you are able to provide a more detailed question on stackoverflow. You can edit your question above with more information. – Ben Apr 28 '20 at 12:09
  • There is a lot out there on subsetting in R. One of my favorite resources is [here](https://adv-r.hadley.nz/subsetting.html). – Ben Apr 28 '20 at 12:12