I am trying to filter values within a variable in a dataset called brazilcorona.

This dataset contains a variable called `data`, which holds the date (year-month-day) on which the COVID infection occurred, for example 2020-03-14 (14 March 2020).

Specifically, I am trying to create a new vector that excludes all dates before 2020-05-15 (15 May 2020).

To do so, I tried the following code: `newdata <- filter(Brazilcorona, data > 2020-05-15)`

When I run the code above, R shows the error `not meaningful for factors`.

In other words, I would like to create a new vector that contains only the dates after 15 May, that is, dates later than 2020-05-15.

Could someone help me? Thanks

1 Answer

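Convert the column to the `Date` class and compare it against a `Date` value. Written without quotes, `2020-05-15` is evaluated as arithmetic (2020 - 5 - 15 = 2000), and `>` is not defined for factors, which is what triggers the message.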

library(dplyr)

new_data <- brazilcorona %>% 
              mutate(data = as.Date(data)) %>% 
              filter(data > as.Date('2020-05-15'))

In base R:

new_data <- subset(transform(brazilcorona, data = as.Date(data)), 
                   data > as.Date('2020-05-15'))
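
To illustrate why the original attempt fails, here is a minimal sketch in base R. The `toy` data frame and its three dates are made up for illustration and are not taken from the real dataset; it assumes the column was read in as a factor, which is what `read.csv()` did by default before R 4.0.0.

```r
# Hypothetical toy data standing in for brazilcorona (values are made up).
# The date column is stored as a factor, as in the question.
toy <- data.frame(data = factor(c("2020-05-14", "2020-05-15", "2020-05-16")))

# Without quotes, the "date" is ordinary subtraction: 2020 - 5 - 15.
unquoted <- 2020-05-15

# '>' is not defined for unordered factors: R warns and returns NA for every row.
cmp <- suppressWarnings(toy$data > "2020-05-15")

# After converting to Date, the comparison behaves as intended.
toy$data <- as.Date(toy$data)
after <- toy[toy$data > as.Date("2020-05-15"), , drop = FALSE]

print(unquoted)
print(cmp)
print(after)
```

Because every comparison on the factor is `NA`, a filter built on it keeps no rows, which is one way to end up with an empty result even when later dates exist.
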
Ronak Shah
  • Thanks for your contribution, Ronak. When I run that code, I am able to create a new vector. However, R shows me "No data available in table", that is, `0 obs`. Do you know what I am doing wrong? I simply copied and pasted the code above... Thanks – Andre Masuko May 19 '20 at 02:43
  • What does `dim(new_data)` return? – Ronak Shah May 19 '20 at 02:51
  • It returns the following: `[1] 0 14` – Andre Masuko May 19 '20 at 03:00
  • Can you copy and paste the output of `dput(head(Brazilcorona, 2))` in your post? – Ronak Shah May 19 '20 at 03:01
  • alright, done :) – Andre Masuko May 19 '20 at 03:08
  • Your `dput` gives me an error, but I think you are not using the correct dataframe name and column name. It looks like your dataframe is called `brazilcorona` and the date column is called `data`. Please make these adjustments in the above code rather than copy-pasting it directly. – Ronak Shah May 19 '20 at 03:10
  • I edited it. The variable is called `data` and the dataset is `brazilcorona`. Even though I made those adjustments, the code you posted still gives the `0 obs` result. If it helps, the CSV data I am using is https://raw.githubusercontent.com/umbertomig/intro-prob-stat-FGV/master/datasets/brazilcorona.csv – Andre Masuko May 19 '20 at 03:12
  • @Andre well, the data has no observations after 15th May. The last observation is from 15th May, hence you get 0 rows. If you change the last line to `filter(data >= as.Date('2020-05-15'))` you'll get the data for 15th May. – Ronak Shah May 19 '20 at 03:20
  • I really appreciate your help, Ronak. It is running fine now. Thank you, really. – Andre Masuko May 19 '20 at 03:27
  • Hi Ronak. I just sent you a direct message on Twitter. When you are free, could you please check it? Thanks! – Andre Masuko Jun 19 '20 at 21:36
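
The empty result discussed in the comments comes from using a strict `>` when the latest date in the data equals the cutoff. As a rough sketch of how to check this before filtering, the example below uses a made-up `toy` data frame (hypothetical values, not the real CSV) and a hypothetical `cutoff` variable:

```r
# Hypothetical toy data whose last date equals the cutoff (values are made up).
toy <- data.frame(data = as.Date(c("2020-05-13", "2020-05-14", "2020-05-15")))
cutoff <- as.Date("2020-05-15")

# Inspect the earliest and latest dates before choosing a comparison.
date_range <- range(toy$data)

# Strict comparison excludes the cutoff day itself.
strict <- subset(toy, data > cutoff)

# Inclusive comparison keeps the cutoff day.
inclusive <- subset(toy, data >= cutoff)

print(date_range)
print(nrow(strict))
print(nrow(inclusive))
```

Checking `range()` first makes it clear whether an empty result means the filter is wrong or the data simply ends at the cutoff.
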