0

below data set.

structure(list(sap_no = c(NA, 1910109125, 1910106937, 1910110360, 
1910110108, 1910107512, 1910108668, 1910108923, 1910110338, 1910110591
), qty = c(2, 1, 2, 1, 5, 1, 2, 1, 2, 2), date = structure(c(18001, 
18001, 18002, 18002, 18002, 18003, 18003, 18009, 18010, 18016
), class = "Date")), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

I tried to select date

clean_backorder$date<-as.Date(clean_backorder$date, origin = "1899-12-30")
with(clean_backorder,clean_backorder[date >= "2019-01-01" & date<="2019-12-31"])

it gave me error

Error in `vectbl_as_col_location()`:
! Must subset columns with a valid subscript vector.
ℹ Logical subscripts must match the size of the indexed input.
x Input has size 3 but subscript `date >= "2019-01-01" & date <= "2019-12-31"` has size 1525.

how can I fix this pls? btw when I run head function to the data frame, it looks like the date column is in date format?

> head(clean_backorder$date,10)
 [1] "2019-04-15" "2019-04-15" "2019-04-16" "2019-04-16" "2019-04-16" "2019-04-17" "2019-04-17" "2019-04-23"
 [9] "2019-04-24" "2019-04-30"
Bob
  • 13
  • 3
  • Does this answer your question? [Filtering dates in dplyr](https://stackoverflow.com/questions/34232998/filtering-dates-in-dplyr) – Luuk Apr 27 '22 at 12:12

1 Answers1

0

You have some code issues regarding the way to tell R to select rows in the column date that matches conditions. Here is a base only way:

clean_backorder[with(clean_backorder, date >= as.Date("2019-01-01") & date <= as.Date("2019-12-31")),]

Output:

# A tibble: 10 x 3
       sap_no   qty date      
        <dbl> <dbl> <date>    
 1         NA     2 2019-04-15
 2 1910109125     1 2019-04-15
 3 1910106937     2 2019-04-16
 4 1910110360     1 2019-04-16
 5 1910110108     5 2019-04-16
 6 1910107512     1 2019-04-17
 7 1910108668     2 2019-04-17
 8 1910108923     1 2019-04-23
 9 1910110338     2 2019-04-24
10 1910110591     2 2019-04-30

What is happening?

  1. You want to select rows, so you need to put the conditions onthe left of the comma: clean_backorder[****,]
  2. The **** should be a vector of TRUE/FALSE for each row of clean_backorder that match a condition on the date column: clean_backorder[with(clean_backorder, condition on the date column),]
  3. You need to convert your character vector "2019-01-01", etc. to a Date format using as.Date() so you can filter a Date column: clean_backorder[with(clean_backorder, date >= as.Date("2019-01-01") & date <= as.Date("2019-12-31")),]

As suggested in the comments, the tidyverse packages provide useful functions to make this code more readible:

clean_backorder |> # or %>% if you prefer to use magrittr pipe 
  dplyr::filter(date >= as.Date("2019-01-01") & date <= as.Date("2019-12-31"))

or

clean_backorder |> # or %>% if you prefer to use magrittr pipe 
  dplyr::filter(dplyr::between(date, as.Date("2019-01-01"), as.Date("2019-12-31")))
Paul
  • 2,850
  • 1
  • 12
  • 37