I have a dataset containing sales quantities. Some of the quantities are decimal numbers, but they should all be whole numbers, so I need to remove those rows. (The SalesQuantity column is entirely numeric.) My dataset is as follows.

 
CustomerAccount SalesDate  ProductNumber SalesQuantity
Cust00001       2014-01-06     PRD0001        2600
Cust00001       2014-01-06     PRD0002        200.5
Cust00001       2014-01-06     PRD0003        800
Cust00001       2014-01-06     PRD0004        882.5
Cust00002       2014-03-05     PRD0003        2400
Cust00002       2014-03-05     PRD0002        600

I was able to solve the problem by filtering on the remainder of division by 1, but I wonder if there is a built-in function for this. Thanks.

# Row indices where SalesQuantity has a fractional part
float <- which(df$SalesQuantity %% 1 > 0)
df[float, ]         # inspect the offending rows
df <- df[-float, ]  # drop them
Esad
  • The answer is no, there is no built-in function to keep only integers. The function `is.integer` does something else, it checks the class attribute of its argument. Examples: `is.integer(1)` returns `FALSE` because `1` is of class `"numeric"` but `is.integer(1L)` returns `TRUE`. – Rui Barradas Feb 21 '22 at 12:44
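To illustrate the distinction made in the comment, here is a minimal sketch. The vector `x` is a made-up example, not taken from the question's data frame. It shows that `is.integer` reports how the values are stored, while a modulo test checks whether each value is a whole number.

```r
# Hypothetical example vector, stored as double like a typical numeric column
x <- c(2600, 200.5, 800)

# is.integer() looks at the storage type of the whole vector, not at each value
stored_as_integer <- is.integer(x)        # FALSE

# The L suffix creates a value that really is stored as an integer
literal_is_integer <- is.integer(2600L)   # TRUE

# An element-wise test for whole-number values uses the remainder of division by 1
is_whole <- x %% 1 == 0                   # TRUE FALSE TRUE
```

So a column can hold only whole numbers and still return `FALSE` from `is.integer`, which is why that function does not help with this filter.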

1 Answer

Here is a dplyr solution that removes any rows whose SalesQuantity contains a dot (.), so all values that are not whole numbers are removed. (grepl coerces the numeric column to character before matching.)

library(dplyr)

df %>% filter(!grepl("\\.", SalesQuantity))

 CustomerAccount  SalesDate ProductNumber SalesQuantity
1       Cust00001 2014-01-06       PRD0001          2600
2       Cust00001 2014-01-06       PRD0003           800
3       Cust00002 2014-03-05       PRD0003          2400
4       Cust00002 2014-03-05       PRD0002           600
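As a possible alternative that stays numeric and avoids converting to text, the modulo test from the question can be used directly as a logical filter. This is only a sketch in base R: the data frame below is reconstructed from the sample in the question, and the names `whole`, `whole_subset` and `emptied` are illustrative. The same condition should also work inside `dplyr::filter`.

```r
# Sample data reconstructed from the question
df <- data.frame(
  CustomerAccount = c("Cust00001", "Cust00001", "Cust00001",
                      "Cust00001", "Cust00002", "Cust00002"),
  SalesDate       = as.Date(c("2014-01-06", "2014-01-06", "2014-01-06",
                              "2014-01-06", "2014-03-05", "2014-03-05")),
  ProductNumber   = c("PRD0001", "PRD0002", "PRD0003",
                      "PRD0004", "PRD0003", "PRD0002"),
  SalesQuantity   = c(2600, 200.5, 800, 882.5, 2400, 600)
)

# Keep rows whose quantity has no fractional part (logical index)
whole <- df[df$SalesQuantity %% 1 == 0, ]

# Same filter written with subset()
whole_subset <- subset(df, SalesQuantity %% 1 == 0)

# Pitfall with negative indexing: when which() finds nothing, -integer(0)
# selects no rows, so every row is dropped instead of none
emptied <- whole[-which(whole$SalesQuantity %% 1 > 0), ]
nrow(emptied)  # 0
```

A logical index keeps all rows when nothing matches, which is why it is used here instead of `-which(...)`.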
benson23