I have a dataset with about 3,000 rows, available at https://pastebin.com/i4dYCUQX
Problem: the output contains NA, although there appear to be no NA values in the data. Here is what happens when I sum the values within each category of a column using dplyr or aggregate:
example <- read.csv("https://pastebin.com/raw/i4dYCUQX", header=TRUE, sep=",")
example
# dplyr
library(dplyr)
example %>% group_by(size) %>% summarize_at(vars(volume), funs(sum))
Out:
# A tibble: 4 x 2
size volume
<fctr> <int>
1 Extra Large NA
2 Large NA
3 Medium 937581572
4 Small NA
# aggregate
aggregate(volume ~ size, data=example, FUN=sum)
Out:
size volume
1 Extra Large NA
2 Large NA
3 Medium 937581572
4 Small NA
When I compute the sum for a single category with colSums instead, it seems to work:
# colSums
small <- example %>% filter(size == "Small")
colSums(small["volume"], na.rm = FALSE, dims = 1)
Out:
volume
3869267348
Can anyone imagine what the issue could be?
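A possible explanation, consistent with the output above, is integer overflow. The tibble header shows volume as <int>, and sum() on an integer vector returns NA (with a warning) when the total exceeds .Machine$integer.max, which is 2147483647. The Medium total (937581572) is below that limit, while the Small total reported by colSums (3869267348) is above it; colSums computes in double precision, so it is not affected. The sketch below uses two made-up values, not the pastebin data, to reproduce the behaviour:

```r
# Hypothetical values chosen for illustration only (not taken from the dataset).
x <- c(2000000000L, 1869267348L)  # integer vector whose true total is 3869267348

is.integer(x)          # the vector is stored as 32-bit integers
.Machine$integer.max   # largest value an R integer can hold

# Summing as integer overflows; R returns NA and normally emits a warning,
# which is suppressed here only to keep the example quiet.
int_total <- suppressWarnings(sum(x))

# Converting to double first avoids the overflow.
dbl_total <- sum(as.numeric(x))

print(int_total)
print(dbl_total)
```

If this is indeed the cause, converting the column before summing should remove the NA, for example summarize(volume = sum(as.numeric(volume))) after group_by(size).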