How to pass data through a pipe to colSums

Question

I would like to use %>% to pass data into colSums. Ideally, the same approach should work for other calculations too.

Here is my example:

I can use the following code to reach my goal:

result<- colSums(!is.na(df[ , c("A", "B", "C","D", "RT", "PR", "OTH")]), na.rm = TRUE)

How can I rewrite my code into something that looks like this:

result <- df[ , c("A", "B", "C","D", "RT", "PR", "OTH")] %>%
colSums(!is.na(), na.rm = TRUE)

This code did not work. I got the error: Error in is.na() : 0 arguments passed to 'is.na' which requires 1. Could anyone give me some guidance?

Thanks

Update:

Sample data:

df<-structure(list(A = c("A", NA, NA, NA, NA, NA, NA, NA), B = c(NA, 
NA, "B", NA, NA, NA, NA, NA), C = c(NA, "C", NA, NA, NA, NA, 
NA, NA), D = c(NA, NA, NA, "D", "D", NA, NA, NA), RT = c(NA, 
"RT", NA, NA, NA, NA, "RT", NA), PR = c(NA, NA, "PR", NA, NA, 
NA, NA, NA), OTH = c(NA, NA, NA, NA, "OTH", NA, NA, "OTH")), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))

You already have an answer, but for more general cases, these posts may be useful: [Using the %>% pipe, and dot (.) notation](https://stackoverflow.com/questions/42385010/using-the-pipe-and-dot-notation), [What does the dplyr period character “.” reference?](https://stackoverflow.com/questions/35272457/what-does-the-dplyr-period-character-reference) — Henrik, Mar 31 '21 at 15:01

Gregor Thomas · Accepted Answer · 2021-03-31T17:34:39.003

What the pipe does is put what comes before the pipe as the first argument of what comes after, so

# What the pipe does
## with pipe
x %>% foo(other_arg)
## equivalent to this:
foo(x, other_arg)

## your version piped:
df[ , c("A", "B", "C","D", "RT", "PR", "OTH")] %>%
  colSums(!is.na(), na.rm = TRUE)

## is interpreted like this:
colSums(df[ , c("A", "B", "C","D", "RT", "PR", "OTH")], !is.na(), na.rm = TRUE)

Hopefully the above makes sense, and you can see why you get an error about is.na() needing an argument.
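As a small concrete check of the first-argument rule described above, here is a minimal sketch. It assumes the magrittr package is installed; the vector `x` is made up for illustration and is not from the question.

```r
library(magrittr)

# Hypothetical example vector, not from the question
x <- c(1.234, 5.678)

# Piped form: x is inserted as the first argument of round()
piped <- x %>% round(1)

# Direct form: the call the pipe is equivalent to
direct <- round(x, 1)

# Both forms give the same result
identical(piped, direct)
```

The same substitution applied to `colSums(!is.na(), na.rm = TRUE)` leaves `is.na()` with no argument, which matches the error in the question.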

You can use the pipe, but as you note, the ! takes special handling. As a prefix operator, ! has lower precedence than %>% (see ?Syntax), so a ! written in front of a piped expression is applied to the result of the whole pipeline instead of acting as one step in it. To work around this, we can call ! explicitly as a function rather than as a prefix operator. Alternatively, if you load the magrittr package (the original source of %>%), it provides aliases for cases like this, including the not() function, which is an alias for !. Both are demonstrated below:

df[ , c("A", "B", "C","D", "RT", "PR", "OTH")] %>%
  is.na() %>%
  `!`() %>%
  colSums(na.rm = TRUE)

library(magrittr)
df[ , c("A", "B", "C","D", "RT", "PR", "OTH")] %>%
  is.na() %>%
  not() %>%
  colSums(na.rm = TRUE)

The last one works. I thought `colSums(!is.na(.), na.rm = TRUE)` should work. However, it did not. – Stataq, Mar 31 '21 at 17:25
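A likely reason `colSums(!is.na(.), na.rm = TRUE)` fails: when the dot appears only inside a nested call such as `is.na(.)`, magrittr still inserts the left-hand side as the first argument, so the call becomes `colSums(df, !is.na(df), na.rm = TRUE)`. Wrapping the right-hand side in braces turns that insertion off. The sketch below assumes magrittr is installed and uses a small made-up data frame shaped like the question's data.

```r
library(magrittr)

# Made-up sample in the same shape as the question's data
# (character columns containing NAs)
df <- data.frame(
  A  = c("A", NA, NA, NA),
  RT = c(NA, "RT", NA, "RT"),
  stringsAsFactors = FALSE
)

# Not the intended call: the dot is nested inside is.na(), so df is
# still inserted as the first argument of colSums()
# df %>% colSums(!is.na(.), na.rm = TRUE)

# Braces stop the automatic first-argument insertion,
# so the dot is the only place df is used
piped <- df %>% { colSums(!is.na(.)) }

# Same result as the unpiped call
direct <- colSums(!is.na(df))
identical(piped, direct)
```

`na.rm = TRUE` is left out here because `!is.na()` returns only TRUE or FALSE, so there are no NA values left to remove.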

Anoushiravan R · Answer 2 · 2021-03-31T17:22:46.690

1

I have updated my code as follows. I don't know why, but when I negate is.na I can't get the desired result with a pipe:

colSums(!is.na(df[ , c("A", "B", "C","D", "RT", "PR", "OTH")]))

  A   B   C   D  RT  PR OTH 
  1   1   1   2   2   1   2

This way you can count the values that are not NA, if you want to stick to base R.

edited Mar 31 '21 at 17:22

answered Mar 31 '21 at 14:55

Anoushiravan R


Just bear in mind that when you pipe data into another function, the first argument of that function should be a data frame or a vector. – Anoushiravan R Mar 31 '21 at 14:56
Thanks for the answer. The reason I used `!is.na()` is to count the number of cases with an answer for `A`, `B`, `C`, `D`, etc. They are character variables rather than numbers. Is there any way I can keep `!is.na()`? – Stataq Mar 31 '21 at 16:15
You're welcome. Not in the `colSums`, but you can use it in your result vector. – Anoushiravan R Mar 31 '21 at 16:18
I just updated the post with sample data. Thanks. – Stataq Mar 31 '21 at 16:21
I updated my code. It's a somewhat odd case: with `is.na` we can use the pipe, but if I negate it, the result is not the one you want. – Anoushiravan R Mar 31 '21 at 16:41

score 0 · Answer 3 · answered Mar 31 '21 at 16:45

dplyr style would be

result <- df[ , c("A", "B", "C","D", "RT", "PR", "OTH")] %>% mutate(across(everything(), ~colSums(!is.na(.), na.rm = TRUE)))

answered Mar 31 '21 at 16:45

AnilGoyal


I got this error: Error: Problem with `mutate()` input `..1`. x 'x' must be an array of at least two dimensions. i Input `..1` is `across(everything(), ~colSums(!is.na(.), na.rm = TRUE))`. – Stataq Mar 31 '21 at 17:09
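The error above arises because `across()` hands each column to the function as a plain vector, while `colSums()` needs an input with at least two dimensions. One possible dplyr-style variant, offered as a sketch and not as the answerer's intended code, replaces `colSums()` with `sum()` and `mutate()` with `summarise()`. It assumes dplyr 1.0.0 or later and uses a small made-up tibble.

```r
library(dplyr)

# Made-up sample in the same shape as the question's data
df <- tibble(
  A  = c("A", NA, NA, NA),
  RT = c(NA, "RT", NA, "RT")
)

# across() passes one column at a time as a vector, so sum() is used
# here in place of colSums(), which requires a matrix or data frame
counts <- df %>%
  summarise(across(everything(), ~ sum(!is.na(.x))))

counts
```

This returns a one-row tibble instead of the named vector that `colSums()` produces; `unlist(counts)` converts it to a named vector if that shape is needed.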
