I have a data frame with different kinds of variables (numeric, character, factor) in the columns, which I would like to summarise all at once. I have an ID column that should be counted according to the levels of the other columns.
Each character or factor column has its own set of levels, and I would like to know the frequency of the IDs for each level. In addition, if a column is numeric, I would like summary statistics such as mean, sd, and quantiles returned.
Ideally I would do this with dplyr, using the group_by() and summarise() functions, but that requires me to group by one column at a time and then specify whether I want it counted with n() or, because it is numeric, summarised with statistics.
In SAS there is a command known as PROC FREQ, which I am trying to replicate.
library(dplyr)

# Example data: one ID column, one numeric column, two character columns
df <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6),
  Age = c(20, 30, 45, 60, 70, 18),
  Car = c("Zum", "Yat", "Zum", "Zum", "Yat", "Rel"),
  Side = c("Left", "Right", "Left", "Left", "Right", "Right")
)
Result (currently obtained one column at a time):
df %>% group_by(Car) %>% summarise(n = n())
df %>% group_by(Side) %>% summarise(n = n())
df %>% summarise(mean = mean(Age))
I would like to obtain this result in a single output and for many variables. My real df contains tens of columns, each of which should or should not be a grouping variable depending on its type. In addition, the ID could even be repeated, with the same values for the observations to be summarised.
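To make the target shape concrete, here is a rough sketch of the kind of single long table I have in mind. The helper summarise_column() and the output column names (variable, statistic, value) are made up for illustration, and counting distinct IDs per level is only my assumption about how repeated IDs should be handled. It is a sketch of the idea, not necessarily the best way to do it.

```r
library(dplyr)

df <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6),
  Age = c(20, 30, 45, 60, 70, 18),
  Car = c("Zum", "Yat", "Zum", "Zum", "Yat", "Rel"),
  Side = c("Left", "Right", "Left", "Left", "Right", "Right"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise one column according to its type.
# Numeric columns get mean, sd and quartiles; character/factor columns
# get the number of distinct IDs per level (an assumption, see above).
summarise_column <- function(data, col, id = "ID") {
  x <- data[[col]]
  if (is.numeric(x)) {
    data.frame(
      variable = col,
      statistic = c("mean", "sd", "q25", "median", "q75"),
      value = c(mean(x), sd(x), quantile(x, c(0.25, 0.5, 0.75), names = FALSE)),
      stringsAsFactors = FALSE
    )
  } else {
    counts <- tapply(data[[id]], x, dplyr::n_distinct)
    data.frame(
      variable = col,
      statistic = names(counts),
      value = as.numeric(counts),
      stringsAsFactors = FALSE
    )
  }
}

# Apply the helper to every column except the ID and stack the pieces.
cols <- setdiff(names(df), "ID")
result <- bind_rows(lapply(cols, function(col) summarise_column(df, col)))
print(result)
```

Each row of result is one statistic (or one level) of one variable, so tens of columns would simply produce a longer table rather than tens of separate outputs.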