As a former SPSS user I was wondering if anybody knows of an equivalent to the 'examine' command in R? E.g.:
EXAMINE VARIABLES=income by sex
/CINTERVAL 95.
Cheers
The closest I'm aware of is the describe() function found in the psych package.
Here's some code you can run to see how it works:
install.packages('psych')  # only needed once
library(psych)
data('mtcars')             # built-in example data set
# Standard describe function
describe(mtcars$mpg)
# Also report the interquartile range (adds an IQR column)
describe(mtcars$mpg, IQR = TRUE)
The output for describe(mtcars$mpg, IQR = TRUE):
vars n mean sd median trimmed mad min max range skew kurtosis se IQR
X1 1 32 20.09 6.03 19.2 19.7 5.41 10.4 33.9 23.5 0.61 -0.37 1.07 7.38
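Note that describe() reports the standard error (se) but not the confidence interval that /CINTERVAL 95 asks for. A minimal sketch in base R, assuming the usual t-based interval for the mean is what you want (the variable names here are only for illustration):

```r
# 95% confidence interval for the mean of mpg, computed by hand
x  <- mtcars$mpg
n  <- length(x)
se <- sd(x) / sqrt(n)                         # same se that describe() reports
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se
ci

# The same interval taken from t.test()
t.test(x, conf.level = 0.95)$conf.int
```

Change 0.975 (or conf.level) if you need a different confidence level.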
One can also handle a level of by-group processing by adding split() and lapply() to Matt's answer. For example, to obtain descriptive statistics on mtcars$mpg by number of cylinders, we do the following:
library(psych)
# Split the data frame into one piece per cylinder count
splitvar <- as.factor(mtcars$cyl)
data <- split(mtcars,splitvar)
# Describe mpg within each piece
lapply(data,function(x){describe(x$mpg,IQR=TRUE)})
...and the output:
> lapply(data,function(x){describe(x$mpg,IQR=TRUE)})
$`4`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR
X1 1 11 26.66 4.51 26 26.44 6.52 21.4 33.9 12.5 0.26 -1.65 1.36 7.6
$`6`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR
X1 1 7 19.74 1.45 19.7 19.74 1.93 17.8 21.4 3.6 -0.16 -1.91 0.55 2.35
$`8`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR
X1 1 14 15.1 2.56 15.2 15.15 1.56 10.4 19.2 8.8 -0.36 -0.57 0.68 1.85
>
We can also add quantiles via the quant= argument. Here we'll generate the 5th and 95th percentile values.
lapply(data,function(x){describe(x$mpg,quant=c(.05,.95),IQR=TRUE)})
...and the output:
> lapply(data,function(x){describe(x$mpg,quant=c(.05,.95),IQR=TRUE)})
$`4`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR Q0.05 Q0.95
1 1 11 26.66 4.51 26 26.44 6.52 21.4 33.9 12.5 0.26 -1.65 1.36 7.6 21.45 33.15
$`6`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR Q0.05 Q0.95
1 1 7 19.74 1.45 19.7 19.74 1.93 17.8 21.4 3.6 -0.16 -1.91 0.55 2.35 17.89 21.28
$`8`
vars n mean sd median trimmed mad min max range skew kurtosis se IQR Q0.05 Q0.95
1 1 14 15.1 2.56 15.2 15.15 1.56 10.4 19.2 8.8 -0.36 -0.57 0.68 1.85 10.4 18.88
>
...posting as Community wiki to avoid taking credit for Matt's answer.
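If the percentiles need to line up with SPSS, the interpolation method matters. As a rough check, the Q0.05 and Q0.95 values shown for the 4-cylinder group agree with base R's quantile() at its default type = 7, while the R documentation lists type = 6 as the definition used by SPSS. A small sketch (the object names are just for illustration):

```r
# mpg values for the 4-cylinder cars
mpg4 <- mtcars$mpg[mtcars$cyl == 4]

# Default R definition (type 7)
q7 <- quantile(mpg4, probs = c(.05, .95))
q7

# Definition that ?quantile describes as the one used by SPSS (type 6)
q6 <- quantile(mpg4, probs = c(.05, .95), type = 6)
q6
```

With groups this small the two definitions can give noticeably different tail percentiles, so it is worth checking before comparing against old SPSS output.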
It turns out that the psych package has a separate function, describeBy(), that allows one to have multiple by-group variables, which more closely emulates the behavior of the SPSS EXAMINE procedure.
We'll demonstrate with the mtcars data frame, using the cyl and am columns as by groups.
library(psych)
describeBy(mtcars,group = c("cyl","am"),quant=c(.05,.95))
...and the output for the first two by group combinations:
Descriptive statistics by group
cyl: 4
am: 0
vars n mean sd median trimmed mad min max range skew kurtosis se Q0.05
mpg 1 3 22.90 1.45 22.80 22.90 1.93 21.50 24.40 2.90 0.07 -2.33 0.84 21.63
cyl 2 3 4.00 0.00 4.00 4.00 0.00 4.00 4.00 0.00 NaN NaN 0.00 4.00
disp 3 3 135.87 13.97 140.80 135.87 8.75 120.10 146.70 26.60 -0.31 -2.33 8.07 122.17
hp 4 3 84.67 19.66 95.00 84.67 2.97 62.00 97.00 35.00 -0.38 -2.33 11.35 65.30
drat 5 3 3.77 0.13 3.70 3.77 0.01 3.69 3.92 0.23 0.38 -2.33 0.08 3.69
wt 6 3 2.94 0.41 3.15 2.94 0.06 2.46 3.19 0.73 -0.38 -2.33 0.24 2.53
qsec 7 3 20.97 1.67 20.01 20.97 0.01 20.00 22.90 2.90 0.38 -2.33 0.97 20.00
vs 8 3 1.00 0.00 1.00 1.00 0.00 1.00 1.00 0.00 NaN NaN 0.00 1.00
am 9 3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 NaN NaN 0.00 0.00
gear 10 3 3.67 0.58 4.00 3.67 0.00 3.00 4.00 1.00 -0.38 -2.33 0.33 3.10
carb 11 3 1.67 0.58 2.00 1.67 0.00 1.00 2.00 1.00 -0.38 -2.33 0.33 1.10
Q0.95
mpg 24.24
cyl 4.00
disp 146.11
hp 96.80
drat 3.90
wt 3.19
qsec 22.61
vs 1.00
am 0.00
gear 4.00
carb 2.00
----------------------------------------------------------------------
cyl: 6
am: 0
vars n mean sd median trimmed mad min max range skew kurtosis se Q0.05
mpg 1 4 19.12 1.63 18.65 19.12 1.04 17.80 21.40 3.60 0.48 -1.91 0.82 17.85
cyl 2 4 6.00 0.00 6.00 6.00 0.00 6.00 6.00 0.00 NaN NaN 0.00 6.00
disp 3 4 204.55 44.74 196.30 204.55 42.55 167.60 258.00 90.40 0.17 -2.25 22.37 167.60
hp 4 4 115.25 9.18 116.50 115.25 9.64 105.00 123.00 18.00 -0.09 -2.33 4.59 105.75
drat 5 4 3.42 0.59 3.50 3.42 0.62 2.76 3.92 1.16 -0.09 -2.33 0.30 2.81
wt 6 4 3.39 0.12 3.44 3.39 0.01 3.21 3.46 0.25 -0.73 -1.70 0.06 3.25
qsec 7 4 19.21 0.82 19.17 19.21 0.85 18.30 20.22 1.92 0.11 -2.02 0.41 18.39
vs 8 4 1.00 0.00 1.00 1.00 0.00 1.00 1.00 0.00 NaN NaN 0.00 1.00
am 9 4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 NaN NaN 0.00 0.00
gear 10 4 3.50 0.58 3.50 3.50 0.74 3.00 4.00 1.00 0.00 -2.44 0.29 3.00
carb 11 4 2.50 1.73 2.50 2.50 2.22 1.00 4.00 3.00 0.00 -2.44 0.87 1.00
Q0.95
mpg 21.07
cyl 6.00
disp 253.05
hp 123.00
drat 3.92
wt 3.46
qsec 20.10
vs 1.00
am 0.00
gear 4.00
carb 4.00
----------------------------------------------------------------------
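describeBy() still leaves out the confidence interval for the mean, so for the /CINTERVAL part something like the following could be used alongside it. This is only a sketch in base R: ci_stats is a hypothetical helper written for this example (not part of psych), and it assumes a t-based interval.

```r
# Hypothetical helper: n, mean and t-based confidence limits for one group
ci_stats <- function(x, level = 0.95) {
  n    <- length(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(n = n, mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
}

# One row per cyl/am combination; aggregate() stores the results in a matrix column
res <- aggregate(mpg ~ cyl + am, data = mtcars, FUN = ci_stats)

# Flatten the matrix column into ordinary columns (mpg.n, mpg.mean, ...)
res <- do.call(data.frame, res)
res
```

For the original example this would be along the lines of aggregate(income ~ sex, data = yourdata, FUN = ci_stats), where yourdata stands in for whatever data frame holds income and sex.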