I have a data frame in long format showing repeated measures of height on a group of individuals.

The mean number of observations is 2000/500 = 4 observations per child.

How can I calculate the median and interquartile range for the number of observations per child?

data <- data.frame(
  child_id = 1:500,
  height_1 = rnorm(500, mean = 80, sd = 2),
  height_2 = rnorm(500, mean = 90, sd = 2),
  height_3 = rnorm(500, mean = 100, sd = 2),
  height_4 = rnorm(500, mean = 115, sd = 2)
)

data_long <- reshape(
  data,
  varying = c("height_1", "height_2", "height_3", "height_4"),
  direction = "long", idvar = "child_id", timevar = "time", sep = "_"
)

# Mean number of observations per child = 2000/500 = 4
data_long$id_f <- as.factor(data_long$child_id)
length(unique(data_long$id_f)) # 500 children

length(data_long$height) # 2000 observations
aelhak

1 Answer

We can use dplyr. Grouping by 'child_id', compute the median and IQR of the 'height' column for each child:

library(dplyr)
data_long %>% 
   group_by(child_id) %>%
   summarise(median = median(height),
            interQuartileRange = IQR(height))

If we want the median and IQR of the number of observations per child, count the rows for each 'child_id' first and then summarise the counts:

data_long %>%
  count(child_id) %>% 
  summarise(median = median(n), IQR = IQR(n))
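
Note that in the simulated data every child has exactly 4 rows, so the median count is 4 and the IQR is 0. As a rough base R sketch of the same idea on unbalanced data, the example below uses a small hypothetical data frame (`toy`, not part of the question) and counts only non-missing heights. Treating an NA height as "not an observation" is an assumption; drop the `!is.na()` step if every row should count.

```r
# Hypothetical toy data (not from the question): unequal numbers of
# measurements per child, with one missing height for child 3
toy <- data.frame(
  child_id = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  height   = c(80, 81, 90, 79, 91, NA, 80, 90, 100, 115)
)

# Number of non-missing height measurements per child
n_obs <- as.vector(tapply(!is.na(toy$height), toy$child_id, sum))

# Median and interquartile range of the per-child counts
median(n_obs)
IQR(n_obs)

# The quartiles themselves, if the IQR is to be reported as a range
quantile(n_obs, probs = c(0.25, 0.75))
```

`IQR()` and `quantile()` use R's default quantile definition (type 7) here; other definitions can give slightly different quartiles on small samples.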
Jaap
akrun