
I have a question about cell sizes by time point in my longitudinal data set. I want to use a GAM to estimate a smooth curve for each category of the X variable.

The data come from a cohort study that collected 5 waves of data. For my growth-curve mixed model, I will use age instead of wave as the time variable, so that I can estimate age-specific Y trajectories. My predictor X is a 4-category variable, and my goal is to estimate a separate trajectory for each of its categories.

However, when I look at the frequency distribution of observations by age and X, some cells are very small. In that case, will the estimated trajectories for the four X categories be unstable at the ages where the cells are sparse?
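To make the cell-size check concrete, here is a minimal Python sketch of how the sparse age-by-X cells could be listed. The records, the function name `sparse_cells`, and the threshold `min_n` are all hypothetical and chosen only for illustration. The threshold is not a statistical rule.

```python
from collections import Counter

# Hypothetical (age, x) pairs, one per observation; real data would be
# read from the cohort file instead.
records = [(12, 1), (12, 1), (12, 2), (13, 1), (13, 3), (40, 4)]

def sparse_cells(rows, ages, groups, min_n=5):
    """Return {(age, x): count} for every cell with fewer than min_n rows.

    Cells that never occur in the data are reported with a count of 0.
    """
    counts = Counter(rows)
    return {
        (age, x): counts.get((age, x), 0)
        for age in ages
        for x in groups
        if counts.get((age, x), 0) < min_n
    }

# Ages 12 to 40 and X categories 1 to 4, as described in the question.
cells = sparse_cells(records, ages=range(12, 41), groups=range(1, 5), min_n=2)
for (age, x), n in sorted(cells.items())[:5]:
    print(f"age={age} x={x} n={n}")
```

Listing the sparse cells this way shows whether the thin data are concentrated at particular ages (for example, the extremes of the 12 to 40 range) or spread across the whole range.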

My age variable ranges from 12 to 40. Should I recode age into categories to enlarge the cell sizes by age and the 4-category X variable (see the recode below)?

recode ageintw (12/14=1) (15/17=2) (18/20=3) (21/23=4) (24/27=5) (28/32=6) (33/35=7) (36/38=8) (39/40=9)
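For readers who want to try the same banding outside Stata, here is a minimal Python sketch of it. It assumes each of the nine age ranges listed in the recode is meant to get its own code, numbered 1 to 9. The names `LOWER_BOUNDS` and `age_band` are hypothetical.

```python
from bisect import bisect_right

# Lower bound of each age band, taken from the cut points in the recode:
# 12-14, 15-17, 18-20, 21-23, 24-27, 28-32, 33-35, 36-38, 39-40.
LOWER_BOUNDS = [12, 15, 18, 21, 24, 28, 33, 36, 39]

def age_band(age):
    """Map an age in [12, 40] to its band number (1 to 9)."""
    if not 12 <= age <= 40:
        raise ValueError(f"age {age} is outside the 12-40 range")
    # bisect_right counts how many lower bounds are <= age,
    # which is exactly the 1-based band number.
    return bisect_right(LOWER_BOUNDS, age)

for age in (12, 14, 15, 27, 40):
    print(age, age_band(age))
```

Banding like this raises the count in each cell, but the model then sees only the band number, so differences between ages within a band are lost.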

But the problem with using age categories is that my estimated trajectories will then change across collapsed age categories instead of over continuous age.

Does any expert have some advice to share?

Thanks,

Pauline

P.S. Age frequency distribution by each category of X (X = 1, 2, 3, 4): [frequency tables not shown]

Note: I tried collapsing age into categories and then estimating the smooth curves, but I would prefer to use continuous age.

  • Please provide enough code so others can better understand or reproduce the problem. – Community Mar 21 '23 at 23:05
  • Welcome to Stack Overflow. Please take the tour: https://stackoverflow.com/tour Please provide information per the guidelines in the following link by editing your question to add the info: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – John Polo Mar 22 '23 at 00:25
