I have a large dataset with more than 250,000 observations, and I would like to use the TraMineR
package for my analysis. In particular, I would like to use the commands seqtree
and seqdist
, which works fine when I for example use a subsample of 10,000 observations. The limit my computer can manage is around 20,000 observations.
I would like to use all the observations and I do have access to a supercomputer who should be able to do just that. However, this doesn't help much as the process runs on a single core only. Therefore my question, is it possible to apply parallel computing technics to the above mentioned commands? Or are there other ways to speed up the process? Any help would be appreciated!