I am trying to import the GeoLife GPS trajectory dataset into my workspace. The data folder contains 182 sub-folders, one per tracked user, and each sub-folder holds that user's trajectory files in .plt format. The number of trajectory files is not fixed; it may differ from user to user. Also, the columns do not all share one data type: the first 5 columns are floats and the last 2 are strings (date and time). My objective is to store this data in a cell array of size 182 in which each slot holds one user's trajectories. To do this, I used the getAllFiles function given here. Then I visited every file returned by this function and stored the data as follows:
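fileList = getAllFiles('C:\...\MATLAB\HmMDTW\Data');
i = 1;
k = 1;
trj = [];
trjAll = cell(182,1);
pid = '000';
while i <= length(fileList)   % <= so that the last file is also visited
    fileDir = fileList{i};
    % strfind returns [] when the path does not contain 'Trajectory'
    index = strfind(fileDir, 'Trajectory') + 11;
    if any(index)
        fid = fopen(fileDir);
        % 5 numeric columns, then date and time as strings; skip the 6 header lines
        t = textscan(fid, '%f %f %f %f %f %s %s', 'Delimiter', ',', 'HeaderLines', 6, 'CollectOutput', 1);
        cid = fileDir(45:47);   % user ID (000, ..., 181) at a fixed offset in the path
        if ~strcmp(cid, pid)
            trjAll{k} = trj;    % user changed: store the finished user's trajectories
            trj = [];
            k = k + 1;
        end
        pid = cid;
        trj = [trj; t];
        fclose(fid);
    end
    i = i + 1;
end
trjAll{k} = trj;   % store the last user's trajectories, which the loop never reaches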
Above, I check every file, and if it is a trajectory file I read its data and append it to trj (the trajectory list for the current user). If the user ID (000, ..., 181) changes at the next file, I store trj in trjAll (the array holding all users' trajectories) and reset trj. Hence, trjAll ends up containing all the trj-s. However, this takes a long time (about 4-5 minutes). Is there a more efficient way to achieve what I want? I thought about reading the files inside the getAllFiles function, but I do not think that would save a significant amount of time. Thank you in advance.
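
For comparison, the same per-user grouping can be sketched by walking the user folders directly with dir instead of relying on fixed character offsets in the path. This is only a sketch under assumptions: the function name readGeoLife and its argument rootDir are hypothetical, the layout is assumed to be rootDir\<userId>\Trajectory\*.plt, and the order of the users is simply whatever order dir returns. It is not a measured speed-up.

```matlab
function trjAll = readGeoLife(rootDir)
% Hypothetical helper (not part of the original code): reads every .plt
% file under rootDir\<userId>\Trajectory and groups the results per user.
users = dir(rootDir);
users = users([users.isdir]);                          % keep folders only
users = users(~ismember({users.name}, {'.', '..'}));   % drop . and ..
trjAll = cell(numel(users), 1);
for u = 1:numel(users)
    trjDir = fullfile(rootDir, users(u).name, 'Trajectory');
    files = dir(fullfile(trjDir, '*.plt'));
    trj = cell(numel(files), 2);   % preallocated: one row per file
    for f = 1:numel(files)
        fid = fopen(fullfile(trjDir, files(f).name), 'r');
        if fid < 0
            continue;              % unreadable file: leave its row empty
        end
        % Same format as above: 5 numeric columns, then date and time
        trj(f, :) = textscan(fid, '%f %f %f %f %f %s %s', ...
            'Delimiter', ',', 'HeaderLines', 6, 'CollectOutput', 1);
        fclose(fid);
    end
    trjAll{u} = trj;
end
end
```

It would be called as trjAll = readGeoLife('C:\...\MATLAB\HmMDTW\Data'). The sketch mainly removes the fixed path offsets and the repeated growth of trj; whether it actually runs faster would have to be checked with tic/toc or the profiler.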