I am currently working on airmass trajectories for 11 different stations all over the city for one year. For each station I have dataframes of 72-hour trajectories that look like this:
date lon/lat
yymmddhh_1 lon_1
yymmddhh_1 lat_1
yymmddhh_1 lon_2
yymmddhh_1 lat_2
yymmddhh_1 lon_3
yymmddhh_1 lat_3
I didn't put the longitude and latitude values in separate columns because I need them to be in one for my analysis.
The date column starts with a certain day (in my case 011022, i.e. 22/10/2001) and goes backwards for 72 hours in 1-hour steps, leaving me with 146 separate lon/lat values per trajectory (73 time steps x 2 coordinates). I have trajectories for 329 days, so the dimension of the dataframe is dim = 48034 x 2 (329 x 146 = 48034 rows).
Now I need a new dataframe where the columns are my backward timesteps (t-0, t-1, t-2,...,t-72) and each row represents one trajectory (yymmddhh_1,yymmddhh_2,...,yymmddhh_329).
date t-0 t-0 t-1 t-1
yymmddhh_1 lon_1 lat_1 lon_2 lat_2
yymmddhh_2 lon_1 lat_1 lon_2 lat_2
yymmddhh_3 lon_1 lat_1 lon_2 lat_2
So I think my code needs to read column 2 of my current dataframe up to row=146, write these values in the first row of my new dataframe, and repeat the process until the end of the dataframe is reached.
I already managed to do that for the first 146 values, which is rather easy because I just need to run:
trajectory_1 <- t(station.trajectory[1:146,2])
I also already created the date column.
Maybe I can use read.table? I really have no idea where to start with this, so any help would be highly appreciated.
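A minimal sketch of the reshaping described above, assuming every trajectory occupies exactly 146 consecutive rows: column 2 can then be refilled row by row with matrix(..., byrow = TRUE). The data below are made up, and toy.trajectory, n.days and wide are hypothetical names; only the block length of 146 comes from the description above.

```r
# Toy stand-in for one station: 3 trajectories of 146 values each
# (made-up numbers; the real data would come from station.trajectory).
n.per.traj <- 146                      # 73 time steps x 2 coordinates
n.days     <- 3                        # hypothetical; 329 in the real data
toy.trajectory <- data.frame(
  date  = rep(seq_len(n.days), each = n.per.traj),
  value = seq_len(n.days * n.per.traj)
)

# Fill a matrix row by row: each row receives 146 consecutive values.
wide <- matrix(toy.trajectory[, 2], ncol = n.per.traj, byrow = TRUE)

# Label the columns t-0.lon, t-0.lat, t-1.lon, ... (this naming is an assumption).
colnames(wide) <- paste0("t-", rep(0:72, each = 2), c(".lon", ".lat"))

dim(wide)
```

With the toy data, dim(wide) is 3 by 146, i.e. one row per trajectory. This only works if no trajectory has missing rows; blocks of unequal length would need a different approach.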
EDIT: To clear things up, here's an example of what the current dataframe looks like, and what the new one should look like:
[,1] is the date (format: YYMMDDHH), [,2] contains the lon/lat values:
[,1] [,2]
[1,] 2071000 525500
[2,] 2071000 133300
[3,] 2070923 524918
[4,] 2070923 134759
[5,] 2070922 524238
[6,] 2070922 136058
...
[146,] 2070700 140147
[147,] 2071100 525500
[148,] 2071100 133300
[149,] 2071023 525142
[150,] 2071023 128926
Note that at [147,] a new trajectory begins, for the day following [1,].
Keeping the content of [,1] is not important here. What my code should do in the end is take [,2] and make it look like this:
[,1] [,2] [,3] [,4] [,5]
[1,] 2071000 525500 133300 524918 134759
[2,] 2071100 ... ... ... ...
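As a small worked illustration of that target layout, the sketch below uses the eight values shown in the example (rows 1 to 4 and 147 to 150) and pretends each trajectory has only 4 rows instead of 146. That shortening is an assumption made purely to keep the example small, and the names current, block, start.rows and wide are hypothetical.

```r
# The eight values visible in the example; in the real data each block has 146 rows.
current <- cbind(
  c(2071000, 2071000, 2070923, 2070923, 2071100, 2071100, 2071023, 2071023),
  c( 525500,  133300,  524918,  134759,  525500,  133300,  525142,  128926)
)
block <- 4  # shortened block length (assumption); 146 for the real data

# The first date of every block becomes the row label in column 1.
start.rows <- seq(1, nrow(current), by = block)

# Column 2 is refilled row by row, one block per row.
values <- matrix(current[, 2], ncol = block, byrow = TRUE)

wide <- cbind(current[start.rows, 1], values)
wide
```

The first row of wide then reads 2071000 525500 133300 524918 134759, matching row [1,] of the desired output.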
EDIT 2: I should also add that I am trying to prepare my data for k-means clustering (http://stat.ethz.ch/R-manual/R-devel/library/stats/html/kmeans.html). Maybe I am not understanding the manual properly, but to me it looks like each trajectory should have its own row...
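For reference, kmeans() from the stats package treats each row of a numeric matrix as one observation, so one row per trajectory fits that layout. A minimal sketch on random data follows; the matrix wide.toy, its 20 rows and the choice of 3 centers are assumptions for illustration only.

```r
set.seed(1)  # reproducible toy data

# 20 hypothetical trajectories with 146 values each (random numbers).
wide.toy <- matrix(rnorm(20 * 146), nrow = 20, ncol = 146)

# Each row is one observation; centers = 3 is an arbitrary choice.
fit <- kmeans(wide.toy, centers = 3)

length(fit$cluster)   # one cluster label per row (trajectory)
dim(fit$centers)      # one center per cluster, 146 coordinates each
```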
EDIT 3:
I tried writing a loop to do the work.
ind1<- matrix()
ind1 <- cbind(seq(0,48034,146))
ind1[1,] <- 1
First I created an index with steps of 146 (the code above). My final dataframe will be named beusselstr.dataframe:
beusselstr.dataframe <- NULL
k<- NULL
The station "beusselstr" only has 115 days, so I want to use only the first 115 index values, up to row 16790 (115 x 146):
for (j in 1:115){
k[j] <- ind1[j+1]
beusselstr.dataframe[j] <- cbind(beusselstr.dataframe[j],t(beusselstr.trajectories[ind1[j]:k[j],2]))
}
However, I receive the error "number of items to replace is not a multiple of replacement length".
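A possible explanation, offered as a sketch rather than a confirmed diagnosis: beusselstr.dataframe[j] addresses a single element, while the right-hand side holds a whole trajectory, so the lengths do not match. In addition, with the index values 1, 146, 292, ... the range ind1[j]:k[j] overlaps from the second block on (146:292 contains 147 values, not 146). The loop below preallocates a matrix and fills one full row per iteration. It runs on made-up data, and toy.trajectories, n.traj and result are hypothetical names.

```r
n.per.traj <- 146
n.traj     <- 5     # hypothetical; 115 for the station in question

# Made-up two-column data: a block id and a running value.
toy.trajectories <- cbind(rep(seq_len(n.traj), each = n.per.traj),
                          seq_len(n.traj * n.per.traj))

# Preallocate: one row per trajectory, one column per lon/lat value.
result <- matrix(NA_real_, nrow = n.traj, ncol = n.per.traj)

for (j in seq_len(n.traj)) {
  first <- (j - 1) * n.per.traj + 1    # 1, 147, 293, ...
  last  <- j * n.per.traj              # 146, 292, 438, ...
  result[j, ] <- toy.trajectories[first:last, 2]   # fill a whole row at once
}

dim(result)
```

Assigning to result[j, ] replaces an entire row, so the number of replacement values (146) matches the number of slots, and the blocks no longer overlap.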