Someone has sent me a text file that has no header and no row structure: all records simply follow one another on the same line of the file.
The only thing I know is that every 16 comma-separated items should form a single row in the final output, because there are 16 variables (columns) for each plot. Each line of the raw file contains the measurements of the 16 variables for 438 different plots on one day. In total, the raw file contains 4015 lines (days), and I assume each line holds 438 x 16 = 7008 items (there might be NAs).
I have managed to read the file with:
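# Read each line (one day) as a single string
x <- readLines("Data/meteodata.dat")
# Split on commas, flatten, and fill a 16-column matrix row by row (one row per plot and day)
x <- as.data.frame(matrix(as.numeric(unlist(strsplit(x, ","))), ncol = 16, byrow = TRUE))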
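Since the 438 x 16 items per line is only an assumption, it may be worth checking before reshaping, because a single short or long line would shift every following row of the matrix. A minimal sketch of such a check, using a small made-up vector (raw, n_plots and n_vars are illustrative names, not part of the real data) in place of the file:

```r
# Toy stand-in for readLines("Data/meteodata.dat"): 2 "days", 3 plots, 4 variables
# (assumption for illustration; the real file would use n_plots = 438, n_vars = 16)
n_plots <- 3
n_vars  <- 4
raw <- c(
  "1,2,3,4,5,6,7,8,9,10,11,12",
  "1,2,3,4,5,6,7,8,9,10,11"      # deliberately one item short
)

# Number of comma-separated items on each line
# (note: strsplit() silently drops an empty field at the very end of a line)
n_items <- lengths(strsplit(raw, ",", fixed = TRUE))

# Lines whose item count differs from the expected n_plots * n_vars
bad_lines <- which(n_items != n_plots * n_vars)
bad_lines
```

If bad_lines comes back empty for the real file, the row-by-row reshaping into 16 columns keeps the plots aligned across days.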
However, I now need to aggregate the variables by plot, so that instead of this huge dataset (4015 x 438 rows) I end up with only 438 rows and 16 columns containing the mean value of each variable for each plot. The problem is that the data do not contain an identifier for each plot to group by. The key is that this messy file was generated by someone else from a dataset (y) with 438 rows (one per plot) that does contain plot labels, and the plots appear in the same order:
> nrow(y)
[1] 438
> head(y)
  CODE_PLOT CODE_COUNTRY
1         1            1
2         1           12
3         1           14
4         1           15
5         1            5
6         1           50
Hence, within each line of the messy file, the 438 consecutive blocks of 16 items correspond to the CODE_COUNTRY and CODE_PLOT rows of y, in the same order.
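To make the goal concrete, here is a rough sketch of the result I am after, on made-up toy data (3 plots, 2 days, 2 variables; the column names temp and precip and the helper names n_plots, plot_id, means and result are only for illustration). It assumes every line really holds the same number of plots in the same order as y, so the plot index simply repeats from 1 to n_plots once per day:

```r
# Toy stand-ins (assumptions): 2 days, 3 plots, 2 variables
# instead of 4015 days, 438 plots, 16 variables
n_plots <- 3
y <- data.frame(CODE_PLOT = c(1, 1, 2), CODE_COUNTRY = c(1, 12, 1))
x <- data.frame(
  temp   = c(1, 2, 3,   3, 4, NA),   # day 1 (plots 1..3), then day 2
  precip = c(0, 1, 2,   2, 3, 4)
)

# Rows of x cycle through the plots once per day, so the plot index repeats 1..n_plots
stopifnot(nrow(x) %% n_plots == 0)
plot_id <- rep(seq_len(n_plots), times = nrow(x) / n_plots)

# Mean of every variable per plot, ignoring NAs
means <- aggregate(x, by = list(plot_id = plot_id), FUN = mean, na.rm = TRUE)

# Order by plot_id so the labels from y line up row by row, then drop the helper column
result <- cbind(y, means[order(means$plot_id), -1])
result
```

With the real data, n_plots would be 438 and x the 16-column data frame read above; averaging columns such as date would of course not be meaningful.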
Thanks and sorry for such an abstract message.
Example of the raw file before reading it (the first two plots of one line):
48.25,4.25,1.989e+07,2.6,5.89,1.28,0.02,0,0,0.42,3575,0,-0.4,2.6,2.57,6.48,50,6,1.989e+07,3.55,5.42,2.31,0.42,0,0,0.15,2420,0,0.27,3.55,2,7.8
Example of the dataset after reading it:
> head(test)
lat long date temp.mean temp.max temp.min precip E0 ES0 ET0 radiation snow.depth
1 48.25 4.25 19890000 2.60 5.89 1.28 0.02 0.00 0.00 0.42 3575 0.00
2 50 6 19890000 3.55 5.42 2.31 0.42 0.00 0.00 0.15 2420 0.00
3 47.75 16.25 19890000 0.67 3.98 -0.92 0.63 0.08 0.00 0.53 5061 0.02
4 69.5 29 19890000 -13.63 -10.06 -20.20 0.10 0.00 0.00 0.02 70 16.56
5 41.75 13.5 19890000 2.05 8.79 -1.72 0.00 0.20 0.06 0.54 8206 0.10
6 47 8.75 19890000 -4.29 2.62 -7.97 0.00 0.00 0.00 0.21 7403 5.45
water.balance temp.mean2 wind P_hPa
1 -0.40 2.60 2.57 6.48
2 0.27 3.55 2.00 7.80
3 0.10 0.67 3.63 5.17
4 0.08 -13.63 3.65 1.78
5 -0.54 2.05 1.58 6.18
6 -0.21 -4.29 1.22 2.87