How to run the Kennard-Stone algorithm with a multilayer raster?

Question

I am dealing with a spatial dataset that I need to divide into a training and a validation subset. To be specific, I have a raster with 31 bands; I need to use all of them as parameters for the division of the dataset into the two subsets. I wish to use the Kennard-Stone algorithm for the division, so I have looked into the two existing functions that can be used in R.

The first is ken.sto in the soil.spec package. The second is duplex in the prospectr package (on CRAN). The problem is that both of them require a matrix or a data frame as input, while I have a multilayer raster that I can only convert into an array.

Does anyone have any suggestion on how to transform my spatial data, so that it can be used in one of the KS functions?

Welcome to Stack Overflow. Your question is pretty broad. This site is for asking questions about specific coding issues such as errors, incorrect output, etc. You should also add R as a tag so that people who know the language can find your post. This question might be a better fit on Computer Science. — curt, Jan 05 '17 at 17:42

maRtin · Accepted Answer · 2017-01-05T17:56:55.970

0

If ras is your stack, you can use as.data.frame(ras) to convert your multilayer raster into a data.frame. This will result in a two-dimensional data.frame with n columns (n = number of raster layers, in your case 31) and m rows (m = number of cells in your raster). Then you should be able to apply the ken.sto function from soil.spec, which requires a data.frame as input.
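For readers who only have the data as a 3-D array (as mentioned in the question), the same cells-by-layers table can be sketched in base R. The array size, the band_ names and the kennard_stone helper below are hypothetical. The helper is a minimal illustration of the Kennard-Stone selection rule on plain Euclidean distances, not the ken.sto implementation, which runs a PCA first.

```r
# Hypothetical stand-in for a raster: 4 rows x 5 columns x 3 layers
set.seed(1)
arr <- array(rnorm(4 * 5 * 3), dim = c(4, 5, 3))

n_cells  <- dim(arr)[1] * dim(arr)[2]
n_layers <- dim(arr)[3]

# Flatten the two spatial dimensions: one row per cell, one column per layer
df <- as.data.frame(matrix(arr, nrow = n_cells, ncol = n_layers))
names(df) <- paste0("band_", seq_len(n_layers))

# Minimal Kennard-Stone selection (illustrative only)
kennard_stone <- function(x, k) {
  d <- as.matrix(dist(x))
  # Start with the two most distant samples
  sel <- as.vector(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(sel) < k) {
    rest <- setdiff(seq_len(nrow(d)), sel)
    # Distance from each remaining sample to its nearest selected sample
    min_d <- apply(d[rest, sel, drop = FALSE], 1, min)
    # Add the sample that is farthest from everything selected so far
    sel <- c(sel, rest[which.max(min_d)])
  }
  sel
}

train_idx <- kennard_stone(df, k = 5)  # row indices of the training cells
```

The full distance matrix grows with the square of the number of cells, so on a real raster this approach is only practical on a subsample of cells.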

Note: You will however lose the spatial information if you convert your raster to a data.frame. After you have applied your sampling you might want to export the result back as a raster. Here you can use the indices of the data.frame rows to get the values back into the initial raster grid.
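A minimal sketch of the way back to the grid, in base R. The grid size and the selected row indices are made up, and the code assumes the rows of the data.frame were produced by flattening a matrix column by column, as base R does.

```r
# Hypothetical grid size and sampling result
n_row <- 4
n_col <- 5
train_idx <- c(2, 7, 13, 20)  # row indices returned by the sampling step

# One value per cell: 1 = selected for training, 0 = not selected
mask <- rep(0L, n_row * n_col)
mask[train_idx] <- 1L

# Reshape the flat vector back into the original grid (column-major order)
grid <- matrix(mask, nrow = n_row, ncol = n_col)
grid
```

The raster package numbers cells row by row starting from the top-left corner, so with a raster object the row index of the data.frame should be treated as a cell number rather than reshaped column by column.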

edited Jan 05 '17 at 17:56

answered Jan 05 '17 at 17:49


Hi @maRtin, thank you for your answer. Sorry for the late reply; I had to set that work aside and have only just picked it up again. Your suggestion worked well, and I managed to create the data frame I needed. However, I now have another question, which you anticipated: how do I get this information back into a raster? I have no clue! – Laura Paladini Jan 29 '17 at 12:11

score 0 · Answer 2 · answered Jan 10 '17 at 09:54

Thank you very much maRtin, you helped me find the right function (and sorry for the late reply). However, I've got another problem now; after converting the rasters into dataframes, I have tried running ken.sto again, and I get another error:

Error in prcomp.default(inp, scale = T) : 
  cannot rescale a constant/zero column to unit variance

Here is part of the summary of the dataframe I have used as input:

 evi_pks_10.1      evi_pks_10.2      evi_pks_10.3      evi_pks_10.4      evi_pks_10.5      evi_pks_10.6
 Min.   :-999.0    Min.   :-999.0    Min.   :-999.0    Min.   :-999      Min.   :-999.0    Min.   :-999     
 1st Qu.:-999.0    1st Qu.:-999.0    1st Qu.:-999.0    1st Qu.:-999      1st Qu.:-999.0    1st Qu.:-999     
 Median :   1.0    Median :  52.0    Median : 116.0    Median :5677      Median : 148.0    Median :2556     
 Mean   :-269.1    Mean   :-189.9    Mean   :-141.7    Mean   :4159      Mean   :-119.6    Mean   :2196     
 3rd Qu.:   1.0    3rd Qu.: 155.0    3rd Qu.: 212.0    3rd Qu.:6744      3rd Qu.: 245.8    3rd Qu.:4073     
 Max.   :   2.0    Max.   : 360.0    Max.   : 360.0    Max.   :9649      Max.   : 299.0    Max.   :7215     
 NA's   :1555628   NA's   :1555628   NA's   :1555628   NA's   :1555628   NA's   :1555628   NA's   :1555628  
  evi_pks_10.7      evi_pks_10.8      evi_pks_10.9     evi_pks_10.10     evi_pks_10.11     evi_pks_10.12

So, apparently the problem is that I've got NAs?

Welcome to Stack Overflow! If you have a new question, you should start a new topic or comment under my answer. I don't think the NAs are the problem, but rather columns that have zero variance (all values are the same). Try removing them and then it should work. Look here: http://stackoverflow.com/questions/15068981/removal-of-constant-columns-in-r and here: http://stackoverflow.com/questions/40315227/how-to-solve-prcomp-default-cannot-rescale-a-constant-zero-column-to-unit-var By the way, if you found my answer useful, you can mark it as accepted or upvote it. — maRtin, Jan 15 '17 at 19:41
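A minimal sketch of that cleanup in base R, using a small made-up data.frame. Dropping rows with missing values before looking for constant columns is an assumption that may not suit every dataset.

```r
# Hypothetical data: column b is constant, columns a and c contain NAs
df <- data.frame(a = c(1, 2, NA, 4, 5),
                 b = c(3, 3, 3, 3, 3),
                 c = c(-999, 10, 12, NA, 15))

# Keep only rows without missing values (assumption: this is acceptable)
df_complete <- df[complete.cases(df), , drop = FALSE]

# Flag columns whose values are all identical (zero variance)
is_constant <- vapply(df_complete,
                      function(x) length(unique(x)) == 1L,
                      logical(1))

# Drop the constant columns; prcomp can now rescale every remaining column
df_clean <- df_complete[, !is_constant, drop = FALSE]
pca <- prcomp(df_clean, scale. = TRUE)
```

The row names of df_clean still hold the original row numbers, which can later be used to map results back to raster cells.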
