
I have an n×m matrix (500×31). I want to remove 20% of the elements from the first column, 15% from the second column, and 10% from the third column. It doesn't matter which elements are removed, but it would be nice to know how to remove them at random. Any suggestions?

Isaac
  • In the end it will not be a matrix right? – Onyambu Feb 21 '18 at 02:04
  • And a tip: it is good practice/etiquette to include a code snippet illustrating what you've tried (or even just a description of the strategy). Even better if you include some thoughts on why it didn't work the way you wanted! And double bonus for using `dput()` to include a sample of data (when relevant). Some examples [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) :) – lefft Feb 21 '18 at 02:21

1 Answer


Here's one way to do it, where you specify the proportion of each column that should be converted to NA. Note that I am interpreting "remove" in the question to mean "replace with NA": if you literally removed the elements, the columns would have different lengths and the result would no longer be a matrix.

# create some fake data
n_rows <- 500
n_cols <- 31
dat <- matrix(data = rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)

# write a function to replace `prop_na` proportion of a column's values with NA
values_to_na <- function(x, prop_na) {
  # round so that the number of elements to replace is a whole number
  n_na <- round(prop_na * length(x))
  # use `sample()` to select `n_na` random positions (without replacement)
  x[sample(seq_along(x), size = n_na)] <- NA
  return(x)
}

# then apply the function to each column, with the desired proportion of NAs
dat[, 1] <- values_to_na(dat[, 1], prop_na = 0.20)
dat[, 2] <- values_to_na(dat[, 2], prop_na = 0.15)
dat[, 3] <- values_to_na(dat[, 3], prop_na = 0.10)
# and so on...
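
If more than a few columns are involved, repeating the call by hand gets tedious. Here is a minimal, self-contained sketch of the same idea as a loop. The `props` vector and the fixed seed are assumptions for illustration, not part of the question.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible

# fake data with the same shape as in the question
dat <- matrix(rnorm(500 * 31), nrow = 500, ncol = 31)

# hypothetical vector: proportion of NAs wanted in columns 1, 2 and 3
props <- c(0.20, 0.15, 0.10)

for (j in seq_along(props)) {
  # number of rows to blank out in column j, rounded to a whole number
  n_na <- round(props[j] * nrow(dat))
  # pick `n_na` distinct random rows and set them to NA in column j
  dat[sample(nrow(dat), size = n_na), j] <- NA
}

# count the NAs in the first four columns (column 4 was left untouched)
colSums(is.na(dat))[1:4]
```

Columns not covered by `props` are left as they are, so extending the vector is enough to handle more columns.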

And if you want to compute on just the non-NA elements of a column (say, column 1), you can refer to them with `dat[!is.na(dat[, 1]), 1]`.
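
As a rough illustration of working with the remaining values, the sketch below filters out the NAs from a single vector and also shows what literal removal would look like. The names `x`, `kept`, `m`, `props` and `shortened` are made up for this example. Since the columns end up with different lengths, the literal-removal result is a list rather than a matrix.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible

# a single column with 20% of its values replaced by NA
x <- rnorm(500)
x[sample(length(x), size = 100)] <- NA

# keep only the non-NA values
kept <- x[!is.na(x)]
length(kept)

# many base functions can skip NAs directly via `na.rm = TRUE`
mean(x, na.rm = TRUE)

# literal removal: each column loses a different number of elements,
# so the result has to be stored as a list of vectors
m <- matrix(rnorm(500 * 3), nrow = 500)
props <- c(0.20, 0.15, 0.10)
shortened <- lapply(seq_along(props), function(j) {
  col <- m[, j]
  drop <- sample(length(col), size = round(props[j] * length(col)))
  # setdiff() keeps this correct even if `drop` happens to be empty
  col[setdiff(seq_along(col), drop)]
})
lengths(shortened)
```

Keeping the NAs in place is usually the easier route, because the data stays a matrix and row positions remain aligned across columns.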

lefft