I need to replace 5% of the values in my matrix with the number 0.2 (`guess` in my code) if they are below 0.2, and leave them alone if they are above 0.2.

Right now my code is changing all values less than 0.2 to 0.2.

This will be in a larger loop eventually to occur over multiple replications, but right now I am trying to get it to work for just 1.

Explanation: `gen.probs.2PLM` is a matrix containing probabilities. `guess` is the value I have chosen as the replacement. `perc` is the percentage of the matrix I would like to look at, changing a value only if it is less than `guess`.

gen.probs.2PLM <- icc.2pl(gen.theta,a,b,N,TL)

perc<-0.05*N

guess<-0.2

gen.probs.2PLM[gen.probs.2PLM < guess] <- guess

I expect only 5 percent of the values to be looked at, and those changed to 0.2 only if they are below 0.2.

`gen.probs.2PLM` is a 1000 x 45 matrix.

# dput(gen.probs.2PLM[1:20, 1:5])
structure(c(0.940298707380962, 0.848432615784556, 0.927423909103331, 
0.850853479678874, 0.857217846940203, 0.437981231531586, 0.876146933879543, 
0.735970164547576, 0.76296469377238, 0.640645338681073, 0.980212105400924, 
0.45164925578322, 0.890102475061895, 0.593094353657132, 0.837401449711248, 
0.867436194744775, 0.753637051722629, 0.64254277457268, 0.947783594375454, 
0.956791049998361, 0.966059152820211, 0.896715435704569, 0.957247808046098, 
0.898712615329071, 0.903924224222216, 0.474561641407715, 0.919080521405463, 
0.795919510255144, 0.821437921281395, 0.700141602452725, 0.990657455188518, 
0.490423165094245, 0.92990761183835, 0.649494291971471, 0.887513826127176, 
0.912171225584296, 0.812707696992244, 0.702126169775785, 0.971012049724468, 
0.976789027046465, 0.905046450670641, 0.81322870291296, 0.890539069545935, 
0.81539882951241, 0.821148949083641, 0.494459368656066, 0.838675666691869, 
0.719720365120414, 0.741166345529595, 0.646700411799437, 0.9578080044146, 
0.504938867664858, 0.852068230044858, 0.611124165649146, 0.803451686558428, 
0.830526582119632, 0.73370297276145, 0.648126933954648, 0.913887754151632, 
0.925022099584059, 0.875712266966582, 0.762677615526032, 0.857390771477182, 
0.765270669721981, 0.772159371696644, 0.418524844618452, 0.793318641931831, 
0.65437308255825, 0.678633290218262, 0.574232080921638, 0.943851827968259, 
0.428780249640693, 0.809653131485398, 0.536512513508941, 0.751041035436293, 
0.783450103818893, 0.6701523432789, 0.575762279897951, 0.886965071394186, 
0.901230746880145, 0.868181123535613, 0.688344765218149, 0.840795870494126, 
0.69262216320168, 0.703982665712434, 0.215843106547112, 0.738775789107177, 
0.513997187757334, 0.551803060188986, 0.397460216626274, 0.956693337996693, 
0.225901690507801, 0.765409027208693, 0.347791079152411, 0.669156131912199, 
0.72257632593578, 0.538474414984722, 0.399549159711904, 0.884405290470079, 
0.904200878248468), .Dim = c(20L, 5L))
lefft
Laura
  • Please provide a sample of `gen.probs.2PLM`. – NelsonGon Apr 19 '19 at 19:13
  • Welcome to Stack Overflow! By following this guide on [how to ask](https://stackoverflow.com/help/how-to-ask) you provide a solid basis for answering this question. In particular, it is usually necessary to provide a [verifiable example](https://stackoverflow.com/help/mcve) (a minimal, complete, and verifiable example). Check out [this page](https://stackoverflow.com/a/5963610/4573108) for tips regarding R-specific MCVEs. Good luck and thank you for contributing! – mischva11 Apr 19 '19 at 19:14
  • I added image of gen.probs.2PLM to question... it is a large matrix. – Laura Apr 19 '19 at 19:28
  • `5%` of what? Of all matrix entries or of the matrix entries below `0.2`? – Rui Barradas Apr 19 '19 at 19:29

1 Answer

Here is a function that you can apply to a numeric matrix to replace 5% of the values below some threshold (e.g. .2 in your case) with the threshold:

replace_5pct <- function(d, threshold=.2){
  # get indices of cells below threshold, sample 5% of them 
  cells_below <- which(d < threshold)
  cells_to_modify <- sample(cells_below, size=.05*length(cells_below))
  # then replace values for sampled indices with threshold + return 
  d[cells_to_modify] <- threshold
  return(d)
}

Here's an example of how it can be used (where dat would correspond to your matrix):

dat <- matrix(round(runif(1000), 1), ncol=10)
dat_5pct_replaced <- replace_5pct(dat, threshold=.2)

You can look at the data to confirm the result, or look at stats like these:

mean(dat < .2)                # somewhere between .1 and .2 probably 
sum(dat != dat_5pct_replaced) # a count of cells: about 5% of sum(dat < .2)
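
A slightly stricter check is sketched below. It restates the function, lightly adapted to use `floor()` and `sample.int()`, so that the snippet runs on its own. The seed is an arbitrary choice and is only there for reproducibility.

```r
# Lightly adapted restatement of replace_5pct so this snippet is standalone.
replace_5pct <- function(d, threshold = .2) {
  cells_below <- which(d < threshold)            # linear indices below threshold
  n_modify <- floor(.05 * length(cells_below))   # 5% of those, rounded down
  cells_to_modify <- cells_below[sample.int(length(cells_below), size = n_modify)]
  d[cells_to_modify] <- threshold
  d
}

set.seed(42)  # arbitrary seed, an assumption for reproducibility
dat <- matrix(round(runif(1000), 1), ncol = 10)
dat_5pct_replaced <- replace_5pct(dat, threshold = .2)

# Linear indices of the cells that actually changed
changed <- which(dat != dat_5pct_replaced)

all(dat[changed] < .2)                          # TRUE: only low values were touched
all(dat_5pct_replaced[changed] == .2)           # TRUE: they now equal the threshold
length(changed) == floor(.05 * sum(dat < .2))   # TRUE: 5% of the low cells, rounded down
```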

p.s.: If you want to generalize the function, you could abstract over the 5% replacement too -- then you could replace e.g. 10% of values below some threshold, etc. And if you wanna get fancy you could abstract over "less than" too, and add a comparison function as a parameter to the main function.

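replace_func <- function(d, func, threshold, prop){
    cells <- which(func(d, threshold))
    # sample positions with sample.int(): sample(x) on a length-1 x draws from 1:x
    n_modify <- floor(prop*length(cells))
    cells_to_modify <- cells[sample.int(length(cells), size=n_modify)]
    d[cells_to_modify] <- threshold
    return(d)
}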

And then e.g. replace 10% of values above .5 with .5:

# (need to backtick infix functions like <, >, etc.) 
replace_func(dat, func=`>`, threshold=.5, prop=.1)
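
If the intent is instead to sample 5% of all cells and then change only the sampled cells that fall below the threshold, a minimal sketch could look like the following. The name `replace_sampled_below` and its `prop` argument are hypothetical, and the random matrix is only a stand-in for `gen.probs.2PLM`.

```r
# Hypothetical helper: sample a proportion of ALL cells, then raise only
# the sampled cells that are below the threshold.
replace_sampled_below <- function(d, threshold = 0.2, prop = 0.05) {
  n_sample <- floor(prop * length(d))                 # number of cells to inspect
  sampled <- sample.int(length(d), size = n_sample)   # linear indices of inspected cells
  to_modify <- sampled[d[sampled] < threshold]        # keep only the low ones
  d[to_modify] <- threshold
  d
}

set.seed(1)  # arbitrary seed, an assumption for reproducibility
dat <- matrix(runif(1000 * 45), nrow = 1000)  # stand-in for gen.probs.2PLM
out <- replace_sampled_below(dat, threshold = 0.2, prop = 0.05)

changed <- which(dat != out)
length(changed) <= floor(0.05 * length(dat))  # TRUE: at most 5% of cells can change
all(dat[changed] < 0.2)                       # TRUE: only values below 0.2 changed
all(out[changed] == 0.2)                      # TRUE: they were set to 0.2
```

Under this reading the number of changed cells is usually well below 5% of the matrix, because most sampled cells are already at or above the threshold and are left alone.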
lefft
  • Sorry, that might not be doing what I want it to. If I'm understanding it correctly, it's taking 5% of all the cells below 0.2 and changing them. I need it to take 5% of all the data and, from that sample, change only the ones that are below 0.2. – Laura Apr 19 '19 at 19:44
  • yes, that's how i understood the question to be stated. also look at the bottom of the updated answer, which generalizes the solution in a way that might be useful. – lefft Apr 19 '19 at 19:46