6

I have two arrays:

OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65,...]
AprefCOR = [1,3,1,1,1,1,1,1,1,1,2,3,3,2,...]

For each element in OTPCORorder there is a corresponding element in AprefCOR. For each unique value in OTPCORorder, I want the percentage (expressed as a fraction) of its corresponding AprefCOR entries that equal 1, as follows:

OTPCORorder1 = [61,62,65,...]
AprefCOR1 = [1,0.72,0,...]

I already have this:

[OTPCORorder1,~,idx] = unique(OTPCORorder,'stable');
% result: OTPCORorder1 = [61,62,65,...]

I have used "accumarray" before, but only with functions such as "mean" or "sum", like this:

AprefCOR1 = accumarray(idx,AprefCOR,[],@mean).';

Is there a way to do the same thing with the "prctile" function, or any other function, so that I get the percentage of a specific element ("1" in this case)?

Thank you very much.

chinkare_16

3 Answers

5

This could be one approach:

%// logical mask: true where AprefCOR equals 1, false elsewhere
AprefCORmask = AprefCOR == 1;

%// you have done this
[OTPCORorder1,~,idx] = unique(OTPCORorder,'stable');

%// Count how many elements belong to each unique value
counts = accumarray(idx,1);

%// Find number of ones for each unique value
sumVal = accumarray(idx,AprefCORmask);

%// find percentage of ones to get the results
perc = sumVal./counts

Results:

Inputs:

OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];

Output:

perc =

1.0000
0.7273
     0
rayryeng
Santhan Salai
  • @chinkare_16 you are welcome. BTW, when you have time, please accept whichever of our answers you feel best solved your problem – Santhan Salai Jun 04 '15 at 06:07
  • @SanthanSalai - Because the OP wanted an approach via `accumarray`, the answer to accept should be yours as you did what the OP requested. I provided an alternative because `accumarray` in this case (at least in my opinion) would be slower. – rayryeng Jun 04 '15 at 06:09
  • 1
    @SanthanSalai and now you are a 3k user... Congrats :-) – kkuilla Jun 04 '15 at 14:24
  • @kkuilla LOL Thanks :) – Santhan Salai Jun 04 '15 at 14:30
  • 1
    I accidentally down voted. In order to change my vote, I had to edit your post. It won't let you change your vote once a certain grace period elapses. Forgot to upvote it too! – rayryeng Jun 05 '15 at 14:08
4

Here's another approach without using accumarray. I think it's more readable:

>> list = unique(OTPCORorder);
>> counts_master = histc(OTPCORorder, list);
>> counts = histc(OTPCORorder(AprefCOR == 1), list);
>> perc = counts ./ counts_master

perc =

    1.0000    0.7273         0

The code first finds the unique values in OTPCORorder. Because unique is called without 'stable', the list comes back sorted, which histc needs for its bins. It then counts how many elements belong to each unique value via histc, using that exact list as the bins. In newer MATLAB releases histc is discouraged in favor of histcounts, but the two are not drop-in replacements: histcounts treats its second argument as bin edges and returns one fewer count than there are edges, so you would need to append an extra final edge (for example [list, Inf]). Next, the same count is repeated using only the elements of OTPCORorder where AprefCOR == 1. Finally, dividing the second set of counts element-wise by the first gives the percentage for each unique value.

It'll give you the same results as accumarray but with less overhead.
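
For releases where histcounts is preferred, a variant of the same idea might look like the sketch below. It reuses the sample inputs from the first answer. The variable name edges and the trailing Inf edge are my own assumptions, added because histcounts interprets its second argument as bin edges rather than as the values to count at.

```matlab
% Sample inputs copied from the first answer
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR    = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];

% Sorted unique values (histcounts needs increasing edges)
list = unique(OTPCORorder);

% Assumption: append Inf so that every unique value gets its own bin.
% N edges produce N-1 bins, each of the form [list(k), list(k+1)).
edges = [list(:).', Inf];

% Total number of elements per unique value
counts_master = histcounts(OTPCORorder, edges);

% Number of elements per unique value whose AprefCOR entry equals 1
counts = histcounts(OTPCORorder(AprefCOR == 1), edges);

% Fraction of ones per unique value
perc = counts ./ counts_master;
```

Since list contains every distinct value, each bin can only ever hold copies of its left edge, so the counts should match what histc returns for the same data.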

rayryeng
  • 1
    Without accumarray is good :) Our logic ended similar though :) +1 – Santhan Salai Jun 04 '15 at 06:02
  • Thanks :) `accumarray` is great but if you seriously just want to use it to tally up values in 1D, `histc` is actually faster. – rayryeng Jun 04 '15 at 06:08
  • Yeah i agree. :) Still learning from you guys to make things efficient :D – Santhan Salai Jun 04 '15 at 06:10
  • 2
    @SanthanSalai - heh :) TBH, last year I was in exactly the same shoes you were. I was starting to write answers here... and I started learning things from Luis Mendo, Divakar, Amro, etc.... I would usually write answers, then those guys would suggest something that made my code look terrible lol. I admire you for beginning to write answers and I'm starting to see the maturity in the way you're writing them... from answers that didn't use much of MATLAB's functions to using more of them and having answers with less code. You've made great progress. Keep it up! – rayryeng Jun 04 '15 at 06:14
  • 1
    That's a great compliment coming from you :) Full credit to you guys. Still a kid in MATLAB though (<4 months experience). Although I'm using my vacation as much as possible to learn MATLAB here from you guys, sadly the vacation is about to end. I will not be this active anymore. Thanks again for all the help :)
  • 1
    A while back I did some benchmarks for my own personal interest on this `@mean` finding against other alternatives, but never presented them here. I might do it soon though. – Divakar Jun 04 '15 at 06:49
  • 1
    @SanthanSalai I agree with rayryeng. You're making great progress! Keep it up! – Luis Mendo Jun 04 '15 at 10:42
  • @LuisMendo Thanks. That means a lot to me :) – Santhan Salai Jun 04 '15 at 10:46
  • 2
    @SanthanSalai So it's the holiday? I thought everyone was making vast progress because they were up at 4am, like rayryeng :-) – kkuilla Jun 04 '15 at 14:50
  • @kkuilla yes you are right. It was 3 months summer vacation for me and its about to get over :( you know his timezone? It may not be 4am for him :P – Santhan Salai Jun 04 '15 at 14:55
  • 2
    It was 4 am :P. I live in Toronto Canada. – rayryeng Jun 04 '15 at 15:23
3

Your approach works; you only need to define an appropriate anonymous function for accumarray to use. Let value = 1 be the value whose percentage you want to compute. Then

[~, ~, u] = unique(OTPCORorder); %// labels for unique values in OTPCORorder
result = accumarray(u(:), AprefCOR(:), [], @(x) mean(x==value)).';
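
As a quick check, here is a sketch that runs this one-liner on the sample inputs from the first answer. The names groups, frac and pct are hypothetical, chosen only for illustration, and 'stable' is added on the assumption that you want the groups in order of first appearance, as in the question.

```matlab
% Sample inputs copied from the first answer
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR    = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];
value = 1;   % the value whose share we want

% 'stable' keeps the groups in order of first appearance (assumption:
% this matches the ordering wanted in the question)
[groups, ~, u] = unique(OTPCORorder, 'stable');

% x == value is a logical vector; its mean is the fraction of matches
frac = accumarray(u(:), AprefCOR(:), [], @(x) mean(x == value)).';

% Multiply by 100 if a true percentage is wanted instead of a fraction
pct = 100 * frac;
```

For these inputs frac works out to 1, 8/11 and 0 for the groups 61, 62 and 65, which agrees with the 1.0000, 0.7273 and 0 shown in the other answers.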

As an alternative, you can use sparse as follows. Generate a two-row matrix such that each column corresponds to one of the possible values in OTPCORorder. The first row tallies how many times each value in OTPCORorder did not have the desired value in AprefCOR; the second row tallies how many times it did.

[~, ~, u] = unique(OTPCORorder);
s = full(sparse((AprefCOR==value)+1, u, 1, 2, max(u))); %// force 2 rows even if nothing equals value
result = s(2,:)./sum(s,1);
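
To make the row layout concrete, the following sketch builds the tally matrix for the sample inputs from the first answer. The variable rows and the explicit size arguments 2 and max(u) are my own additions for illustration. They guarantee a two-row matrix even when no entry equals value.

```matlab
% Sample inputs copied from the first answer
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR    = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];
value = 1;

[~, ~, u] = unique(OTPCORorder);      % column labels 1..number of groups

% Row index: 1 where AprefCOR differs from value, 2 where it matches
rows = (AprefCOR(:) == value) + 1;

% sparse sums the 1s that land on the same (row, column) pair
s = full(sparse(rows, u(:), 1, 2, max(u)));

% Matches divided by total count in each column
result = s(2,:) ./ sum(s, 1);
```

Here s comes out as [0 3 2; 1 8 0]: row 1 holds the non-matching counts and row 2 the matching counts for 61, 62 and 65.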
Luis Mendo