MATLAB find mean of column in matrix using two different indices

Question

I have a 22007x3 matrix with data in column 3 and two separate indices in columns 1 and 2.

eg.

I need to find the mean of the values in column 3 when the values in column 1 are the same AND the values in column 2 are the same, to end up with something like:

ans = 

    1   3   4.6667
    1   16  3.6667
    2   4   2
    2   11  2.3333

Please bear in mind that in my data, the number of times the values in column 1 and 2 occur can be different.

Two options I've tried already are the meshgrid/accumarray option, using two distinct unique functions and a 3D array:

[U, ix, iu] = unique(x(:, 1));
[U2,ix2,iu2] = unique(x(:,2));
[c, r, j] = meshgrid((1:size(x(:, 1), 2)), iu, iu2);
totals = accumarray([r(:), c(:), j(:)], x(:), [], @nanmean);

which gives me this:

??? Maximum variable size allowed by the program is exceeded.

Error in ==> meshgrid at 60
    xx = xx(ones(ny,1),:,ones(nz,1));

and the loop option,

for i=1:size(x,1)
    if x(i,2)== x(i+1,2);
        totals(i,:)=accumarray(x(:,1),x(:,3),[],@nanmean);
    end
end

which is obviously so very, very wrong, not least because of the x(i+1,2) bit.

I'm also considering creating separate matrices depending on how many times a value in column 1 occurs, but that would be long and inefficient, so I'm loath to go down that road.

score 5 · Accepted Answer · answered Apr 19 '13 at 15:26

Group on the first two columns with unique(...,'rows'), then accumulate only the third column. It is generally best to accumulate only the data that actually needs it and to leave the index columns (here the first two) out of the accumulation; you can reattach them afterwards from unX:

[unX,~,subs] = unique(x(:,1:2),'rows');
out          = [unX accumarray(subs,x(:,3),[],@nanmean)];

out =
            1            3       4.6667
            1           16       3.6667
            2            4            2
            2           11       2.3333
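
As a quick check, the two lines above can be run on a small made-up matrix. The data below is hypothetical, chosen only so that the group means match the output the question asks for. `mean` is used in place of `nanmean` so the sketch does not depend on the Statistics Toolbox; on R2015a or later, `@(v) mean(v,'omitnan')` would be a NaN-ignoring alternative.

```matlab
% Hypothetical sample data: columns 1-2 are the indices, column 3 the values.
x = [1  3 4;
     1  3 5;
     1  3 5;
     1 16 3;
     1 16 4;
     1 16 4;
     2  4 2;
     2 11 2;
     2 11 2;
     2 11 3];

% One group id (subs) per distinct (col1, col2) pair; unX lists the pairs.
[unX, ~, subs] = unique(x(:,1:2), 'rows');

% Average column 3 within each group and reattach the index pairs.
out = [unX accumarray(subs, x(:,3), [], @mean)];
disp(out)
```

The groups may have different sizes (here one of them has a single row), and the third column comes out as 14/3, 11/3, 2 and 7/3.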

Oleg

This should be compatible with Eitan's answer to your previous question as well (i.e. when you want to mix the two): http://stackoverflow.com/questions/16086874/matlab-find-and-apply-function-to-values-of-repeated-indices/16087295#16087295 – Dan Apr 19 '13 at 15:30
That's the one! Thanks Oleg! – 8eastFromThe3ast Apr 23 '13 at 12:45

Floris · Answer 2 · 2013-04-19T18:16:35.063

This is an ideal opportunity to use sparse matrix math.

x = [ 1 2 5;
      1 2 7;
      2 4 6;
      3 6 7;
      1 4 8;
      2 4 8;
      1 1 10]; % for example

SM = sparse(x(:,1), x(:,2), x(:,3));
disp(SM)

Result:

(1,1)   10
(1,2)   12
(1,4)    8
(2,4)   14
(3,6)    7

As you can see, we did the "accumulate same indices into same container" in one fell swoop. Now you need to know how many elements you have:

NE = sparse(x(:,1), x(:,2), ones(size(x(:,1))));
disp(NE);

Result:

(1,1)   1
(1,2)   2
(1,4)   1
(2,4)   2
(3,6)   1

Finally, you divide one by the other to get the mean (only use elements that have a value):

matrixMean = SM;
nz = find(NE>0);
matrixMean(nz) = SM(nz) ./ NE(nz);

If you then disp(matrixMean), you get

(1,1)    10
(1,2)     6
(1,4)     8
(2,4)     7
(3,6)     7

If you want to access the individual elements differently, then after you have computed SM and NE you can do

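[i, j, n] = find(NE);              % row, column and count of each occupied cell
idx = sub2ind(size(NE), i, j);     % SM(i,j) would return a whole submatrix, so index linearly
matrixMean = full(SM(idx)) ./ n;   % one mean per (i,j) pair
disp([i j matrixMean]);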
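
To check that the sparse route agrees with a unique/accumarray grouping, both can be run on the same data. This is only a sketch under stated assumptions: the sample `x` is illustrative (chosen to match the results displayed above), the names `viaSparse` and `viaGroups` are made up for this example, the indices must be positive integers for `sparse` to accept them, and NaN values are not skipped the way `nanmean` would skip them.

```matlab
% Illustrative data: columns 1-2 are indices, column 3 holds the values.
x = [1 2  5;
     1 2  7;
     2 4  6;
     3 6  7;
     1 4  8;
     2 4  8;
     1 1 10];

% Sparse route: sums and counts per (row, col) pair, then divide.
SM = sparse(x(:,1), x(:,2), x(:,3));             % duplicate subscripts are summed
NE = sparse(x(:,1), x(:,2), ones(size(x,1),1));  % number of entries per pair
[i, j, n] = find(NE);
idx = sub2ind(size(NE), i, j);                   % linear index of each occupied cell
viaSparse = sortrows([i, j, full(SM(idx)) ./ n]);

% Grouping route: one group per distinct (col1, col2) row.
[unX, ~, subs] = unique(x(:,1:2), 'rows');
viaGroups = [unX, accumarray(subs, x(:,3), [], @mean)];

% Both tables are now sorted by column 1, then column 2.
assert(isequal(size(viaSparse), size(viaGroups)));
assert(max(abs(viaSparse(:) - viaGroups(:))) < 1e-12);
disp(viaSparse)
```

The `sortrows` call is needed because `find` walks a sparse matrix column by column, whereas `unique(...,'rows')` orders the pairs by the first column and then the second.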
