
I am trying to take the mean of selected values in one matrix column, selecting them by their value in a different column.

For example:

X=[1950     1;
   1950     2;
   1950     3;
   1951     1;
   1951     5;
   1952     1]

I want to take the mean of the values for each year: essentially, select the rows with the same column 1 value, and then take the mean of the corresponding column 2 values. So the mean value for 1950 would be 2, the mean for 1951 would be 3, and the mean for 1952 would be 1. I can do this manually by creating a vector for each year and taking the mean of that vector, but this is impractical for larger amounts of data. The number of data points for each year varies, so I don't think I can use reshape to do this.

Dan
user3498384
  • See also this [similar question](http://stackoverflow.com/questions/19882413/how-to-deal-with-paired-values) and a related question on [weighted means](http://stackoverflow.com/questions/22792020/matlab-accumarray-weighted-mean/22794702). – chappjc Apr 04 '14 at 18:12

2 Answers


You want accumarray:

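% ii(k) is the group index of row k: rows with the same year share an index
[~, ~, ii] = unique(X(:,1));
% apply mean to the column-2 values within each group
result = accumarray(ii, X(:,2), [], @mean);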

I suggest you read the documentation of accumarray thoroughly to see how this works. It's a very powerful and flexible function.
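
As a rough illustration of what those two lines compute on the example matrix from the question, here is a sketch that also keeps the first output of unique so each mean can be paired with its year. The names `years` and `yearMeans` are my own and do not appear in the answer.

```matlab
X = [1950 1; 1950 2; 1950 3; 1951 1; 1951 5; 1952 1];

% First output: the distinct years, sorted.
% Third output: for each row of X, the position of its year in `years`.
[years, ~, ii] = unique(X(:,1));   % ii is [1;1;1;2;2;3] for this X

% Collect the column-2 values that share a group index and average each group.
yearMeans = accumarray(ii, X(:,2), [], @mean);

% Show each year next to its mean.
disp([years yearMeans])
```

For this X, `yearMeans` is [2; 3; 1], matching the years 1950, 1951 and 1952.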

Luis Mendo
  • Thank you, I'd never heard of that. I will look into that function for future use. – user3498384 Apr 04 '14 at 15:54
  • @user3498384 Agreed that this is the right way to go for your problem. +1 With `unique` you can even operate on non-numeric classes. BTW, you can also do a [weighted mean](http://stackoverflow.com/a/22794702/2778484) by creating a custom function. – chappjc Apr 04 '14 at 18:09

Give this a try:

X=[1950 1; 1950 2; 1950 3; 1951 1; 1951 5; 1952 1];

years = unique(X(:,1));

for ii=1:length(years)
    yr_index = find(X(:,1) == years(ii));  % compare column 1 only, so the indices are row numbers
    yr_avg(ii) = mean(X(yr_index,2));
end

This will find all the unique year entries. It will then step through each year, find the rows which correspond to the specific year, and take a mean of the second column of just those rows. It will save the mean in the yr_avg vector. Each year entry in the years vector should have a corresponding mean in the yr_avg vector.
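
A possible variation on the same loop, sketched here with the result vector preallocated and a logical mask used in place of find. Both changes are my own suggestions rather than part of the answer as posted, and the name `mask` is hypothetical.

```matlab
X = [1950 1; 1950 2; 1950 3; 1951 1; 1951 5; 1952 1];

years  = unique(X(:,1));
yr_avg = zeros(numel(years), 1);      % preallocate: one mean per unique year

for ii = 1:numel(years)
    mask = X(:,1) == years(ii);       % true for the rows whose year matches
    yr_avg(ii) = mean(X(mask, 2));    % average column 2 over those rows only
end
```

Comparing only `X(:,1)` against the year keeps a value in column 2 that happens to equal a year from being picked up by mistake.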

ewz
  • Thank you! This worked, and I appreciate your clear explanation. I had to add in a preallocated array of zeros, using noy=numel(unique(y)); yr_avg=zeros(noy,1); and then everything worked! – user3498384 Apr 04 '14 at 15:52