I have a cell array of the following form:

 datABC =
           [45]  [67]  'A'
           [34]  [44]  'A'
           [11]  [84]  'A'
           [23]  [68]  'A'
           [34]  [44]  'B'
           [30]  [94]  'B'
           [304]  [414]  'C'
           [78]  [110]  'C'
           [34]  [120]  'C'

I need to compute the number of observations, and the mean of the first and second columns, for each of the groups A, B and C.

Thanks in advance.

user2983722

3 Answers


Seeing as the floodgates are opened, I might as well throw in my two cents.

A solution with a loop works well, but you can also eliminate loops, at the expense of readability. First, you can get the unique values in the last column with unique:

stringKeys = unique(datABC(:,3))'

Then you can use an anonymous function and cellfun to count the occurrences of each key:

memberFun = @(x) ismember(datABC(:,3),x);
keyOccurrences = cellfun(@(x) nnz(memberFun(x)),stringKeys)

To compute the mean of the corresponding data for each of the first two columns, you can again use cellfun with non-uniform outputs:

colMeanFun = @(x) mean(reshape([datABC{memberFun(x),1:2}],[],2),1);
colMeans = cellfun(colMeanFun,stringKeys,'UniformOutput',false);
colMeans = vertcat(colMeans{:})

Also have a look at strcmpi, which can be used in place of ismember but will ignore case.
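
As a rough illustration of the case-insensitive variant, here is a minimal sketch. The cell array mixedCase is hypothetical data made up for this example (it is not from the question); the only functions used are strcmpi, nnz, cell2mat and mean.

```matlab
% Hypothetical data with mixed-case labels (not from the question)
mixedCase = {45 67 'A'; 34 44 'a'; 11 84 'B'; 23 68 'b'};

% strcmpi compares strings while ignoring case, so 'A' and 'a' both match
isA = strcmpi(mixedCase(:,3), 'a');              % 4-by-1 logical mask

countA = nnz(isA);                               % number of matching rows
meanA  = mean(cell2mat(mixedCase(isA,1:2)), 1);  % column means of the matching rows
```

With this data, countA should be 2 and meanA should be [39.5 55.5].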

Test data:

datABC = {[45]  [67]  'A'; [34]  [44]  'A'; [11]  [84]  'A'; ...
          [23]  [68]  'A'; [34]  [44]  'B'; [30]  [94]  'B'; ...
          [304] [414] 'C'; [78]  [110] 'C'; [34]  [120] 'C'}; % 9-by-3
chappjc

Looks like a job for accumarray:

[categories ii jj] = unique(dataABC(:,3));
num = histc(jj,1:max(jj));
mean1 = accumarray(jj, cell2mat(dataABC(:,1)), [], @mean);
mean2 = accumarray(jj, cell2mat(dataABC(:,2)), [], @mean);

Example:

>> dataABC

dataABC = 

    [1]    [10]    'A' 
    [2]    [-5]    'B' 
    [3]    [15]    'A' 
    [4]    [40]    'CC'

gives

categories = 

    'A'
    'B'
    'CC'

>> num

num =

     2
     1
     1

>> mean1

mean1 =

     2
     2
     4

>> mean2

mean2 =

   12.5000
   -5.0000
   40.0000
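
A possible variant, sketched here as an assumption rather than as part of the method above: the counts can also come from accumarray itself by summing a 1 for every row, which avoids histc. The snippet reuses the example data shown above; the variable name means is introduced only for this sketch.

```matlab
% Same example data as in the output above
dataABC = {1 10 'A'; 2 -5 'B'; 3 15 'A'; 4 40 'CC'};

% jj maps each row to the index of its category in the sorted unique list
[categories, ~, jj] = unique(dataABC(:,3));
jj = jj(:);                                  % make sure the subscripts are a column

num   = accumarray(jj, 1);                   % add 1 per row: observations per category
means = [accumarray(jj, cell2mat(dataABC(:,1)), [], @mean), ...
         accumarray(jj, cell2mat(dataABC(:,2)), [], @mean)];  % one row per category
```

Here num should match the counts shown above, and each row of means holds the two column means for one category.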
Luis Mendo

Get comfortable with logical indexing.

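for x = unique(datABC(:,3))'                  % x is a 1-by-1 cell holding one key
    idx = strcmp(x, datABC(:, 3));            % logical mask of rows with this key
    disp([x{1} ': ' num2str(sum(idx)) ' observations'])
    % Average along dimension 1 so a key with a single row still gives two column means
    disp(mean(cell2mat(datABC(idx, 1:2)), 1))
end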
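
To make the logical-indexing step concrete, here is a minimal sketch of a single iteration of the loop, written out for the key 'B' with the data from the question. The variable names idxB, rowsB, nB and meanB are introduced only for this illustration.

```matlab
% Data from the question
datABC = {45 67 'A'; 34 44 'A'; 11 84 'A'; 23 68 'A'; ...
          34 44 'B'; 30 94 'B'; 304 414 'C'; 78 110 'C'; 34 120 'C'};

idxB  = strcmp('B', datABC(:,3));          % 9-by-1 logical, true only in the 'B' rows
rowsB = datABC(idxB, :);                   % logical index keeps just those rows
nB    = sum(idxB);                         % number of 'B' observations
meanB = mean(cell2mat(rowsB(:,1:2)), 1);   % column means over the 'B' rows
```

For this data the mask selects rows 5 and 6, so nB should be 2 and meanB should be [32 69].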
Prashant Kumar