
Is it possible in MATLAB/Octave to use the sort function to sort an array by the relative frequency of its elements?

For example the array

m= [4,4,4,10,10,10,4,4,5]

should result in this array:

[5,10,10,10,4,4,4,4,4]

5 is the least frequent element, so it comes first, while 4 is the most frequent, so it comes last. Should one use the indices provided by histcounts?

Robert Seifert
linello

3 Answers


One way would be to use accumarray to count the occurrences of each number. (I suspect you could also use histcounts(m,max(m)), but then you would have to clear all the 0s.)

m = [4,4,4,10,10,10,4,4,5];

[~,~,subs]=unique(m);
freq = accumarray(subs,subs,[],@numel);
[~,i2] = sort(freq(subs));   % ascending: least frequent first, as in the question; add 'descend' for the reverse

m(i2)
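The "clear all the 0s" remark can be illustrated without histcounts. The following is only a sketch, and it assumes every element of m is a positive integer so that the values themselves can serve as subscripts; the names tally, vals and counts are made up for the illustration.

```matlab
m = [4,4,4,10,10,10,4,4,5];

% Assumption: m contains only positive integers, so each value can index a table.
tally  = accumarray(m(:), 1);   % tally(k) = occurrences of k; 0 for values that never occur
vals   = find(tally);           % values that actually occur: [4; 5; 10]
counts = tally(vals);           % their counts, zeros removed: [5; 1; 3]

freq = tally(m(:));             % frequency of each element of m, in input order
[~, idx] = sort(freq);          % ascending, stable
m_sorted = m(idx);              % [5 10 10 10 4 4 4 4 4]
```

For non-integer or very large values the tally table is not practical, which is why the unique-based version above is the more general one.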

By combining my approach with that of m.s., you can get a simpler solution:

m = [4,4,4,10,10,10,4,4,5];

[U,~,i1]=unique(m);
freq= histc(m,U);
[~,i2] = sort(freq(i1));   % ascending: least frequent first, as in the question; add 'descend' for the reverse

m(i2)
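One thing to watch for: sorting on frequency alone leaves distinct values with the same count in their input order, so they can end up interleaved. A possible way around this, sketched here as an assumption about what is wanted rather than as part of the original approach, is to sort on (frequency, value) pairs with sortrows. The counts are computed with accumarray instead of histc purely as a matter of taste.

```matlab
m = [7,3,7,3,9];                  % 7 and 3 both occur twice

[U, ~, i1] = unique(m);
counts = accumarray(i1(:), 1);    % counts(k) = occurrences of U(k)
freq   = counts(i1(:));           % frequency of each element of m, as a column

[~, byFreq] = sort(freq);         % frequency only, stable sort
m_ties = m(byFreq);               % [9 7 3 7 3]: tied values stay interleaved

keys = [freq, m(:)];              % primary key: frequency, secondary key: value
[~, order] = sortrows(keys);
m_grouped = m(order);             % [9 3 3 7 7]: tied values are now grouped
```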
Dan

The following code first calculates how often each element occurs and then uses runLengthDecode to expand the unique elements.

m = [4,4,4,10,10,10,4,4,5];

u_m = unique(m);

elem_count = histc(m,u_m);
[elem_count, idx] = sort(elem_count);

m_sorted = runLengthDecode(elem_count, u_m(idx));

The definition of runLengthDecode is copied from this answer:

For MATLAB R2015a+:

function V = runLengthDecode(runLengths, values)
if nargin<2
    values = 1:numel(runLengths);
end
V = repelem(values, runLengths);
end

For versions before R2015a:

function V = runLengthDecode(runLengths, values)
%// Actual computation using column vectors
V = cumsum(accumarray(cumsum([1; runLengths(:)]), 1));
V = V(1:end-1);
%// In case of second argument
if nargin>1
    V = reshape(values(V),[],1);
end
%// If original was a row vector, transpose
if size(runLengths,2)>1
    V = V.'; %'
end
end
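To see why the pre-R2015a version works, it may help to step through it on the sorted counts and values from the example. This is just the function body unrolled; the intermediate names starts, marks and idx are introduced here for illustration.

```matlab
runLengths = [1 3 5];                  % sorted counts from the example
values     = [5 10 4];                 % matching unique elements

starts = cumsum([1; runLengths(:)]);   % [1; 2; 5; 10]: start of each run, plus one past the end
marks  = accumarray(starts, 1);        % 1 at each run start, 0 elsewhere (10-by-1)
idx    = cumsum(marks);                % [1 2 2 2 3 3 3 3 3 4].'
idx    = idx(1:end-1);                 % drop the trailing sentinel entry
V      = values(idx);                  % [5 10 10 10 4 4 4 4 4]
```

Each 1 in marks bumps the running index by one, so the cumulative sum repeats index k exactly runLengths(k) times.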
m.s.
  • You should post the actual code for `runLengthDecode` in your answer in addition to the link you already have... also note that `repelem` is quite a recent function, so this won't work on older versions of MATLAB (however, it is not hard to make it work). Although by combining our two answers, you could avoid `runLengthDecode` altogether by just using sort... – Dan Jul 28 '15 at 09:11
  • @Dan I copied the code from the original answer; it does not necessarily depend on `repelem`. – m.s. Jul 28 '15 at 09:14

You could count the number of repetitions with bsxfun, sort that, and apply that sorting to m:

[~, ind] = sort(sum(bsxfun(@eq,m,m.')));
result = m(ind);
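Spelled out on the array from the question, the one-liner breaks down roughly as follows. The intermediate names same and counts are only for illustration.

```matlab
m = [4,4,4,10,10,10,4,4,5];

same   = bsxfun(@eq, m, m.');   % same(i,j) is true when m(i) == m(j); n-by-n logical
                                % (with implicit expansion, m == m.' gives the same matrix)
counts = sum(same, 1);          % counts(j) = occurrences of m(j): [5 5 5 3 3 3 5 5 1]
[~, ind] = sort(counts);        % ascending and stable
result = m(ind);                % [5 10 10 10 4 4 4 4 4]
```

Note that the comparison matrix has numel(m)^2 entries, so this is best kept to arrays of moderate length.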
Luis Mendo