I have about 100,000 numbers that I would like to group together by repeatedly dividing by two (or by increments of two).
PS: The increment values may change, and each value in the main array "x" can only be used once.
I'm not sure how to check whether a number from the "x" array has already appeared in the "array_all" array, and how to stop the loop for that number.

See example below

Example:

x=[9,8,7,6,5,4,3,2,1]

I'm trying to get the array_all array to look like this:

array_all= 
[  9.00000   4.50000   2.25000
   8.00000   4.00000   2.00000
   7.00000   3.50000   1.75000
   6.00000   3.00000   1.50000
   5.00000   2.50000   1.25000
   1.00000   0.50000   0.25000]

and the dynamically named arrays to look like this

array_dyn_name1=[9,4.5,2.25]
array_dyn_name2=[8,4,2]
array_dyn_name3=[7,3.5,1.75]
array_dyn_name4=[6,3,1.5]
array_dyn_name5=[5,2.5,1.25]
array_dyn_name6=[1,.5,.25]

PS: The reason I don't just stop at the array "array_dyn_name6" is that the real numbers will not be this simple: there will be thousands of them, and I won't know when they will repeat.

Sequence of events:
1. Start from the highest number in the x array, divide that number and then its output by two, and place the results into another array called array_all, so that each row holds 3 values (for example 9, 4.5, 2.25).
2. Place each row of array_all into a dynamically named array called array_dyn_name.
3. Do this for each value in the x array, unless the number has already been used in the array_all array.

Note: As you can see, the array_all array doesn't have rows starting with 4, 3, or 2, because those values were already used earlier in array_all.

The code below works but I'm not sure how to check and stop the loop if a number in the "array_all" array has been repeated from the "x" array.

%test grouping
clear all, clc, tic, clf; 

x=[9,8,7,6,5,4,3,2,1]
div=[1,2,4] %numbers to use as divisor
array_all=[];
for ii=1:length(x)

    for jj=1:length(div)
        array_all(ii,jj)=x(ii)/div(jj) %divide each number and successive number by div 
    end

    genvarname('array_dyn_name',  num2str(ii)) %create dynamic variable
    eval(['array_dyn_name' num2str(ii) '= array_all(ii,:)']) %places row into dynamic variable 

end
fprintf('\nfinally Done-elapsed time -%4.4fsec- or -%4.4fmins- or -%4.4fhours-\n',toc,toc/60,toc/3600);

My output is below:

array_all =

   9.00000   4.50000   2.25000
   8.00000   4.00000   2.00000
   7.00000   3.50000   1.75000
   6.00000   3.00000   1.50000
   5.00000   2.50000   1.25000
   4.00000   2.00000   1.00000
   3.00000   1.50000   0.75000
   2.00000   1.00000   0.50000
   1.00000   0.50000   0.25000

PS: I'm using Octave 3.8.1, which is similar to MATLAB.

Shai
Rick T
  • In your example, I can't see `1` appearing in `array_all`. So why don't you include `1` in your example, i.e., `array_dyn_name6 = [1, .5, .25]`? – Shai Feb 01 '15 at 14:53
  • @Shai thanks for the catch, you're correct, it should be included; I added it. – Rick T Feb 01 '15 at 15:49

1 Answer

You can try

array_all = bsxfun( @times, x(:), [1 .5 .25] ); %// generate for all values

Now we use ismember to locate each element of x in array_all, so that rows can be pruned:

[Lia Locb] = ismember( x(:), reshape( array_all.', [], 1 ) ); %// order elements of array_all row by row

By construction, all elements of x are in array_all, but we want to prune rows whose first element already appears in a previous row.

firstRowToAppearIn = ceil(Locb/3); 
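
The `ceil(Locb/3)` step relies on `array_all.'` being flattened row by row, so that every run of three consecutive positions in the flattened vector belongs to one row of `array_all`. A tiny sketch of that index arithmetic, where `ncols`, `k` and `rowOfK` are illustrative names and not part of the answer:

```octave
% Assumed toy sizes: 3 columns per row, matching the multipliers [1 .5 .25]
ncols  = 3;
k      = 1:9;               % linear positions in the row-by-row flattened vector
rowOfK = ceil(k / ncols);   % row of array_all that each position came from
disp(rowOfK)                % 1 1 1 2 2 2 3 3 3
```

If the number of multipliers changes, the divisor would presumably need to be `size(array_all,2)` rather than a hard-coded 3.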

So

toBePruned = (1:numel(x)).' > firstRowToAppearIn(:); %// column vs. column comparison: prune elements that appear in array_all in a row preceding their location in x

array_all(toBePruned,:) = []; %// remove those lines
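
One caveat with this vectorized pruning: a row is dropped whenever its first element appears in any earlier row, even if that earlier row is itself dropped. For x = [9,8,7,6,5,4,3,2,1], the row starting with 1 is removed because 1 appears in `[4 2 1]`, although `[4 2 1]` is removed as well. A loop that checks only against the rows kept so far is one way around this. The following is a minimal sketch under assumptions, not a tuned implementation: `nKept` is a hypothetical counter name, `div` is taken from the question, and values are compared with exact floating-point equality, which is safe for division by powers of two but may not be for other divisors.

```octave
x   = [9,8,7,6,5,4,3,2,1];
div = [1,2,4];                            % divisors, as in the question

x = sort(x(:), 'descend');                % start from the highest number
array_all = zeros(numel(x), numel(div));  % pre-allocate the largest possible size
nKept = 0;                                % hypothetical counter: rows kept so far

for ii = 1:numel(x)
    % skip x(ii) if it already appears anywhere in the rows kept so far
    if any(any(array_all(1:nKept,:) == x(ii)))
        continue;
    end
    nKept = nKept + 1;
    array_all(nKept,:) = x(ii) ./ div;    % x, x/2, x/4
end
array_all = array_all(1:nKept,:);         % trim the unused rows
disp(array_all)
```

For the example input this keeps the rows starting with 9, 8, 7, 6, 5 and 1. Each iteration scans all kept rows, so for about 100,000 values the run time could become noticeable.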

Now we can define the array_dyn_name variables from the rows of array_all.
Using eval is HORRIBLE; instead, we use a structure with dynamic field names:

st = struct();
for ii=1:size(array_all,1)
    nm = sprintf('array_dyn_name%d',ii);
    st.(nm) = array_all(ii,:);
end
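
To show how such a struct might be used afterwards, here is a small sketch that reads one group back by building its field name at run time and lists all stored names. The `array_all` literal is stand-in data copied from the question's expected output, and `k`, `row` and `names` are illustrative names only.

```octave
% Stand-in data: the six rows expected in the question's example
array_all = [9 4.5 2.25; 8 4 2; 7 3.5 1.75; 6 3 1.5; 5 2.5 1.25; 1 .5 .25];

st = struct();
for ii = 1:size(array_all,1)
    st.(sprintf('array_dyn_name%d', ii)) = array_all(ii,:);
end

% Read a group back by building its field name at run time
k   = 6;
row = st.(sprintf('array_dyn_name%d', k));
disp(row)

% List every group name stored in the struct
names = fieldnames(st);
disp(numel(names))
```

Since the names differ only by an index, plain row indexing such as `array_all(k,:)` may be all that is needed in practice.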

It seems like ismember's second output (Locb) might be ordered differently between Matlab and Octave (and maybe between newer and older versions of Matlab). So, here's an alternative, using bsxfun:

eq_ = bsxfun( @eq, array_all, permute( x(:), [3 2 1] ) );
eq_ = max( eq_, [], 2 ); %// we do not care at which column of array_all x appeared
[mx firstRowToAppearIn] = max( squeeze(eq_), [], 1 ); 

PS,
Apart from using eval, there is another caveat in your implementation: you do not pre-allocate space for array_all, which may cause your code to run significantly slower than it needs to. See, for example, this thread about pre-allocation.
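
As a minimal sketch of what pre-allocation could look like for the loop in the question (same `x` and `div` as there; the final size is known up front because no rows are skipped here):

```octave
x   = [9,8,7,6,5,4,3,2,1];
div = [1,2,4];

array_all = zeros(numel(x), numel(div));  % allocate once, before the loop
for ii = 1:numel(x)
    array_all(ii,:) = x(ii) ./ div;       % fill one whole row per iteration
end
```

Filling a whole row at once also makes the inner loop over `div` unnecessary.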

Shai
  • I'm getting an out-of-bounds error on the line `array_all(toBePruned,:) = []; %// remove those lines`, which seems not to take out the duplicate values in the array. – Rick T Feb 01 '15 at 19:53
  • @RickT it's a nasty thing about the last line: 1 appears earlier, in the line `[4 2 1]`, but that line is also removed. I'm afraid a loop is the only way here... – Shai Feb 01 '15 at 21:44
  • Thanks so much, this has really helped. So a for loop is what I need to adjust this. – Rick T Feb 01 '15 at 21:44
  • Because of an issue that came up with the answer, a separate question has been created to keep things understandable. Here's a link to the new question, which I was advised to include: http://stackoverflow.com/questions/28280544/how-not-to-repeat-values-in-array-using-matlab-octave?noredirect=1#comment44915757_28280544 – Rick T Feb 02 '15 at 15:23