
I have a dataset array M of size 500x5. Is there any way I can use a nested for loop to run through a particular column of the array? If so, how would I go about doing that? I want the if statement in the loop to be something like:

if age <= 80
  age = 1
else 
  age = 2
end  

What would I put as the for loop? Would it be better to initialise variables as young = 1; old = 2; and then have age = young in the if statement rather than age = 1? I am trying to discretise the data into either 1 or 2, with 1 being young and 2 being old.

Bob Gilmore
adam mcbrinn
  • Possible duplicate of [Is there a foreach in MATLAB?](http://stackoverflow.com/questions/408080/is-there-a-foreach-in-matlab-if-so-how-does-it-behave-if-the-underlying-data-c). – Scott Solmer Nov 11 '14 at 23:15

2 Answers

2

Try this:

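m = rand(500,5)*100; % your dataset
m(m(:,ii) <= 80, ii) = 1;
m(m(:,ii) > 80, ii) = 2;

where ii is the column you want to change. The second subscript matters: without it, the logical mask is applied as a linear index and changes column 1 instead of column ii. E.g. for ii = 3:

m(m(:,3) <= 80, 3) = 1;
m(m(:,3) > 80, 3) = 2;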
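If the original values need to stay intact, a minimal sketch of a non-destructive variant follows. It assumes a hypothetical output vector `age` (not part of the answer above) and relies on MATLAB treating a logical true as 1 in arithmetic, so adding the comparison result to 1 yields 1 or 2.

```matlab
% Example data (stand-in for the real dataset)
m = rand(500,5)*100;
ii = 3;                      % column to discretise

% (m(:,ii) > 80) is a logical column: 0 where <= 80, 1 where > 80.
% Adding 1 maps it to 1 (young) or 2 (old) without modifying m.
age = 1 + (m(:,ii) > 80);
```

The result can later be written back with `m(:,ii) = age;` if overwriting turns out to be what you want.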
Berriel
  • better use something else instead of `i`. :) http://stackoverflow.com/questions/14790740/using-i-and-j-as-variables-in-matlab – NKN Nov 11 '14 at 23:24
  • that's why I said that the `i` should be changed =] edited so people don't use it by mistake – Berriel Nov 11 '14 at 23:30
0

This can be done many ways. You can use nested for loops, like you said, or conditional assignments as Rodrigo suggests.

First, here's your data:

%// your matrix
M = rand(500,5)*100;

You'll want to know the number of rows and columns for the loop...

%// get the size for the loops
[num_rows,num_columns] = size(M);

To simply loop through one column, the following should work:

%// loop through one column (column #2), save one result
col = 2;
for row = 1:num_rows
    if M(row,col) <= 80
        age = 1;
    else
        age = 2;
    end
end

But you said you wanted to discretize the data, so you'll likely want to save all the results. The above example will leave you with a variable called 'age' which only stores the last value from the loop.

The following should allow you to save all the results from the individual column:

%// loop through one column (column #2), save all results
col = 2;
%// initialize age array
age = zeros(num_rows,1);
%// do the loop
for row = 1:num_rows
    if M(row,col) <= 80
        age(row) = 1;
    else
        age(row) = 2;
    end
end

To loop through all the columns requires another for loop:

%// loop through all columns, save all results
%// initialize age array
age = zeros(num_rows,num_columns);
%// loop through each column
for col = 1:num_columns
    %// loop through each row
    for row = 1:num_rows
        if M(row,col) <= 80
            age(row,col) = 1;
        else
            age(row,col) = 2;
        end
    end
end

Finally, now that you've seen that, the way many prefer is to take advantage of MATLAB's logical indexing, where a comparison such as M <= 80 produces a mask that selects the elements to assign. The following will produce the same result as the last code snippet:

%// now without loops
age = zeros(size(M));
age(M <= 80) = 1;
age(M > 80) = 2;
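One way to check that the two approaches agree is to run both on a small matrix and compare. A minimal sketch, using a hypothetical 3x2 matrix `T` and the result names `age_loop` and `age_vec`, none of which come from the answer itself:

```matlab
% Small test matrix with values on both sides of the threshold
T = [10 85; 80 81; 99 5];

% Loop version
age_loop = zeros(size(T));
for col = 1:size(T,2)
    for row = 1:size(T,1)
        if T(row,col) <= 80
            age_loop(row,col) = 1;
        else
            age_loop(row,col) = 2;
        end
    end
end

% Logical-indexing version
age_vec = zeros(size(T));
age_vec(T <= 80) = 1;
age_vec(T > 80) = 2;

% true when both results match element for element
same = isequal(age_loop, age_vec);
```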

(note that I've used % and // in my comments... you can ignore the // since I only added it so Stack Exchange would recognize my comments)

To answer your follow-up, it's not necessary to add the bit about young = 1 and old = 2. It is preferred, however, since it replaces the magic numbers in the code with names that say what they mean.
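As a rough sketch of what the named-constant version could look like, assuming the illustrative names `YOUNG`, `OLD` and `AGE_THRESHOLD` (they are not from the question) and column 2 as the column of interest:

```matlab
% Named constants instead of magic numbers (names are illustrative)
YOUNG = 1;
OLD = 2;
AGE_THRESHOLD = 80;

M = rand(500,5)*100;        % example data
col = 2;                    % column to discretise

% Start with every row labelled OLD, then relabel rows at or below the threshold
age = repmat(OLD, size(M,1), 1);
age(M(:,col) <= AGE_THRESHOLD) = YOUNG;
```

Changing the cut-off or the labels later then means editing one line rather than hunting for every 1, 2 and 80 in the code.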

EDIT to answer follow-up:

To save the results in the original array, you can do a few things:

  1. You can assign the new array to a column of the old array at the end
  2. You can assign the new values as you loop
  3. You can use the conditional replacement Rodrigo mentioned.

The first one is easy... use one of the last two procedures above, then do this:

M(:,col_to_replace) = age(:,col_to_replace_with);

Or you could add a new column altogether:

M(:,6) = age(:,col_of_interest);

Alternatively, you can just change the loop so the original values are replaced with their discretized value:

%// loop through all columns, save all results in original locations
%// loop through each column
for col = 1:num_columns
    %// loop through each row
    for row = 1:num_rows
        if M(row,col) <= 80
            M(row,col) = 1;
        else
            M(row,col) = 2;
        end
    end
end

Finally, you can just use the conditional replacement method. The sample below will replace all the rows and columns of M with the discretized value:

M(M <= 80) = 1;
M(M > 80) = 2;
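The two-line replacement works here because the value written by the first line (1) is itself not greater than 80, so the second line leaves it alone. With other labels or thresholds that may not hold. A minimal sketch that avoids the dependence by computing both masks before overwriting anything (the mask names are illustrative):

```matlab
M = [10 85; 80 81; 99 5];   % small example matrix

% Compute both masks from the original values first
is_young = M <= 80;
is_old = M > 80;

% Now overwrite; the order of these two lines no longer matters
M(is_old) = 2;
M(is_young) = 1;
```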

To answer your specific example, this will loop through column 1 and save the result in column 3:

%// loop through one column (column #1), save all results in another column (#3)
col = 1;
save_col = 3;
%// do the loop
for row = 1:num_rows
    if M(row,col) <= 80
        M(row,save_col) = 1;
    else
        M(row,save_col) = 2;
    end
end
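The same loop also works when the source and destination are the same column (set save_col equal to col). A loop-free sketch of that in-place case, assuming column 3 and the hypothetical names `original` and `is_young`:

```matlab
M = rand(500,5)*100;        % example data
col = 3;                    % column to read from and write back to

original = M(:,col);        % copy kept only so the result can be checked

% Take the mask before overwriting the column
is_young = M(:,col) <= 80;

M(:,col) = 2;               % default label, matches the else branch
M(is_young,col) = 1;        % relabel the rows at or below 80
```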
  • Thank you for your extremely helpful and detailed answer, just one more question; is it possible to save the results in the column of M itself rather than a new array? For example run through column 3 and save the results of 1 or 2 in column 3 of M itself. – adam mcbrinn Nov 12 '14 at 09:59