function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);

  for iter = 1:num_iters
    ## warning: product: automatic broadcasting operation applied
    theta = theta - sum(X .* (X * theta - y))' .* (alpha / (m .* 2));
    J_history(iter) = computeCost(X, y, theta);
  end
end

This is my homework, but I'm not asking you to do it for me (I think I've either done it already or am close). I've read the manual where it mentions broadcasting, but I still don't understand: why am I getting a warning here?

2 Answers


The problem is that size(theta') is 1 2 and size(X) is m 2.

When you multiply them, Octave starts by multiplying X(1,1) by theta'(1,1) and X(1,2) by theta'(1,2). Then it moves to the second row of X and tries to multiply X(2,1) by theta'(2,1). But theta' doesn't have a second row so the operation makes no sense.

Instead of just crashing, Octave guesses that you meant to extend theta' so that it has as many rows as X does before beginning the multiplication. However, because it's guessing something, it feels that it should warn you about what it's doing.

You can avoid the warning by using the repmat function to explicitly extend theta' to m rows before you start the multiplication.

repmat(theta',m,1) .* X
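As a rough sketch of the point above, the manual repmat expansion and an explicit broadcast produce the same matrix. The values of X and theta are made up for illustration, and bsxfun is used here only as the explicit way to request the same expansion:

```octave
% Illustrative data only: m = 3 rows, 2 columns.
X = [1 2; 1 3; 1 4];
theta = [0.5; 2];
m = size(X, 1);

% theta' (1x2) copied by hand into m identical rows, then multiplied elementwise.
explicit = repmat(theta', m, 1) .* X;

% The same expansion, requested explicitly instead of being guessed by Octave.
via_bsxfun = bsxfun(@times, X, theta');

assert(isequal(explicit, via_bsxfun));
disp(explicit);
```

Either form states the intent in the code itself, so nothing is left for Octave to guess.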
David Tuite

Since the warning says that the broadcasting comes from a product operation, it must come from one of the .* operators in the offending line. I can't say which one without knowing the inputs you're passing to the function, but assuming that:

  1. X is a vector;
  2. alpha is a scalar;
  3. theta is a scalar.

my guess is that the warning comes from sum(X .* (X * theta - y))', especially since you're transposing the result. Try removing the transpose operator (which may cause an error if there's another bug in the code; I'm assuming that you do not want to perform broadcasting).

carandraug
  • Oh, I can tell, sorry, I should have been more specific: it's the first `.*`, where a 97x2 matrix is multiplied by a 97x1 matrix. I later figured out a way to avoid this situation by doing `X' * (X * theta - y)`, but I would still like to know why the warning appears and what the consequences are. I.e. I don't know whether I want the broadcasting, because I don't know what the implications of doing so are (to me it looks like something I would need or like to do, but maybe it's computationally inefficient or something?) –  Oct 26 '13 at 08:11
  • @wvxvw broadcasting is a very efficient way of doing something that most people solve with for loops. It is performed automatically when the sizes of your matrices allow for it, and is the same as calling the function `bsxfun()`. This is a new feature which surprises many people, hence the warning. Since you don't know what it is, most likely you don't want it. To understand what broadcasting is, run `[1:5].*[1:5]'` for a simple example and read the [NumPy documentation on the subject](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html). – carandraug Oct 27 '13 at 17:01
  • Well, it doesn't surprise me, not any more than any other feature of the language, that is :) And my code above (although I found a way to rewrite it to avoid the warning) did exactly what it had to do / what I wanted it to do. I just come from a perspective where a warning usually means that it is actually an error, but the compiler / debugger failed to properly identify it, so now it is warning about something that shouldn't happen in a correct version of the code. And that's strange, because the code didn't seem to blow apart or something like that :) –  Oct 27 '13 at 17:21
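
To make the comments above concrete, here is a minimal sketch with made-up data (the values of X, y and theta are assumptions for illustration, not the homework data set). It checks that the broadcast form from the question, written with an explicit `bsxfun()`, gives the same vector as the `X' * (X * theta - y)` rewrite, and shows what `[1:5].*[1:5]'` expands to:

```octave
% Illustrative data only: 4 samples, an intercept column plus one feature.
X = [1 1; 1 2; 1 3; 1 4];
y = [2; 4; 6; 8];
theta = [0; 1];

err = X * theta - y;                    % 4x1 column of residuals

% Broadcast form: err is stretched across both columns of X, then each
% column is summed and the 1x2 result is transposed to 2x1.
g_broadcast = sum(bsxfun(@times, X, err))';

% Matrix-product form: same numbers, no broadcasting involved.
g_matrix = X' * err;
assert(isequal(g_broadcast, g_matrix));

% A 1x5 row times a 5x1 column broadcasts to a 5x5 multiplication table.
outer = bsxfun(@times, 1:5, (1:5)');
assert(isequal(outer, (1:5)' * (1:5)));
disp(g_matrix);
disp(outer);
```

Both forms agree here, which fits the observation that the original code did what it was meant to do; the matrix-product form simply avoids relying on implicit expansion.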