
Can someone explain to me the unexpected behavior of augmented assignment in a for loop in Python in the following example:

  • the following code performs n_epochs iterations, trying to reach the optimal value of the parameters theta. The weird thing is, even though it does the job right and successfully converges, the list theta_list ends up filled with n_epochs + 1 copies of the value of theta from the last iteration of the loop (see the identity check after this list):
import numpy as np

eta = 0.1
n_epochs = 5
theta = np.random.rand(2, 1)

loss_list = []
theta_list = []
theta_list.append(theta)
for epoch in range(n_epochs):
    theta -= eta * loss_grad(theta, x, y)
    theta_list.append(theta)
    loss_list.append(loss(theta, np.c_[np.ones_like(x), x], y))
print(theta_list)
In [1]: %run proto.py
[array([[3.6315498 ],
       [3.40173314]]), array([[3.6315498 ],
       [3.40173314]]), array([[3.6315498 ],
       [3.40173314]]), array([[3.6315498 ],
       [3.40173314]]), array([[3.6315498 ],
       [3.40173314]]), array([[3.6315498 ],
       [3.40173314]])]
  • after changing a single line in the previous example, from
theta -= eta * loss_grad(theta, x, y)

to

theta = theta - eta * loss_grad(theta, x, y)

the output looks as expected:

In [5]: %run proto.py
[array([[0.00686239],
       [0.06257885]]), array([[1.50998752],
       [1.80281404]]), array([[2.35777388],
       [2.7475853 ]]), array([[2.84342938],
       [3.25403032]]), array([[3.12872463],
       [3.51915101]]), array([[3.30292104],
       [3.65160945]])]
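
A minimal self-contained reproduction (with a dummy array standing in for theta, and without my loss_grad) shows that in the first version the list entries are in fact all the same object:

import numpy as np

a = np.zeros((2, 1))
lst = []
for _ in range(3):
    a -= 1.0
    lst.append(a)
print(lst[0] is lst[1] is lst[2])  # prints True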

I haven't found anyone on Stack Overflow, or anywhere else online for that matter, who has encountered a similar issue. I am using Windows 10 and Python 3.8.5 installed via Anaconda3.

antelk
  • Does this answer your question? [Why does += behave unexpectedly on lists?](https://stackoverflow.com/questions/2347265/why-does-behave-unexpectedly-on-lists) – mkrieger1 Jan 14 '21 at 12:35

1 Answer


NumPy arrays support (broadcasted) augmented assignment in place, for efficiency reasons.
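
For example (a toy sketch with made-up values, unrelated to your variables), the right-hand side is broadcast against the array being updated:

import numpy as np

a = np.ones((2, 3))
a -= np.array([0.0, 1.0, 2.0])  # shape (3,) broadcasts across the (2, 3) array, in place
print(a)
# [[ 1.  0. -1.]
#  [ 1.  0. -1.]]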

Your first example modifies, in place, the array object that the name theta points to, so every theta_list.append(theta) appends a reference to that same array.

The second computes theta - eta * loss_grad(theta, x, y), which allocates a new array, and then binds the name theta to it.
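
Here is a minimal sketch of the difference (with a toy array and step size standing in for your theta and gradient):

import numpy as np

theta = np.ones((2, 1))
before = theta

theta -= 0.5            # augmented assignment: modifies the existing array in place
print(theta is before)  # True: still the same object

theta = theta - 0.5     # plain assignment: builds a new array, then rebinds the name
print(theta is before)  # False: theta now points to a fresh array

So with -=, theta_list ends up holding n_epochs + 1 references to a single array, and every entry prints the final value. If you want to keep -= for its in-place efficiency, append a copy instead: theta_list.append(theta.copy()).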

AKX