
I would like to implement pruning algorithms in TensorFlow. To do this, I need not only to mask the weights during the pruning stage, but also to keep some entries of the weight tensor frozen at zero during the subsequent training.

For example, say I have a tensor with the [0, 0] and [1, 1] entries masked:

[[0 1.3 2], [1.34 0 2.3]]

After several batches I expect to still have zeros at positions [0, 0] and [1, 1].

There is a solution proposed here: How to stop gradient for some entry of a tensor in tensorflow. But it seems to work only for TensorFlow v1; in my case the masked entries of the variable were updated after calling the fit method.

It is possible to create a special Optimizer subclass with a redefined apply_gradients method, where the gradient is multiplied by the mask after each backward pass, but this solution seems rather inconvenient: one has to redefine every optimizer separately - Adam, RMSProp, whatever.
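
For concreteness, such a subclass might look roughly like the sketch below (the `MaskedAdam` class and the `masks` mapping are my own hypothetical names, not part of TensorFlow):

```python
import tensorflow as tf

# Hypothetical sketch: subclass a concrete optimizer and zero out the gradient
# entries of pruned weights before they are applied. `masks` maps variable
# references to 0/1 tensors with the same shape as the variable.
class MaskedAdam(tf.keras.optimizers.Adam):
    def __init__(self, masks, **kwargs):
        super().__init__(**kwargs)
        self._masks = masks  # e.g. {var.ref(): mask_tensor}

    def apply_gradients(self, grads_and_vars, **kwargs):
        masked = []
        for grad, var in grads_and_vars:
            mask = self._masks.get(var.ref())
            if grad is not None and mask is not None:
                grad = grad * mask  # pruned positions receive zero gradient
            masked.append((grad, var))
        return super().apply_gradients(masked, **kwargs)
```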

  • See if this helps: https://stackoverflow.com/questions/35298326/freeze-some-variables-scopes-in-tensorflow-stop-gradient-vs-passing-variables – NVS Abhilash Aug 22 '21 at 03:45
  • @NVSAbhilash that is a solution to a different problem - freezing the backward pass along some path in the graph. In my case, the gradient should propagate through the tensor, but only for the elements where `mask=1` – spiridon_the_sun_rotator Aug 22 '21 at 08:37

2 Answers

0

Could TensorFlow Model Optimization be what you are looking for?

It already has implementations of pruning which work similarly to the masking technique you described (pruning_impl.py).
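
As a rough sketch (the exact pruning schedule and parameters depend on your setup), applying it to a Keras model looks something like this:

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# Wrap the model so that low-magnitude weights are masked to zero and the
# mask is re-applied as training proceeds.
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(
    model,
    pruning_schedule=tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0),
)

pruned_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# The UpdatePruningStep callback keeps the masks applied during fit():
# pruned_model.fit(x_train, y_train,
#                  callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
```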

0

There is probably a more clever solution, but the practical way to do this (and how it is done in tfmot) is to multiply the weights by a mask of zeros and ones on every forward pass.

Say one has the tensor:

[[0 1.3 2], [1.34 0 2.3]]

And zeros at positions [0, 0] and [1, 1] need to be preserved.

Then, on every forward pass, one multiplies this tensor elementwise by the following mask:

[[0 1 1], [1 0 1]]

This will do the job: the gradient does not flow through the masked entries, so they are never updated and stay at zero during training.
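
A minimal sketch of this idea (the `MaskedDense` layer below is my own illustrative class, not a library one) could look like this:

```python
import tensorflow as tf

# The kernel is multiplied by a fixed 0/1 mask on every forward pass, so the
# masked entries contribute nothing to the output and receive zero gradient.
class MaskedDense(tf.keras.layers.Layer):
    def __init__(self, units, mask, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.mask = tf.constant(mask, dtype=tf.float32)  # same shape as kernel

    def build(self, input_shape):
        self.kernel = self.add_weight(
            "kernel", shape=(int(input_shape[-1]), self.units),
            initializer="glorot_uniform")
        self.bias = self.add_weight(
            "bias", shape=(self.units,), initializer="zeros")

    def call(self, inputs):
        # d(loss)/d(kernel) is multiplied by the same mask in the backward
        # pass, so the pruned entries never receive a gradient update.
        return tf.matmul(inputs, self.kernel * self.mask) + self.bias
```

Here `mask = [[0., 1., 1.], [1., 0., 1.]]` (for a layer with 2 inputs and 3 units) would keep the [0, 0] and [1, 1] kernel entries frozen at zero, exactly as in the example above.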