TFF: test accuracy fluctuates

Question

I am training a ResNet50 model with TFF and evaluating it with test accuracy on test data, but the accuracy fluctuates a lot, as shown in the figure below. How can I avoid this fluctuation?

score 2 · Answer 1 · answered Feb 20 '21 at 05:37

I would say behavior such as this is to be expected for stochastic optimization in general. The inherent variance causes you to oscillate somewhere around a good solution. The magnitude of the variance and the properties of the optimization objective control how much this oscillates when looking at an accuracy metric.

For plain SGD, decreasing the learning rate decreases the variance, but it also slows down convergence.
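To make the trade-off concrete, here is a minimal sketch on a toy problem, not on ResNet50 or TFF. It runs SGD on the 1-D objective f(w) = 0.5 * w^2 with Gaussian noise added to each gradient. The function names (`noisy_sgd`, `tail_std`, `steps_to_reach`), the noise level, and the two learning rates are all assumptions chosen for illustration.

```python
import numpy as np


def noisy_sgd(lr, steps=5000, noise_std=1.0, seed=0):
    """Run SGD on f(w) = 0.5 * w**2 with noisy gradients; return the iterates."""
    rng = np.random.default_rng(seed)
    w = 5.0  # start away from the optimum at w = 0
    trace = np.empty(steps)
    for t in range(steps):
        grad = w + noise_std * rng.normal()  # true gradient plus noise
        w -= lr * grad
        trace[t] = w
    return trace


def tail_std(trace, tail=1000):
    """Spread of the iterates over the last `tail` steps (the oscillation)."""
    return float(np.std(trace[-tail:]))


def steps_to_reach(trace, threshold=0.5):
    """First step at which |w| drops below `threshold` (a rough convergence time)."""
    hits = np.flatnonzero(np.abs(trace) < threshold)
    return int(hits[0]) if hits.size else len(trace)


if __name__ == "__main__":
    for lr in (0.3, 0.01):
        trace = noisy_sgd(lr)
        print(f"lr={lr}: tail std={tail_std(trace):.3f}, "
              f"steps to |w|<0.5: {steps_to_reach(trace)}")
```

In this toy setting the smaller learning rate oscillates less around the optimum but takes more steps to get close to it, which is the same trade-off described above.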

For optimization methods for federated learning, the story is a bit more complicated, but decreasing the client learning rate, or decreasing the number of local steps (while keeping other things the same), can have a similar effect, typically including slower convergence. More details can be found in https://arxiv.org/abs/2007.00878, which is also mentioned in the other answer. Decreasing the client learning rate across rounds could also work. The details also depend on exactly which optimization method you are using.
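As an illustration of decreasing the client learning rate across rounds, here is a sketch of a simple staircase schedule. The function name `client_lr_for_round` and all of its default values are hypothetical. How the resulting value is passed to the client optimizer depends on the TFF version and the algorithm you use, so that wiring is deliberately left out.

```python
def client_lr_for_round(round_num, base_lr=0.1, decay_rate=0.5,
                        decay_every=50, min_lr=1e-4):
    """Staircase decay: multiply the client learning rate by `decay_rate`
    every `decay_every` rounds, never going below `min_lr`.

    All default values are placeholders, not tuned recommendations.
    """
    num_decays = round_num // decay_every
    return max(base_lr * decay_rate ** num_decays, min_lr)


if __name__ == "__main__":
    for round_num in (0, 49, 50, 100, 1000):
        print(round_num, client_lr_for_round(round_num))
```

Larger early learning rates keep initial progress fast, while the smaller later ones are meant to damp the oscillation once the model is near a good solution.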

score 1 · Answer 2 · answered Feb 20 '21 at 04:51

How is the test accuracy calculated? How many local epochs do the clients train for?

If the global model is tested on a held-out set of examples, it is possible that clients are detrimentally overfitting during local training. As the global model approaches convergence, each client ends up training a model that works well for it individually, but that may be diverging from the optimal global model (sometimes called client drift, https://arxiv.org/abs/1910.06378). This may occur when a client's local dataset has a distribution very different from the global distribution, and it is more likely when the client learning rates are high (https://arxiv.org/abs/2007.00878).

Decreasing the client learning rate, reducing the number of steps/batches, and other methods that cause the clients to do less "work" per communication round may reduce the fluctuation.
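The effect of doing less local work can be sketched with a toy simulation in plain NumPy. This is not TFF code. Each simulated client has the 1-D loss 0.5 * (w - c_i)^2 with its own optimum c_i, a subset of clients is sampled each round, and the server averages the local models. The function name `simulate_fedavg` and every parameter value are assumptions made for illustration.

```python
import numpy as np


def simulate_fedavg(local_steps, client_lr=0.1, rounds=300,
                    num_clients=20, clients_per_round=5, seed=0):
    """Toy federated averaging; returns the signed error of the global
    model relative to the global optimum after each round."""
    rng = np.random.default_rng(seed)
    # Client i minimizes 0.5 * (w - c_i)**2, so its local optimum is c_i.
    client_optima = rng.normal(size=num_clients)
    global_optimum = client_optima.mean()
    w_global = 0.0
    errors = np.empty(rounds)
    for r in range(rounds):
        sampled = rng.choice(num_clients, size=clients_per_round, replace=False)
        local_models = []
        for i in sampled:
            w = w_global
            for _ in range(local_steps):
                w -= client_lr * (w - client_optima[i])  # local gradient step
            local_models.append(w)
        w_global = float(np.mean(local_models))  # server-side averaging
        errors[r] = w_global - global_optimum
    return errors


if __name__ == "__main__":
    for k in (1, 20):
        errors = simulate_fedavg(local_steps=k)
        print(f"local_steps={k}: std over last 200 rounds = "
              f"{np.std(errors[-200:]):.3f}")
```

With many local steps, each sampled client moves most of the way to its own optimum, so the averaged model jumps around depending on which clients were sampled. With a single local step, the global model moves only slightly each round and fluctuates less.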

Thanks for your answer. Can I decide the number of rounds from the curve? – seni, Feb 25 '21 at 07:01
