I have been allocated multiple Google Cloud TPUs in the `us-central1-f` zone. The machine types are all `v2-8`.
How can I utilize all my TPUs to train a single model?
The `us-central1-f` zone doesn't support pods, so using pods doesn't seem like the solution. Even if pods were available, the number of `v2-8` units I have doesn't match any of the pod TPU slice sizes (16, 64, 128, 256), so I couldn't use them all in a single pod.
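To make the question concrete, here is a rough sketch of the kind of setup I have in mind (this is just my assumption, not something I know to work well): run one training process per `v2-8` VM and average gradients across hosts over the ordinary network, since separate `v2-8` devices have no TPU interconnect between them. The sketch assumes PyTorch/XLA on each TPU VM and a `gloo` process group for cross-host communication; the model, hyperparameters, and the `RANK`/`WORLD_SIZE`/`MASTER_ADDR` environment variables are placeholders I made up for illustration.

```python
# Sketch: run one copy of this on each v2-8 VM, with a unique RANK in
# [0, WORLD_SIZE). For simplicity each host uses a single TPU core here;
# using all 8 local cores would additionally need xmp.spawn per host.
import os

import torch
import torch.distributed as dist
import torch_xla.core.xla_model as xm

RANK = int(os.environ["RANK"])              # this host's index
WORLD_SIZE = int(os.environ["WORLD_SIZE"])  # number of v2-8 hosts
MASTER = os.environ["MASTER_ADDR"]          # internal IP of host 0

# CPU-side process group spanning the hosts; gloo runs over plain TCP,
# so it needs no TPU interconnect between the VMs.
dist.init_process_group(
    "gloo",
    init_method=f"tcp://{MASTER}:23456",
    rank=RANK,
    world_size=WORLD_SIZE,
)

device = xm.xla_device()                     # a local TPU core
model = torch.nn.Linear(128, 10).to(device)  # stand-in for the real model
opt = torch.optim.SGD(model.parameters(), lr=0.01)

def train_step(x, y):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    xm.mark_step()  # force execution so gradients are materialized
    # Average gradients across hosts over the network, then copy back.
    for p in model.parameters():
        g = p.grad.cpu()
        dist.all_reduce(g, op=dist.ReduceOp.SUM)
        p.grad.copy_((g / WORLD_SIZE).to(device))
    opt.step()
    xm.mark_step()
    return loss
```

Is something along these lines the right approach, or is there a more standard way to train a single model across multiple independent `v2-8` devices?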