I'd like to finetune StarCoder (https://huggingface.co/bigcode/starcoder) on my dataset, on a GCP VM instance.
The documentation says that training the model took 24 days on 512 A100 GPUs.
I also looked at the model weight (.bin) files in the Files section on Hugging Face (https://huggingface.co/bigcode/starcoder/tree/main); their total size is ~64 GB.
Based on all this information,
- How do I decide which GPU is best suited for finetuning on my dataset?
- How can I estimate how long finetuning will take (based on assumptions such as epochs=1, for instance)?
- Are there any other factors to consider when choosing hardware or estimating training time?
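To make the question concrete, here is my rough back-of-envelope attempt at both estimates. Everything here is an assumption: the ~15.5B parameter count is from the StarCoder model card, the 16-bytes-per-parameter figure assumes full finetuning with Adam in mixed precision, the "6 × params × tokens" FLOPs rule is a common heuristic, and the dataset size, GPU count, and ~30% utilization are numbers I made up. Is this roughly the right way to reason about it?

```python
# Back-of-envelope sizing for FULL finetuning of StarCoder.
# Assumptions (none of these are measured):
#   - ~15.5B parameters (StarCoder model card)
#   - mixed-precision training with Adam:
#       2 B fp16 weights + 2 B fp16 grads
#       + 4 B fp32 master weights + 4 B Adam m + 4 B Adam v = 16 B/param
#   - "6 * params * tokens" FLOPs per token for forward + backward
#   - A100 peak of ~312 TFLOPs (bf16/fp16 dense), ~30% utilization

PARAMS = 15.5e9
GIB = 1024**3

# --- memory estimate (weights + grads + optimizer states only,
#     activations and framework overhead come on top of this) ---
bytes_per_param = 16
train_mem_gib = PARAMS * bytes_per_param / GIB

# --- time estimate for one epoch ---
dataset_tokens = 100e6           # assumed: 100M tokens in my dataset
flops_per_token = 6 * PARAMS     # forward + backward rule of thumb
a100_peak_flops = 312e12         # per-GPU peak, bf16/fp16
mfu = 0.3                        # assumed model FLOPs utilization
n_gpus = 8

total_flops = dataset_tokens * flops_per_token
seconds = total_flops / (n_gpus * a100_peak_flops * mfu)

print(f"training state memory: ~{train_mem_gib:.0f} GiB")
print(f"one epoch on {n_gpus} A100s: ~{seconds / 3600:.1f} hours")
```

By this arithmetic, full finetuning needs on the order of ~230 GiB just for the training state, which already rules out any single GPU and suggests either multi-GPU sharding (e.g. DeepSpeed ZeRO/FSDP) or a parameter-efficient method like LoRA instead. Does that conclusion sound right?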