
I've been thrown into the deep end a bit with a task at work. I need to use DistilBERT for a multi-class text classification problem, but here's the kicker: the dataset is gigantic - we're talking millions of samples!

I've been messing around with it, and DistilBERT does seem to do the job well. However, training takes forever. So, here are my dilemmas:

Model Training: How can I make DistilBERT handle this beast of a dataset more efficiently? Anyone got experience tweaking the training strategy, batch size, learning rate, etc.?

Hardware Constraints: Any hardware magic tricks to pull off? Is splurging on a fancy GPU the only way, or are there some tricks I don't know about?

Inference Speed: I also need to make sure the model can quickly classify new data after training. What are my options?
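For scale, here's my rough back-of-envelope on the batch-size question, using gradient accumulation (running several small forward/backward passes before each optimizer step, so a modest GPU behaves like it had a much larger batch). All the numbers below are illustrative, not from my actual setup:

```python
# Gradient accumulation sketch: effective batch size and steps per epoch.
# Every number here is illustrative.
per_device_batch = 32   # what fits in GPU memory at once
accum_steps = 8         # accumulate gradients over 8 mini-batches
n_gpus = 1

effective_batch = per_device_batch * accum_steps * n_gpus
print(effective_batch)  # 256

samples = 5_000_000     # "millions of samples"
steps_per_epoch = samples // effective_batch
print(steps_per_epoch)  # 19531
```

So even a mid-range GPU can train with a large effective batch; the cost is wall-clock time per step, which is why I'm also asking about mixed precision and other speedups.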

Any help would be a lifesaver!

Ilya
  • on the hardware constraints - if you have a Mac with a graphics card, you can try installing tensorflow-metal https://developer.apple.com/metal/tensorflow-plugin/ – Jithin P James Jul 25 '23 at 06:24

0 Answers