You can use the `Trainer` class from transformers to train the model. The `Trainer` also needs you to specify the `TrainingArguments`, which allow you to save checkpoints of the model during training.
Some of the parameters you set when creating `TrainingArguments` are:

- `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:
  - `"no"`: no saves are done during training.
  - `"epoch"`: a save is done at the end of each epoch.
  - `"steps"`: a save is done every `save_steps`.
- `save_steps`: Number of update steps between two checkpoint saves when `save_strategy="steps"`.
- `save_total_limit`: If a value is passed, limits the total number of checkpoints; older checkpoints in `output_dir` are deleted.
- `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training.
One important thing about `load_best_model_at_end` is that when it is set to `True`, `save_strategy` needs to be the same as `eval_strategy`, and if it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.