Questions tagged [huggingface-trainer]
33 questions
4
votes
1 answer
What is the official way to run a wandb sweep with Hugging Face (HF) transformers so that all the HF features work, e.g. distributed training?
Initially I wanted to set up a Hugging Face run such that, if the user wanted to run a sweep, they could (merging the sweep parameters with the given command-line arguments), or just execute the run with the command-line arguments alone. The merging is so that the train…

Charlie Parker
- 5,884
- 57
- 198
- 323
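A minimal single-process sketch of the usual sweep pattern, assuming wandb's sweep/agent API: the train function reads hyperparameters from run.config and forwards them into TrainingArguments. The project name, parameter values, and train_ds are placeholders; how this composes with HF's distributed launchers is exactly the open part of the question.

import wandb
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

sweep_config = {
    "method": "random",
    "metric": {"name": "eval/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"values": [1e-5, 3e-5, 5e-5]},
        "num_train_epochs": {"values": [2, 3]},
    },
}

def train_run():
    with wandb.init() as run:
        args = TrainingArguments(
            output_dir="out",
            report_to="wandb",                        # let the Trainer log into the active sweep run
            learning_rate=run.config.learning_rate,   # merge sweep params over CLI defaults here
            num_train_epochs=run.config.num_train_epochs,
        )
        model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
        trainer = Trainer(model=model, args=args, train_dataset=train_ds)  # train_ds: your tokenized dataset
        trainer.train()

sweep_id = wandb.sweep(sweep_config, project="hf-sweep-demo")
wandb.agent(sweep_id, function=train_run, count=6)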
3
votes
3 answers
How to fix "Trainer: evaluation requires an eval_dataset" in Hugging Face Transformers?
I’m trying to fine-tune without an evaluation dataset.
For that, I’m using the following code:
training_args = TrainingArguments(
    output_dir=resume_from_checkpoint,
    evaluation_strategy="epoch",
    per_device_train_batch_size=1,
)
def…

An old man in the sea.
- 1,169
- 1
- 13
- 30
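The error usually means evaluation_strategy="epoch" scheduled an evaluation while no eval_dataset was passed to the Trainer. A minimal sketch of the two usual fixes (model and train_ds are placeholders):

from transformers import Trainer, TrainingArguments

# If there is no evaluation set, do not ask the Trainer to evaluate:
training_args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="no",        # "epoch" schedules evaluation, which requires eval_dataset
    per_device_train_batch_size=1,
)
trainer = Trainer(model=model, args=training_args, train_dataset=train_ds)
# Alternatively, keep evaluation_strategy="epoch" and pass eval_dataset=... to Trainer.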
2
votes
0 answers
Can I add a 'dropout_rate' configuration to Seq2SeqTrainer?
I'm trying to train a T5 model using Seq2SeqTrainer.
I found that the config of the T5 model looks like this:
T5Config {
  "_name_or_path": "allenai/tk-instruct-base-def-pos",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
…

hyewwns
- 21
- 4
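T5's dropout lives in the model config (dropout_rate), not in Seq2SeqTrainer, so a common approach is to override it when loading the model; a minimal sketch:

from transformers import AutoConfig, AutoModelForSeq2SeqLM

# Override the config field at load time, then train as usual.
config = AutoConfig.from_pretrained("allenai/tk-instruct-base-def-pos", dropout_rate=0.2)
model = AutoModelForSeq2SeqLM.from_pretrained("allenai/tk-instruct-base-def-pos", config=config)
# ...then pass `model` to Seq2SeqTrainer as usual.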
1
vote
1 answer
CUDA out of memory using the Trainer in Hugging Face during validation (training is fine)
When fine-tuning with the HF Trainer, training is fine but it fails during validation. Even reducing eval_accumulation_steps to 1 did not work.
I followed the procedure in this link:
Why is evaluation set draining the memory in pytorch hugging…

Tommy
- 19
- 2
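A sketch of mitigations that often help here, assuming the OOM comes from the Trainer accumulating full vocabulary-sized logits for compute_metrics (model, train_ds, and eval_ds are placeholders):

import torch
from transformers import Trainer, TrainingArguments

# Shrink logits to argmax ids before they are accumulated for compute_metrics;
# this often fixes eval-time OOM even when training fits in memory.
def preprocess_logits_for_metrics(logits, labels):
    if isinstance(logits, tuple):      # some models return (logits, past, ...)
        logits = logits[0]
    return logits.argmax(dim=-1)       # keep token ids, drop the vocab dimension

args = TrainingArguments(
    output_dir="out",
    per_device_eval_batch_size=1,      # worth lowering independently of the train batch size
    eval_accumulation_steps=1,         # move accumulated tensors to CPU every step
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)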
1
vote
1 answer
Validation and Training Loss when using Hugging Face
I cannot find an explanation of how the validation and training losses are calculated when fine-tuning a model with the Hugging Face Trainer. Does anyone know where to find this information?

tt40kiwi
- 361
- 1
- 8
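As a rough sketch (not the actual Trainer source): when the batch contains labels, the Trainer uses the loss the model itself returns from forward().

outputs = model(**batch)   # e.g. BertForSequenceClassification computes cross-entropy internally
loss = outputs.loss
# The logged training loss is this value averaged over the logging interval;
# the reported eval loss is the same model loss averaged over the evaluation set.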
1
vote
0 answers
Invalid key: 409862 is out of bounds for size 0
How can I fix this?
I wrote code to train GPT-2 on a dataset with Hugging Face, but I get an error and don't know why:
---------------------------------------------------------------------------
IndexError …

Vovancho
- 11
- 2
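This error often means the Trainer silently dropped every dataset column because none matched the model's forward() signature, leaving an empty dataset that the sampler still indexes. A sketch of the two usual fixes (raw_dataset and tokenizer are placeholders):

from transformers import TrainingArguments

# 1) Make sure tokenization actually adds model inputs such as input_ids:
tokenized = raw_dataset.map(lambda ex: tokenizer(ex["text"]), batched=True)

# 2) Or stop the Trainer from pruning columns and handle them in your collator:
args = TrainingArguments(output_dir="out", remove_unused_columns=False)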
1
vote
1 answer
IndexError when fine-tuning an Alpaca fine-tuned model
I’m relatively new to Hugging Face, and I’m facing an error I’m not able to debug when trying to fine-tune a Vigogne model on my own data.
First, some context:
I’m running everything in a Jupyter Notebook on AWS SageMaker (Instance…

Marc.ad
- 47
- 6
1
vote
1 answer
How to continue training with the Hugging Face Trainer?
When training a model with the Hugging Face Trainer object, e.g. from https://www.kaggle.com/code/alvations/neural-plasticity-bert2bert-on-wmt14
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments
import os
os.environ["WANDB_DISABLED"] =…

alvas
- 115,346
- 109
- 446
- 738
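Resuming is built into Trainer.train(); a minimal sketch, assuming checkpoints were saved under the same output_dir (the checkpoint path is a placeholder):

# Resume from the most recent checkpoint in output_dir:
trainer.train(resume_from_checkpoint=True)
# Or pin a specific checkpoint directory:
trainer.train(resume_from_checkpoint="out/checkpoint-500")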
0
votes
0 answers
A partial-ized forward method on a torch model does not work well with multi-GPU jobs
I am trying to understand why re-assigning the forward method of a PyTorch model object leads to the following error in a multi-GPU prediction job (configured automatically by the Hugging Face Trainer):
RuntimeError: Expected all tensors to be on the…

John Jiang
- 827
- 1
- 9
- 19
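A hedged sketch of one workaround: under nn.DataParallel a monkey-patched bound forward keeps pointing at the GPU-0 instance, so baking the extra arguments into a wrapper module (which is replicated per device) avoids the cross-device mismatch. ForwardWithDefaults is a hypothetical helper.

import torch.nn as nn

class ForwardWithDefaults(nn.Module):
    def __init__(self, model, **defaults):
        super().__init__()
        self.model = model          # registered as a submodule, so it is replicated per GPU
        self.defaults = defaults    # plain Python kwargs, copied along with each replica

    def forward(self, *args, **kwargs):
        # Equivalent to functools.partial(model.forward, **defaults), but replica-safe.
        return self.model(*args, **{**self.defaults, **kwargs})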
0
votes
0 answers
Llama+LoRA: training loss drops straight to 0 on the full dataset (~14k) but is fine on sample data (10 samples)
I am trying to fine-tune the LLaMA model with Low-Rank Adaptation (LoRA) based on Hugging Face.
When I train the model on the full dataset (~14k examples), the training loss drops to 0 and stays at 0 from epoch 2 onward.
[plots: train loss (full), eval loss (full)]
But the loss trend…

a7777777
- 1
- 1
0
votes
0 answers
After training the model using SFT, how do I load the model?
I have trained the model with the following code.
from datasets import load_dataset
from trl import SFTTrainer
from transformers import AutoModel, DataCollatorForLanguageModeling, AutoTokenizer, TrainingArguments
from peft import LoraConfig
#…

金坤东
- 1
- 1
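With a LoraConfig, SFTTrainer saves only the adapter weights, so reloading means attaching them to the base model; a minimal sketch using peft's PeftModel ("base-model-name" and the checkpoint path are placeholders):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model first, then attach the trained adapter from your output directory.
base = AutoModelForCausalLM.from_pretrained("base-model-name")
model = PeftModel.from_pretrained(base, "out/checkpoint-final")
tokenizer = AutoTokenizer.from_pretrained("base-model-name")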
0
votes
1 answer
Fine-tuning a multiclass multilabel wav2vec2 model with transformers
I have managed to adapt the Hugging Face audio classification tutorial to my own dataset:
https://github.com/mirix/messaih/blob/main/charts/fine_tune_w2v.py
I can now fine-tune a wav2vec model on my dataset. I am currently fine-tuning a classifier on…

mirix
- 511
- 1
- 5
- 13
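One common approach for multi-label classification is to keep the Trainer but override compute_loss with a BCE-with-logits loss over float multi-hot labels; a hedged sketch (the override signature matches older transformers releases and may gain extra kwargs in newer ones):

import torch
from transformers import Trainer

class MultiLabelTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")                # shape (batch, num_labels), floats in {0, 1}
        outputs = model(**inputs)
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            outputs.logits, labels.float()
        )
        return (loss, outputs) if return_outputs else loss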
0
votes
0 answers
How to load LoRA weights saved locally?
I am currently training a model and have saved the checkpoints for the LoRA adapters. I now have the .bin and .config files for the adapters. How do I reload everything for inference without pushing to Hugging Face? Most of the documentation talks…

ASierra
- 11
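PeftModel.from_pretrained also accepts a local directory containing adapter_config.json and the adapter weights, so no Hub push is needed; a minimal sketch ("base-model-name" and the adapter directory are placeholders):

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-name")
model = PeftModel.from_pretrained(base, "./lora-checkpoint")   # local adapter directory

# Optionally fold the adapter into the base weights for plain inference:
merged = model.merge_and_unload()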
0
votes
0 answers
How to use the CER on the validation data when training the model using the Trainer API?
I am using the Hugging Face Trainer API to fine-tune an ASR model, e.g. https://huggingface.co/openai/whisper-tiny
In a callback function, I call the evaluate API to calculate the CER metric.

Ramraj Chandradevan
- 141
- 2
- 10
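A sketch of the usual alternative: compute CER in compute_metrics instead of a callback, with predict_with_generate=True so predictions are generated token ids (tokenizer is the model's tokenizer/processor, assumed prepared elsewhere):

import evaluate
from transformers import Seq2SeqTrainingArguments

cer_metric = evaluate.load("cer")

def compute_metrics(pred):
    pred_ids, label_ids = pred.predictions, pred.label_ids
    label_ids[label_ids == -100] = tokenizer.pad_token_id   # undo the label masking
    pred_str = tokenizer.batch_decode(pred_ids, skip_special_tokens=True)
    label_str = tokenizer.batch_decode(label_ids, skip_special_tokens=True)
    return {"cer": cer_metric.compute(predictions=pred_str, references=label_str)}

args = Seq2SeqTrainingArguments(
    output_dir="out",
    predict_with_generate=True,     # predictions come from generate(), not raw logits
    evaluation_strategy="epoch",
)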
0
votes
0 answers
Example of how to implement a custom Trainer in transformers
Using the PyTorch and transformers libraries, I am trying to use bert-base-cased for a regression task.
This is how I implement the dataset:
class CustomDataset(Dataset):
    def __init__(self, data, maxlen, tokenizer, target_cols):
        self.df =…

JayJona
- 469
- 1
- 16
- 41
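For regression, a custom Trainer is often unnecessary: with num_labels=1 and float labels, the sequence-classification head already trains with MSELoss. A sketch of both that and the compute_loss override, in case custom behavior is wanted:

import torch
from transformers import AutoModelForSequenceClassification, Trainer

# Option 1: num_labels=1 makes the model treat the task as regression (MSE).
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=1)

# Option 2: override compute_loss for full control over the loss function.
class RegressionTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss = torch.nn.functional.mse_loss(outputs.logits.squeeze(-1), labels.float())
        return (loss, outputs) if return_outputs else loss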