Questions tagged [peft]
16 questions
4
votes
1 answer
Target modules for applying PEFT / LoRA on different models
I am looking at a few different examples of using PEFT on different models. The LoraConfig object contains a target_modules array. In some examples, the target modules are ["query_key_value"], sometimes it is ["q", "v"], sometimes something else.
I…

ahron
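For the target_modules question above, one way to see which names a given model offers is to list the leaf names of its linear layers. The sketch below is only an illustration, not the library's own method. The helper `candidate_target_modules` and the stand-in classes are hypothetical, and it assumes module names arrive as (dotted_name, module) pairs, as `model.named_modules()` yields on a PyTorch model.

```python
from collections import Counter


def candidate_target_modules(named_modules, layer_marker="Linear"):
    """Count leaf module names whose class name contains `layer_marker`.

    `named_modules` is any iterable of (dotted_name, module) pairs,
    for example the output of `model.named_modules()`.
    """
    counts = Counter()
    for name, module in named_modules:
        if layer_marker in type(module).__name__:
            # "model.layers.0.self_attn.q_proj" -> "q_proj"
            counts[name.rsplit(".", 1)[-1]] += 1
    return counts


# Hypothetical stand-ins so the sketch runs without loading a real model.
class Linear:
    pass


class LayerNorm:
    pass


fake_modules = [
    ("model.layers.0.self_attn.q_proj", Linear()),
    ("model.layers.0.self_attn.v_proj", Linear()),
    ("model.layers.0.input_layernorm", LayerNorm()),
    ("model.layers.1.self_attn.q_proj", Linear()),
    ("model.layers.1.self_attn.v_proj", Linear()),
]

found = candidate_target_modules(fake_modules)
print(sorted(found))  # ['q_proj', 'v_proj']
```

On a real model the call would be `candidate_target_modules(model.named_modules())`. The result may also include an output head, which is usually not a LoRA target, so the list is a starting point rather than a final answer.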
4
votes
2 answers
How to load a fine-tuned peft/lora model based on llama with Huggingface transformers?
I've followed this tutorial (colab notebook) in order to finetune my model.
Trying to load my locally saved model
model = AutoModelForCausalLM.from_pretrained("finetuned_model")
yields Killed.
Trying to load model from hub:
yields
import…

Lucas Azevedo
2
votes
1 answer
Further finetune a Peft/LoRA finetuned CausalLM Model
I am a bit unsure how to proceed regarding the mentioned topic.
The baseline is a model created with Hugging Face's library as an AutoModelForCausalLM, fine-tuned with PEFT using LoRA, with the weights subsequently merged.
I now want to further fine…

Julian Gerhard
1
vote
1 answer
Llama QLora error: Target modules ['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h'] not found in the base model
EDIT:
solved by removing target_modules
I tried to load the Llama-2-7b-hf LLM with QLoRA using the following code:
model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_auth_token=True) # I have permissions.
model…

Ofir
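The error in the question above appears when none of the requested names occur in the model. A small pre-check can show which targets have no match before the config is built. This is a rough sketch: `unmatched_targets` is a hypothetical helper, the suffix-matching rule is an assumption about how names are compared, and the Llama-style module names are illustrative rather than read from a checkpoint.

```python
def unmatched_targets(module_names, target_modules):
    """Return the requested targets that match no module name.

    A target counts as matched when a module name equals it or ends with
    "." + target (an assumed matching rule, for illustration only).
    """
    names = list(module_names)
    return [
        target
        for target in target_modules
        if not any(n == target or n.endswith("." + target) for n in names)
    ]


# Illustrative module names in the style of a Llama checkpoint (assumption).
llama_like = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
]

requested = ["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"]
print(unmatched_targets(llama_like, requested))
# ['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h']
```

With a real model, `module_names` could be `[name for name, _ in model.named_modules()]`. An empty result means every requested target has at least one match.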
1
vote
0 answers
Getting CUDA out of memory when calling save_pretrained in a script that tries lora training a large language model using huggingface
I am trying to train a Llama LLM ("eachadea/vicuna-13b-1.1") using LoRA on a LambdaLabs A100 40 GB.
Everything seems to be working fine including the training, however the script fails on the last line:…

Ray Hulha
1
vote
1 answer
big_modeling.py not finding the offload_dir
I'm trying to load a large model on my local machine and offload some of the compute to my CPU, since my GPU isn't great (MacBook Air M2). Here's my code:
from peft import PeftModel
from transformers import AutoTokenizer, GPTJForCausalLM,…

Matthew Berman
0
votes
1 answer
How to load a fine-tuned model like Alpaca-LoRA (PeftModel()) directly from local files instead of from Hugging Face models?
I have fine-tuned a Llama model using low-rank adaptation (LoRA) with the peft package. The resulting files, adapter_config.json and adapter_model.bin, are saved.
I can load the fine-tuned model from Hugging Face with the following code:
model =…

a7777777
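Before pointing a loader at a local folder, it can help to confirm that the two files named in the question above are actually there. The following is a minimal sketch using only the standard library. `missing_adapter_files` is a hypothetical helper, and the expected file names are taken from the question; newer PEFT releases may write adapter_model.safetensors instead, so treat the list as an assumption.

```python
import tempfile
from pathlib import Path

# File names as mentioned in the question; adjust if your version differs.
EXPECTED_FILES = ("adapter_config.json", "adapter_model.bin")


def missing_adapter_files(adapter_dir, expected=EXPECTED_FILES):
    """Return the expected adapter files that are absent from `adapter_dir`."""
    root = Path(adapter_dir)
    return [name for name in expected if not (root / name).is_file()]


# Demo on a temporary folder that contains only the config file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "adapter_config.json").write_text("{}")
    demo_missing = missing_adapter_files(tmp)

print(demo_missing)  # ['adapter_model.bin']
```

If the list comes back empty, the same local path can then be passed where a hub id would otherwise go, for example as the second argument to `PeftModel.from_pretrained(base_model, adapter_dir)`.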
0
votes
0 answers
Questions about distributed finetuning of transformers model (chatglm) with Accelerate in Kaggle GPUs
I am trying to finetune the chatglm-6b model using LoRA with transformers and peft in Kaggle GPUs (2*T4). The model structure:
The traditional loading method (AutoModel.from_pretrained) needs to load the model itself (15 GB) onto CPU first, whereas…

LocustNymph
0
votes
0 answers
How to load the finetuned model (merged weights) on colab?
I have fine-tuned the Llama 2 model, reloaded the base model and merged the LoRA weights. I then saved the merged model and now intend to run it.
base_model = AutoModelForCausalLM.from_pretrained(
model_name,
…

Gaurav Gupta
0
votes
0 answers
Combine base model with my Peft adapters to generate new model
I am trying to merge my fine-tuned adapters into the base model with this:
torch.cuda.empty_cache()
del model
pre_trained_model_checkpoint = "databricks/dolly-v2-3b"
trained_model_chekpoint_output_folder =…

Hanzo
0
votes
0 answers
LoRA fine-tuning taking too long
Any reason why this is giving me a month of expected processing time?
More importantly, how to speed this up?
My dataset is a collection of 20k short sentences (max 100 words each).
import transformers
import torch
model_id =…

Lucas Azevedo
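A quick back-of-envelope calculation can show whether a "month" estimate like the one in the question above is plausible for the dataset size. In this rough sketch only the 20k example count comes from the question. The batch size, epoch count and seconds per step are made-up assumptions, and `estimate_training_hours` is a hypothetical helper.

```python
import math


def estimate_training_hours(num_examples, batch_size, epochs, seconds_per_step):
    """Estimate wall-clock hours as steps_per_epoch * epochs * seconds_per_step."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    return total_steps * seconds_per_step / 3600


# 20k examples (from the question); the remaining values are assumptions.
hours = estimate_training_hours(
    num_examples=20_000,
    batch_size=4,
    epochs=3,
    seconds_per_step=2.0,
)
print(round(hours, 1))  # 8.3
```

If the progress bar reports a figure far above an estimate like this, the measured seconds per step is the number to investigate, for example whether the model is running on CPU or being offloaded.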
0
votes
0 answers
HuggingFace - Load/save PeftConfig as json
I am fine-tuning a Hugging Face model on my own data using LoRA. However, I do not want to upload the files to Hugging Face, but store them on my local computer. This works for the tokenizer and the model, however the LoraConfig…

Andi Maier
0
votes
0 answers
Error with get_peft_model() and PromptTuningConfig
I am learning how to perform Prompt Tuning and am running into a problem.
I am using the get_peft_model function to initialize a model for training from 'google/flan-t5-base'
model_name='google/flan-t5-base'
tokenizer =…

David Makovoz
0
votes
1 answer
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' - PEFT Huggingface trying to run on CPU
I am relatively new to LLMs and trying to catch up. Following an example, I modified the code a bit so that everything runs locally on an EC2 instance. Training went OK on CPU only (27 hours); I saved the model, tokenizer and configs to…

maop
0
votes
0 answers
How to improve the output of fine tuned Open Llama 7b model for text generation?
I am trying to fine-tune an OpenLLaMA model with Hugging Face's peft and LoRA. I fine-tuned the model on a specific dataset. However, the output from model.generate() is very poor for the given input. When I give a whole sentence from the dataset…

Md Tahmid Hasan Fuad