Questions tagged [fine-tune]

156 questions
9
votes
1 answer

OpenAI GPT-3 API: Fine-tune a fine-tuned model?

The OpenAI documentation for the model attribute in the fine-tune API states, a bit confusingly: "model: The name of the base model to fine-tune. You can select one of 'ada', 'babbage', 'curie', 'davinci', or a fine-tuned model created after…"
8
votes
1 answer

How do I make sure answers are from a customized (fine-tuning) dataset?

I'm using customized text with 'Prompt' and 'Completion' to train a new model. Here's the tutorial I used to create a customized model from my data: beta.openai.com/docs/guides/fine-tuning/advanced-usage. However, even after training the model and sending…
Moshe
  • 208
  • 4
  • 13
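A prompt/completion fine-tuning dataset like the one described above is just a JSONL file, one JSON object per line. A minimal sketch of preparing one, with invented example records (the "###" separator and "END" stop sequence follow the conventions in OpenAI's fine-tuning guide, not hard requirements):

```python
import json

# Hypothetical training records; in practice these come from your own data.
records = [
    {"prompt": "Q: What is our refund window?\n\n###\n\n",
     "completion": " 30 days END"},
    {"prompt": "Q: Do you ship internationally?\n\n###\n\n",
     "completion": " Yes, to the EU and UK END"},
]

with open("data.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        # One JSON object per line: the format that
        # `openai tools fine_tunes.prepare_data` expects.
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```

A file in this shape can then be passed to the preparation tool, which validates it and suggests fixes.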
5
votes
2 answers

Can I clear up GPU VRAM in Colab?

I'm trying to use aitextgen to fine-tune the 774M GPT-2 on a dataset. Unfortunately, no matter what I do, training fails because there are only 80 MB of VRAM available. How can I clear the VRAM without restarting the runtime, and maybe prevent the VRAM…
Blazeolmo 343
  • 51
  • 1
  • 2
5
votes
0 answers

Fine-tuning BERT sentence transformer model

I am using a pre-trained BERT sentence transformer model, as described here: https://www.sbert.net/docs/training/overview.html, to get embeddings for sentences. I want to fine-tune these pre-trained embeddings, and I am following the instructions in…
Fiori
  • 181
  • 1
  • 12
4
votes
1 answer

Target modules for applying PEFT / LoRA on different models

I am looking at a few different examples of using PEFT on different models. The LoraConfig object contains a target_modules array. In some examples, the target modules are ["query_key_value"], sometimes it is ["q", "v"], sometimes something else. I…
ahron
  • 803
  • 6
  • 29
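The target_modules entries are simply the attribute names of the attention projection layers, and those differ per architecture: GPT-NeoX-style models fuse them into query_key_value, T5 uses q/k/v/o, LLaMA uses q_proj/v_proj, and so on. One way to find candidates is to scan the model's module names for those patterns; a sketch over a plain list of names (a real run would iterate model.named_modules() instead of this hypothetical list):

```python
# Leaf-attribute names that commonly denote attention projections.
ATTN_KEYWORDS = ("q_proj", "k_proj", "v_proj", "o_proj",
                 "query", "key", "value", "query_key_value")

def candidate_target_modules(module_names):
    """Return the unique leaf names that look like attention
    projections, suitable as a LoraConfig target_modules list."""
    found = set()
    for name in module_names:
        leaf = name.rsplit(".", 1)[-1]
        if leaf in ATTN_KEYWORDS:
            found.add(leaf)
    return sorted(found)

# Example with LLaMA-style module names:
names = ["model.layers.0.self_attn.q_proj",
         "model.layers.0.self_attn.v_proj",
         "model.layers.0.mlp.gate_proj"]
print(candidate_target_modules(names))  # ['q_proj', 'v_proj']
```

Which of the discovered projections to actually target (all four, or just query and value) is a tuning choice; the original LoRA paper adapted only the query and value projections.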
4
votes
1 answer

Difference between instruction tuning and non-instruction tuning of large language models

What is the difference between instruction tuning and normal fine-tuning for large language models? Also, the instruction tuning I'm referring to isn't the in-context/prompt kind. All the recent papers about fine-tuning seem to be about instruction…
Flo
  • 51
  • 1
  • 4
4
votes
2 answers

OpenAI Chat Completions API: How do I customize answers from GPT-3.5 or GPT-4 models if I can't fine-tune them?

We have seen some companies use GPT-3.5 or GPT-4 models to train on their own data and provide customized answers. But GPT-3.5 and GPT-4 models are not available for fine-tuning. I've seen the document from OpenAI about this issue, but I had seen…
Lucien
  • 43
  • 3
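The usual approach when fine-tuning is unavailable is retrieval augmentation: embed your documents, retrieve the ones most similar to each question, and include them in the prompt. The retrieval step reduces to a nearest-neighbour search over embedding vectors; a toy sketch with made-up 3-dimensional vectors standing in for real API embeddings (which have hundreds of dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Hypothetical document embeddings; real ones come from an embeddings API.
docs = {
    "returns policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
}
query = [0.85, 0.15, 0.05]  # embedding of the user's question

best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # returns policy
```

The retrieved document text is then prepended to the chat prompt as context, so the model answers from your data without any weight updates.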
4
votes
2 answers

Fine-Tuning GPT2 - attention mask and pad token id errors

I have been trying to fine-tune GPT2 on the wikitext-2 dataset (just to help myself learn the process) and I am running into a warning message that I have not seen before: "The attention mask and the pad token id were not set. As a consequence, you…
Toakley
  • 182
  • 3
  • 13
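GPT-2 ships without a pad token, so the common fix for that warning is setting tokenizer.pad_token = tokenizer.eos_token and passing an explicit attention_mask. The mask itself is just a 0/1 marker over padded positions; a library-free sketch of the idea, with invented token ids (real pipelines get the mask straight from the tokenizer):

```python
def build_attention_mask(token_ids, pad_id):
    """1 for real tokens, 0 for padding: what the attention_mask
    argument encodes."""
    return [0 if t == pad_id else 1 for t in token_ids]

# GPT-2 commonly reuses its eos token (id 50256) as the pad token.
PAD_ID = 50256
batch = [
    [464, 3290, 318, PAD_ID, PAD_ID],  # padded sequence
    [464, 2068, 7586, 21831, 11687],   # full-length sequence
]
masks = [build_attention_mask(seq, PAD_ID) for seq in batch]
print(masks)  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

Note the caveat with reusing eos as pad: a genuine trailing eos would be masked too, which is why the tokenizer's own padding logic is preferred in practice.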
4
votes
3 answers

Can I create a fine-tuned model for OpenAI API Codex models?

I'd like to translate user requests into tickets in some sort of structured data format, e.g. JSON. For example: User: I want to order two chairs and a desk with three drawers on the left side. Output: { "type": "furniture", "items": [ …
xaxa
  • 1,057
  • 1
  • 24
  • 53
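Whatever model produces the ticket, it helps to verify that the completion really is the structured format requested. A minimal sketch of parsing and validating the output; the schema fields mirror the example in the question, and the sample string is invented:

```python
import json

def parse_ticket(completion_text):
    """Parse a model completion into a ticket dict, checking the
    fields downstream code relies on."""
    ticket = json.loads(completion_text)  # raises on non-JSON output
    if "type" not in ticket or "items" not in ticket:
        raise ValueError("completion is missing required ticket fields")
    return ticket

raw = '{"type": "furniture", "items": [{"name": "chair", "qty": 2}]}'
ticket = parse_ticket(raw)
print(ticket["type"], len(ticket["items"]))  # furniture 1
```

On a parse or validation failure, a simple recovery strategy is to re-ask the model with the error message appended to the prompt.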
4
votes
1 answer

Encoding issues on OpenAI predictions after fine-tuning

I'm following this OpenAI tutorial about fine-tuning. I already generated the dataset with the openai tool. The problem is that the output encoding (inference result) is mixing UTF-8 with non-UTF-8 characters. The generated model looks like…
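Mixed output like that is often classic mojibake: UTF-8 bytes decoded as Latin-1 somewhere in the pipeline. When that is the cause, a re-encoding round trip recovers the text; a sketch (the sample string is invented):

```python
def fix_mojibake(text):
    """Undo a UTF-8-read-as-Latin-1 round trip, leaving already
    clean text untouched."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not mojibake, or not this flavour of it

print(fix_mojibake("caf\u00c3\u00a9"))  # café
```

If the garbling appears only in the completions, the cleaner fix is to ensure the training JSONL is written and read as UTF-8 end to end, rather than repairing outputs after the fact.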
3
votes
1 answer

Getting a missing-pandas error while trying to fine-tune GPT-3

I'm using the following command: openai tools fine_tunes.prepare_data -f ./data.jsonl and I'm getting the following error: Analyzing... Traceback (most recent call last): File "/Users/jyothiraditya/mambaforge/bin/openai", line 8, in
JYOTHIR
  • 51
  • 2
3
votes
0 answers

Huggingface: Fine-tuning (not enough values to unpack (expected 2, got 1))

I'm trying to fine-tune erfan226/persian-t5-paraphraser paraphrase generator model for Persian sentences. I used the Persian dataset of tapaco and reformatted it to match the glue (mrpc) dataset which is used in the fine-tuning documentation. I have…
3
votes
1 answer

How can I fine-tune a model from OpenAI's Whisper ASR on my own training data?

I use OpenAI's Whisper Python library for speech recognition. I have some training data: either text only, or audio plus corresponding transcription. How can I fine-tune a model from OpenAI's Whisper ASR on my own training data?
3
votes
1 answer

EasyOCR - Table extraction

I use easyocr to extract tables from photos or scanned PDFs, but I have a problem fine-tuning the extracted data into a table. I try to make a searchable PDF according to the extracted coordinates, but when I convert it to CSV the lines are not aligned. I would…
mahya
  • 31
  • 1
  • 2
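Misaligned CSV rows usually come from treating each detected box as its own line. A common fix is to group boxes into rows by vertical proximity before sorting left to right; a sketch with invented (x, y, text) tuples (real easyocr results give four corner points per box, from which a representative y can be taken):

```python
def boxes_to_rows(boxes, y_tol=10):
    """Group OCR boxes (x, y, text) into table rows by y proximity,
    then sort each row left-to-right."""
    rows = []
    for x, y, text in sorted(boxes, key=lambda b: b[1]):
        # Same row if close enough (in pixels) to the row's first box.
        if rows and abs(rows[-1][0][1] - y) <= y_tol:
            rows[-1].append((x, y, text))
        else:
            rows.append([(x, y, text)])
    return [[t for _, _, t in sorted(row)] for row in rows]

boxes = [(120, 52, "B1"), (10, 50, "A1"), (10, 90, "A2"), (120, 91, "B2")]
print(boxes_to_rows(boxes))  # [['A1', 'B1'], ['A2', 'B2']]
```

Each inner list can then be written out as one CSV row; y_tol needs tuning to the scan's resolution.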
3
votes
1 answer

What are the differences between fine-tuning and few-shot learning?

I am trying to understand the concepts of fine-tuning and few-shot learning. I understand the need for fine-tuning: it is essentially tuning a pre-trained model to a specific downstream task. However, recently I have seen a plethora of blog posts…
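The practical distinction: fine-tuning updates the model's weights on your examples, while few-shot learning leaves the weights alone and puts the examples in the prompt. The few-shot side is just string assembly; a sketch with invented sentiment examples:

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: demonstrations first, then the
    new query. No weights change; the examples live in the context."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\nInput: {query}\nOutput:"

examples = [("great movie!", "positive"), ("waste of time", "negative")]
print(few_shot_prompt(examples, "loved it"))
```

The trade-off follows directly: few-shot costs context tokens on every call but needs no training run, while fine-tuning pays a one-time training cost to bake the behaviour into the weights.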