
OpenAI now allows us to fine-tune GPT-3.5 models. I have fine-tuned a model with my own dataset and tested it, but the problem is that the fine-tuned model generates answers randomly instead of answering correctly based on my custom dataset.

Is there any way to make the model answer only from my own fine-tuned dataset?

DangTri
  • Here's something that might assist you: an implementation using LangChain, [PrivateDocBot](https://github.com/Abhi5h3k/PrivateDocBot) – Abhi Aug 27 '23 at 15:20

1 Answer


This is a completely wrong approach (as you've already figured out).

As stated in the official OpenAI documentation:

Some common use cases where fine-tuning can improve results:

  • Setting the style, tone, format, or other qualitative aspects
  • Improving reliability at producing a desired output
  • Correcting failures to follow complex prompts
  • Handling many edge cases in specific ways
  • Performing a new skill or task that’s hard to articulate in a prompt

Fine-tuning is not about answering a specific question with a specific answer from the fine-tuning dataset.
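
For illustration, a gpt-3.5-turbo fine-tuning example is a chat-format conversation, with one JSON object per line of a .jsonl file. The sketch below uses made-up wording and a made-up file name; examples like this teach the model how to respond, not facts it will later retrieve verbatim.

```python
import json

# One made-up fine-tuning example in the chat format used for gpt-3.5-turbo
# fine-tuning; a real dataset is a .jsonl file with one such object per line.
example = {
    "messages": [
        {"role": "system", "content": "You are a friendly support assistant."},
        {"role": "user", "content": "When are you open?"},
        {"role": "assistant", "content": "We're open Monday to Friday, 9 am to 5 pm."},
    ]
}

with open("training_data.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")
```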

What you need to implement is a semantic search based on embeddings, as stated in the official OpenAI documentation:

When should I use fine-tuning vs embeddings with retrieval?

Embeddings with retrieval is best suited for cases when you need to have a large database of documents with relevant context and information.

By default OpenAI’s models are trained to be helpful generalist assistants. Fine-tuning can be used to make a model which is narrowly focused, and exhibits specific ingrained behavior patterns. Retrieval strategies can be used to make new information available to a model by providing it with relevant context before generating its response. Retrieval strategies are not an alternative to fine-tuning and can in fact be complementary to it.

You have two options:

  1. Custom solution (see my past answer); a minimal sketch follows this list.
  2. Using LlamaIndex or LangChain.
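
To give a rough idea of option 1, here is a minimal sketch of embeddings-based retrieval using the openai Python package. It assumes openai v1.x; the documents, model choices, and helper names are illustrative, not from the original answer.

```python
# Minimal sketch of embeddings-based retrieval (not a production RAG pipeline).
# Assumes openai>=1.0 and an OPENAI_API_KEY in the environment.
from openai import OpenAI
import numpy as np

client = OpenAI()

# Illustrative "knowledge base"; in practice this would be your own documents,
# chunked and stored in a vector database.
documents = [
    "Our support line is open Monday to Friday, 9 am to 5 pm.",
    "Refunds are processed within 14 days of the request.",
]

def embed(texts):
    # Turn a list of strings into embedding vectors.
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(documents)

def answer(question):
    # 1. Embed the question and pick the most similar document (cosine similarity).
    q = embed([question])[0]
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = documents[int(np.argmax(scores))]
    # 2. Ask the model to answer using only the retrieved context.
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content

print(answer("How long do refunds take?"))
```

LlamaIndex and LangChain (option 2) wrap this same pattern (chunking, embedding, vector search, and prompt construction) behind higher-level abstractions.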
Rok Benko