Besides LlamaIndex, there is also the more basic combination of a vector database and an LLM. Have a look at Pinecone: https://www.pinecone.io/learn/vector-database/
A vector database stores pieces of text (or pieces of images, sound, etc.) together with a numeric vector. The numeric vector encodes information about the text. A query can be transformed into a numeric vector, too. Given two vectors, there are algorithms for measuring how closely they match (e.g. cosine similarity). So you can search the database for the text that is most relevant, according to the vectors.
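A minimal sketch of that search step, using a hand-made toy "database" (the vectors and texts are invented for illustration; in practice an embedding model produces the vectors):

```python
import math

# Toy "vector database": each entry pairs a text with its embedding vector.
# Real embeddings have hundreds of dimensions; these 3-dimensional vectors
# are hand-made just to show the mechanics.
database = [
    ("Cats are small domesticated mammals.", [0.9, 0.1, 0.0]),
    ("Python is a popular programming language.", [0.1, 0.9, 0.2]),
    ("The stock market fell sharply today.", [0.0, 0.2, 0.9]),
]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vector, db, top_k=1):
    """Return the top_k stored texts, ranked by similarity to the query."""
    ranked = sorted(
        db,
        key=lambda entry: cosine_similarity(query_vector, entry[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# A query about programming would be embedded close to the second entry:
query_vector = [0.2, 0.8, 0.1]
print(search(query_vector, database))
```

Production vector databases use approximate nearest-neighbor indexes instead of this brute-force scan, but the ranking idea is the same.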
Now you can store your "knowledge" as many text/vector pairs. When a query comes in, first retrieve the most relevant context from the vector database, then put the retrieved text in front of the prompt. This way, the LLM always receives the right context knowledge together with the customer query.
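A sketch of that prompt-assembly step. The function names `retrieve` and `build_prompt` are assumptions for illustration, not any specific library's API; `retrieve` is a stub standing in for the vector-database search:

```python
def retrieve(query: str) -> str:
    # Stub: a real implementation would embed the query and search the
    # vector database for the most similar stored texts.
    return "Our support hotline is open Mon-Fri, 9am-5pm."

def build_prompt(query: str) -> str:
    """Prepend retrieved context to the customer query before calling the LLM."""
    context = retrieve(query)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

print(build_prompt("When can I call support?"))
```

The assembled string is what you would send to the LLM; the model then answers from the retrieved context rather than from its training data alone.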
Fine-tuning is overkill for most cases, while plain prompting is simple but limited. Vector databases still rely on prompting, but add a mechanism for finding the most suitable context to include in the prompt, which makes them a powerful intermediate solution.