OpenAI Fine-tunes API: Why would I use LlamaIndex or LangChain instead of fine-tuning a model?

Question

I'm just getting started with LLMs, particularly OpenAI's and other OSS models. There are a lot of guides on using LlamaIndex to create a store of all your documents and then query them. I tried it out with a few sample documents, but discovered that each query gets super expensive quickly. I think I used a 50-page PDF document, and a summarization query cost me around 1.5 USD per query. I see there are a lot of tokens being sent across, so I'm assuming it's sending the entire document for every query. Given that someone might want to use thousands or millions of records, I can't see how something like LlamaIndex can really be that useful in a cost-effective manner.

On the other hand, I see OpenAI allows you to train a ChatGPT model. Wouldn't that, or using other custom trained LLMs, be much cheaper and more effective to query over your own data? Why would I ever want to set up LlamaIndex?

Rok Benko · Accepted Answer · 2023-07-02T17:13:18.770

Why choose LlamaIndex or LangChain over fine-tuning a model?

The answer is simple, but you couldn't answer it yourself because you were only looking at the costs. Cost is not the only aspect: take a look at the usability side of the question as well.

Fine-tuning a model will give the model additional general knowledge, but the fine-tuned model will not (necessarily) give you an exact answer (i.e., a fact) to a specific question. For example, people fine-tune an OpenAI model on some data, but when they ask it something related to that data, they are surprised that the model doesn't answer with the knowledge gained by fine-tuning, as explained on the official OpenAI forum by @juan_olano:

I fine-tuned a 70K-word book. My initial expectation was to have the desired QA, and at that point I didn’t know any better. But this fine-tuning showed me the limits of this approach. It just learned the style and stayed more or less within the corpus, but hallucinated a lot.

Then I split the book into sentences and worked my way through embeddings, and now I have a very decent QA system for the book, but for narrow questions. It is not as good for questions that need the context of the entire book.

LlamaIndex and LangChain enable you to connect OpenAI models with your existing data sources. For example, a company has a bunch of internal documents with various instructions, guidelines, rules, etc. LlamaIndex or LangChain can be used to query all those documents and give an exact answer to an employee who asks a question.

OpenAI's text-generation models can't query their knowledge. Such a model gives an answer based on the statistical probability of which token should follow the previous ones. To be able to do so, it needs to be trained on a large and varied body of data. Querying, by contrast, requires calculating embedding vectors and comparing them with cosine similarity, which a text-generation model does not do on its own. I strongly suggest you read my previous answer regarding semantic search. You'll understand this answer better.
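To make the querying step concrete, here is a minimal sketch of retrieval by cosine similarity. It is not how LlamaIndex or LangChain are implemented. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and `top_k`, `build_prompt`, and the sample chunks are hypothetical names and data made up for illustration. The point it shows is that only the best-matching chunks need to go into the prompt, rather than the whole document set.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query (hypothetical helper)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 1) -> str:
    """Put only the retrieved chunks into the prompt sent to the model."""
    context = "\n".join(top_k(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up internal documents, already split into chunks.
chunks = [
    "Employees may work remotely up to three days per week.",
    "Expense reports must be submitted within 30 days.",
    "The office kitchen is cleaned every Friday.",
]

if __name__ == "__main__":
    print(build_prompt("How many days can employees work remotely?", chunks))
```

In a real setup, `embed` would be replaced by a call to an embedding model and the chunk vectors would be computed once and stored, so that each query only costs one embedding call plus the tokens of the few retrieved chunks.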

To sum up:

Use fine-tuning to add some additional general knowledge to the OpenAI model.
Use LlamaIndex or LangChain to get an exact answer (i.e., a fact) to a specific question from existing data sources.

Oh wow, this is the most succinct and clear explanation of embeddings and LLamaIndex I've seen so far. Thank you so much for this! — Curunir The Colorful, Jul 20 '23 at 08:17
