
I'm trying to leverage the OpenAI GPT API to create a custom site that can make brand recommendations based on users' questions. Some examples of users' questions are:

  • Which coffee brands offer organic or fair trade options?
  • I'm traveling to Hawaii this fall and need help finding lodging and things to do. I'd like to prioritize working with eco-friendly travel brands. Any tips?
  • What is the most eco-friendly toilet paper brand?

Due to resource limitations, I did not take the route of fine-tuning the model on question-and-answer pairs. Instead, I created an index from our own data points about brands. The data include each brand's name, the category it belongs to, its scores calculated by our own metrics, its shopping links, attributes and certifications, and a description of the brand, and I used a formula to connect these data points into a paragraph about each brand (a rough sketch of that step is included after the configuration below). I then used LlamaIndex to create the custom index, with the following configuration:

from langchain.llms import OpenAI
from llama_index import GPTVectorStoreIndex, LLMPredictor, PromptHelper, ServiceContext

# Wrap the completion model; low temperature to keep answers close to the source data
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002"))

max_input_size = 4096        # context window of the model
num_output = 250             # tokens reserved for the generated answer
chunk_overlap_ratio = 0.1    # newer LlamaIndex versions take a ratio instead of max_chunk_overlap tokens

prompt_helper = PromptHelper(max_input_size, num_output, chunk_overlap_ratio)

# Bundle the predictor and prompt helper so from_documents actually picks them up
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, prompt_helper=prompt_helper
)

custom_LLM_index = GPTVectorStoreIndex.from_documents(
    documents, service_context=service_context
)

custom_LLM_index.storage_context.persist(persist_dir="./index")
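
For completeness, the earlier step that builds documents from our brand data points looks roughly like this (the field names and sample values here are illustrative, not our exact schema):

from llama_index import Document

# Illustrative brand record; the real data has more fields and our own computed scores
brands = [
    {
        "name": "Example Brand",
        "category": "coffee",
        "planet_score": 8.2,
        "people_score": 7.9,
        "shopping_link": "https://example.com/shop",
        "attributes": "organic, fair trade certified",
        "description": "A small roaster sourcing certified fair trade beans.",
    },
]

documents = []
for b in brands:
    # The "formula" that connects the data points into one paragraph per brand
    text = (
        f"{b['name']} is a {b['category']} brand. "
        f"Planet score: {b['planet_score']}. People score: {b['people_score']}. "
        f"Attributes and certifications: {b['attributes']}. "
        f"Shopping link: {b['shopping_link']}. "
        f"{b['description']}"
    )
    documents.append(Document(text=text))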

I then used LlamaIndex to let users input their questions and query the created index, and included a prompt that would ideally make the responses contain certain information I want to display. Here's the initial prompt: "For any brands that you recommend, tell me details about the types of products or services they sell and their impact, and give me the brand's shopping link, the brand's planet score and people score, and its URL on [our site] if it exists."
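
For reference, the query side is wired up roughly like this (simplified; similarity_top_k is just an example value, and the instruction string is the prompt quoted above):

from llama_index import StorageContext, load_index_from_storage

# Reload the persisted index (pass service_context here too if a non-default one was used)
storage_context = StorageContext.from_defaults(persist_dir="./index")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine(similarity_top_k=3)

instructions = (
    "For any brands that you recommend, tell me details about the types of products "
    "or services they sell and their impact, and give me the brand's shopping link, "
    "the brand's planet score and people score, and its URL on [our site] if it exists."
)

user_question = "Which coffee brands offer organic or fair trade options?"
response = query_engine.query(f"{instructions}\n\nQuestion: {user_question}")
print(response)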

A big issue that came up with this is that while the responses are generally in the format I want, GPT appears to hallucinate quite often: it recommends brands that are not part of the index and fills in wrong information for the specific attributes asked for in the prompt.

To try to solve this, I added a new condition to the prompt: "Don't justify your answers. Don't give information not mentioned in the CONTEXT INFORMATION. If you can't find a good answer, just say 'I don't have the best answer' and ask users to perform a search on [our site]."
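
One way to apply that condition consistently, instead of prepending it to every user question, is a custom question-answer prompt template; here is a sketch (the wording is just my condition reworked, and the import location can vary across llama_index versions):

from llama_index import QuestionAnswerPrompt

QA_TEMPLATE = (
    "CONTEXT INFORMATION is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Using only the CONTEXT INFORMATION and no prior knowledge, answer the question. "
    "Don't justify your answers. Don't give information not mentioned in the CONTEXT "
    "INFORMATION. If you can't find a good answer, say you don't have the best answer "
    "and suggest a search on [our site].\n"
    "Question: {query_str}\n"
)

query_engine = index.as_query_engine(text_qa_template=QuestionAnswerPrompt(QA_TEMPLATE))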

This seems to help with the hallucinations, but at the same time the model also stops giving good answers from information that is included in the index. For example, when asked a question like "What is a good sunglasses brand with a positive social impact?", it responds "I don't know".

What I am really curious about is how to improve the model's responses given the limitations I have, so that hallucinations are limited but the model can still provide accurate information that is included in the index.

  • Does this answer your question? [How to restrict llama_index queries to respond only from local data](https://stackoverflow.com/questions/76017774/how-to-restrict-llama-index-queries-to-respond-only-from-local-data) – Trajanov Risto Jun 30 '23 at 22:52
