
Following https://gpt-index.readthedocs.io/en/latest/guides/tutorials/building_a_chatbot.html, we wrote a chatbot to index our reference materials, and it works fine. The biggest issue is that the bot sometimes responds to questions using its own knowledge, which lies outside the reference manuals.

While this is sometimes helpful, there are situations where such answers are completely wrong in the context of our reference materials.

Is there a way to restrict the bot to answering only from the indexes we created over our own documents, and use the LLM just to format the response in a conversational way?

Ishan Hettiarachchi

2 Answers


You can try evaluating your result with BinaryResponseEvaluator, which will give you a Yes or No depending on whether any of the source nodes were used in your response. The documentation says:

This allows you to measure hallucination - if the response does not match the retrieved sources, this means that the model may be "hallucinating" an answer since it is not rooting the answer in the context provided to it in the prompt.

from llama_index import GPTVectorStoreIndex, LLMPredictor, ServiceContext
from llama_index.evaluation import ResponseEvaluator
from langchain.chat_models import ChatOpenAI

# build service context
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# build index (elided; assume it produces the `vector_index` used below)
...

# define evaluator
evaluator = ResponseEvaluator(service_context=service_context)

# query index
query_engine = vector_index.as_query_engine()
response = query_engine.query("What battles took place in New York City in the American Revolution?")
eval_result = evaluator.evaluate(response)
print(str(eval_result))
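
If you want to act on the result at runtime, you could fall back to a refusal whenever the evaluator flags the response. A minimal sketch, assuming evaluate returns the string "YES" or "NO" as the documentation of that version describes:

# Sketch: suppress answers the evaluator could not ground in the sources.
if str(eval_result).strip().upper() == "NO":
    print("Sorry, I could not find that in the reference materials.")
else:
    print(str(response))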

My other suggestion would be to write a custom question-answering prompt that instructs the model to answer only from the provided context. For example:

QA_PROMPT_TMPL = (
    "We have provided context information below.\n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Do not give me an answer if it is not mentioned in the context as a fact.\n"
    "Given this information, please provide me with an answer to the following:\n{query_str}\n"
)
Trajanov Risto

I think you need to use the ServiceContext, which allows responses to be served from that particular context.

Here is a piece of code that was developed using this as a reference.

import os
import pickle

from google.auth.transport.requests import Request

from google_auth_oauthlib.flow import InstalledAppFlow
from llama_index import (
    LLMPredictor,
    GPTVectorStoreIndex,
    PromptHelper,
    ServiceContext,
    download_loader,
)
from langchain import OpenAI
from colored import fg

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.WARN)
os.environ['OPENAI_API_KEY'] = 'xxxxxxxxxxxxxx'


def authorize_gdocs():
    google_oauth2_scopes = [
        "https://www.googleapis.com/auth/documents.readonly"
    ]
    cred = None
    if os.path.exists("token.pickle"):
        with open("token.pickle", 'rb') as token:
            cred = pickle.load(token)
    if not cred or not cred.valid:
        if cred and cred.expired and cred.refresh_token:
            cred.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", google_oauth2_scopes)
            cred = flow.run_local_server(port=0)
        with open("token.pickle", 'wb') as token:
            pickle.dump(cred, token)


if __name__ == '__main__':

    authorize_gdocs()
    GoogleDocsReader = download_loader('GoogleDocsReader')
    shailesh_doc = 'Some doc id'    # this doc has professional info of person named Shailesh
    pradeep_doc = 'Some doc id' # this doc has professional info of person named Pradeep
    gaurav_doc = 'Some doc id' # this doc has professional info of person named Gaurav
    gdoc_ids = [shailesh_doc, pradeep_doc, gaurav_doc]
    loader = GoogleDocsReader()
    documents = loader.load_data(document_ids=gdoc_ids)

    # define LLM and prompt sizing
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
    max_input_size = 4096   # LLM context window size
    num_output = 256        # tokens reserved for the generated answer
    max_chunk_overlap = 20  # token overlap between adjacent chunks
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

    index = GPTVectorStoreIndex.from_documents(
        documents, service_context=service_context
    )

    # GPTVectorStoreIndex is queried through a query engine
    query_engine = index.as_query_engine()
    while True:
        red = fg('red')
        print(red)
        prompt = input("Question: ")
        response = query_engine.query(prompt)
        green = fg('green')
        print(green + str(response))

Below is the output of the model when asked about people who don't exist in the context:

Question: Who is Obama?
Obama is not mentioned in the context information, so it is not possible to answer the question.

Question: Who is Narendra Modi?
Narendra Modi is not mentioned in the given context information, so it is not possible to answer the question.

Note: This works for me, but I am open to alternatives as well.

Gaurav Gupta
  • Thanks for the response. I used llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, max_tokens=512)) and service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor), but it still happily answers generic questions. – Ishan Hettiarachchi May 02 '23 at 08:57
  • Then I am not sure. For my use case, as the output shows, it does not answer generic questions. – Gaurav Gupta May 03 '23 at 09:16
  • Initially, for 2-3 weeks, it avoided out-of-context responses, but this week the behaviour suddenly changed and it answers generic questions too. – Yaser Sakkaf Jul 13 '23 at 15:24