
I am trying to create a chatbot using Azure Bot Service and Azure OpenAI. The data source is multiple CSV files. I am able to create embeddings using the LangChain Chroma integration, but when I query the embeddings I do not get the correct answer.

However, if I use create_csv_agent from LangChain, I get the desired response. Is there any way to get the same results from the CSV embeddings? The main reason we can't use the csv_agent is that the source is a CSV file only for the current POC; later the source could be CSV, XLS, or PDF. We are trying to create a generic embedding process that handles all of these formats.

Is there anything special that needs to be done for the embedding process or retrieval process for CSV files?

Any pointer on this would be really helpful.

Thanks in advance.

  • What is the structure of your CSV, and what kind of data are you storing in it? – ZKS Aug 19 '23 at 14:57

1 Answer

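import os

from langchain.document_loaders import CSVLoader, PyMuPDFLoader, TextLoader
from langchain.embeddings import HuggingFaceEmbeddings

# Map each file extension to a loader class and its keyword arguments.
LOADER_MAPPING = {
    ".csv": (CSVLoader, {}),
    ".pdf": (PyMuPDFLoader, {}),
    ".txt": (TextLoader, {"encoding": "utf8"}),
}

file_path = "data.csv"  # placeholder path; replace with your own file
ext = os.path.splitext(file_path)[1].lower()

# Pick the loader that matches the extension and load the documents.
loader_class, loader_args = LOADER_MAPPING[ext]
loader = loader_class(file_path, **loader_args)
documents = loader.load()

# Embedding model used to index the loaded documents.
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')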
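
To see what actually gets embedded for a CSV file, here is a rough standard-library sketch of row-wise loading. As far as I know, CSVLoader produces one document per row made of "column: value" lines, and this imitates that idea without being the library's code. The helper name rows_to_documents, the dict layout, and the sample data are all made up for illustration.

```python
import csv
import io


def rows_to_documents(csv_text, source):
    """Hypothetical helper: turn each CSV row into one text chunk plus metadata."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row_number, row in enumerate(reader):
        # One "column: value" line per cell keeps the header next to each value.
        content = "\n".join(f"{column}: {value}" for column, value in row.items())
        documents.append(
            {"page_content": content, "metadata": {"source": source, "row": row_number}}
        )
    return documents


# Invented sample data, only to show the shape of the output.
sample = "name,city,sales\nAsha,Pune,120\nRavi,Delhi,95\n"
docs = rows_to_documents(sample, source="sales.csv")
print(docs[0]["page_content"])
```

If the rows are embedded one at a time like this, a similarity search can find an individual record, but a question that needs several rows at once (a total, a count, a comparison) probably cannot be answered from the few rows that are retrieved. That may be part of why create_csv_agent, which to my knowledge works on the whole dataframe, gives better answers on the same data.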
iVikashJha
  • I am able to embed multiple CSV files using Chroma and save the result, but the answers do not match. Note: the two dataframes have different shapes and sizes. – Anirban Banerjee Aug 28 '23 at 13:28