Questions tagged [openaiembeddings]

24 questions
2
votes
0 answers

I have text data to perform sentiment analysis. With 3 classes I want to create embeddings and get centroids of the data. Any ideas?

I have text data to perform sentiment analysis. With three classes (-1,0,1) I would like to create embeddings and get the centroids of the data so new data can be assigned according to the centroids based on cosine similarity. Any ideas? I am trying…
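
One way to approach the centroid idea described in this question is sketched below. It is a minimal illustration, not the asker's code: the function names (`class_centroids`, `assign`) and the tiny 2-D vectors are assumptions standing in for real embeddings, which would come from whatever embedding model is in use.

```python
import math
from collections import defaultdict

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def class_centroids(embeddings, labels):
    """Average the unit-length embeddings of each class into one centroid."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(normalize(vec))
    centroids = {}
    for label, vecs in groups.items():
        dim = len(vecs[0])
        mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        centroids[label] = normalize(mean)
    return centroids

def assign(vec, centroids):
    """Return the label whose centroid is most similar to `vec`."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Toy 2-D "embeddings" (assumption: real ones would be much longer).
train = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0], [-0.9, -0.1]]
labels = [1, 1, 0, 0, -1, -1]
centroids = class_centroids(train, labels)
predicted = assign([0.8, 0.1], centroids)
```

Normalizing before averaging keeps long vectors from dominating a centroid; whether that is the right choice depends on the data.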
2
votes
1 answer

Limit tokens per minute in LangChain, using OpenAI-embeddings and Chroma vector store

I am looking for a way to limit the tokens per minute when saving embeddings in a Chroma vector store. Here is my code: [...] # split the documents into chunks text_splitter = CharacterTextSplitter(chunk_size=1500, chunk_overlap=0) texts =…
Heka
  • 73
  • 1
  • 8
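
A generic way to stay under a tokens-per-minute limit, independent of any particular library, is to group chunks into batches that fit the budget and pause between batches. The sketch below assumes a rough four-characters-per-token estimate and a caller-supplied `add_batch` callback (both hypothetical); it does not use LangChain or Chroma APIs.

```python
import time

def estimate_tokens(text):
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)

def batch_by_token_budget(texts, budget, count_tokens=estimate_tokens):
    """Yield lists of texts whose estimated token total stays within `budget`.

    A single text larger than the budget is still yielded, alone.
    """
    batch, used = [], 0
    for text in texts:
        n = count_tokens(text)
        if batch and used + n > budget:
            yield batch
            batch, used = [], 0
        batch.append(text)
        used += n
    if batch:
        yield batch

def add_with_rate_limit(texts, add_batch, budget,
                        window_seconds=60.0, sleep=time.sleep):
    """Send one batch per window, sleeping between consecutive batches."""
    for i, batch in enumerate(batch_by_token_budget(texts, budget)):
        if i > 0:
            sleep(window_seconds)
        add_batch(batch)

# Demonstration with a fake sleep so nothing actually waits.
sent, pauses = [], []
add_with_rate_limit(["a" * 40] * 5, sent.append, budget=25, sleep=pauses.append)
```

Waiting a full window between batches is deliberately conservative; a real token counter would give tighter batches than the character heuristic.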
1
vote
1 answer

LangChain / OpenAI issue with format of text file from webscraping causing API call to fail for "maximum context length"

I'm attempting to use Retrieval Augmented Generation using LangChain's TextLoader and CharacterTextSplitter. My source data is text data that I've scraped from a customer's website. When scraped without preprocessing, the data is dirty and…
Ja4H3ad
  • 51
  • 1
  • 6
1
vote
0 answers

Azure OpenAI embeddings Q&A with subset of files uploaded

I am new to Azure OpenAI and below is what I am trying to achieve. I want to use Azure OpenAI services for Q&A on a PDF file. I have gone through a few documents and blogs related to this and this is what I have understood so far. I can upload my…
1
vote
1 answer

How to add memory to load_qa_chain or How to implement ConversationalRetrievalChain with custom prompt with multiple inputs

I am trying to provide a custom prompt for doing Q&A in LangChain. I wasn't able to do that with ConversationalRetrievalChain, as it does not allow multiple custom inputs in a custom prompt. Hence, I used load_qa_chain, but with load_qa_chain, I…
Jason
  • 676
  • 1
  • 12
  • 34
1
vote
0 answers

How to substitute the OpenAiEmbeddings with Huggingface on Langchain?

const { HuggingFaceInferenceEmbeddings } = require('@huggingface/inference'); const embeddings = new HuggingFaceInferenceEmbeddings({ apiKey: process.env.HUGGINGFACEHUB_API_KEY, model: "hkunlp/instructor-large", }); vectorStore = await…
user42141
  • 33
  • 4
1
vote
1 answer

How can I add collections/object in Chroma database

I'm trying to run a few documents through OpenAI’s text embedding API and insert the resulting embeddings along with the text into a local Chroma database. sales_data = medium_data_split + yt_data_split sales_store = Chroma.from_documents( …
1
vote
2 answers

Using Embeddings API in Azure OpenAI

When I use embeddings with Azure OpenAI I am getting 404 (resource not found): EmbeddingsOptions embdOptions = new EmbeddingsOptions(text); Azure.AI.OpenAI.Embeddings response = Task.Run(() =>…
Leon
  • 165
  • 12
1
vote
1 answer

Embeddings and semantic search in Spanish

I'm building an AI assistant that interacts with custom Q&A stored in a vector database. Every example presents it as a very simple task of chunking documents (Q&A in this case), creating embeddings, storing them in a vector DB, and then querying when…
Cristian Sepulveda
  • 1,572
  • 1
  • 18
  • 25
1
vote
1 answer

ValueError: could not broadcast input array from shape (1536,) into shape (2000,)

I'm trying to create a Qdrant vector store and add my documents. My embeddings are based on OpenAIEmbeddings, the QdrantClient is local, and the collection I'm creating has the following VectorParams: VectorParams(size=2000,…
Evan P
  • 1,767
  • 1
  • 20
  • 37
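
The shapes in this error message, (1536,) and (2000,), suggest that the vectors being inserted are shorter than the size the collection was created with. A minimal sketch of one way to avoid that kind of mismatch is to derive the collection size from the embedding function itself and to check each vector before inserting it. The helper names and `fake_embed` are hypothetical stand-ins; no Qdrant or OpenAI calls are made here.

```python
def vector_size_for(embed, probe_text="dimension probe"):
    """Return the length of the vectors produced by `embed`."""
    return len(embed(probe_text))

def check_vector_size(vector, collection_size):
    """Raise a readable error if a vector does not fit the collection."""
    if len(vector) != collection_size:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but the collection expects {collection_size}"
        )
    return vector

def fake_embed(text):
    """Stand-in for a real embedding call (assumption: 1536-dim output)."""
    return [0.0] * 1536

# Use the measured size when creating the collection instead of a fixed number.
size = vector_size_for(fake_embed)
checked = check_vector_size(fake_embed("hello"), size)
```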
0
votes
1 answer

Call to a member function toArray() on array - predis laravel

I am trying to use Redis with Laravel to find similar vectors using OpenAI embeddings. I have an example in Python that looks like this: def search_similar_documents(self, entity_id, vector, topK=5): query = Query("*=>[KNN 2 @embedding $vec…
Danilo Toro
  • 569
  • 2
  • 15
0
votes
1 answer

Chatbot using CSV file

I am trying to create a chatbot using Azure Bot Service and Azure OpenAI. The data source is multiple CSV files. I am able to create embeddings using the LangChain Chroma extension. But while querying the embeddings I am not getting the correct…
0
votes
0 answers

Cap to Cracking and Chunking/Vector Embedding?

I am currently using the Azure Machine Learning Python SDK, following the incremental embedding tutorial that uses Ada 002:…
0
votes
2 answers

Customize word embeddings to your own vocabulary

I have a restaurant-related vocabulary in Spanish and I am using predefined Spanish word embeddings with FastText and BERT. However, I see that there are a lot of out-of-vocabulary (OOV) words that are not recognized by the predefined…
0
votes
2 answers

How does similarity_search_with_score() calculate the scores while retrieving the most similar document from embedding

I am trying to retrieve the most similar documents based on a question, and I am getting top_k = 5 docs. But how does similarity_search_with_score() calculate the scores while retrieving the most similar document from the embeddings? I want to know the…
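
What a returned score means generally depends on the vector store and the distance metric it is configured with: some report a similarity (higher is closer), others a distance (lower is closer). The sketch below is not the implementation of similarity_search_with_score(); it only illustrates two common metrics and how they relate for unit-length vectors, using made-up 2-D vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (higher means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def squared_l2_distance(a, b):
    """Squared Euclidean distance (lower means more similar)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Two unit-length example vectors (made up for illustration).
query = [1.0, 0.0]
doc = [0.6, 0.8]

similarity = cosine_similarity(query, doc)
distance = squared_l2_distance(query, doc)

# For unit vectors the two metrics rank results identically, because
# squared L2 distance equals 2 - 2 * cosine similarity.
identity_gap = abs(distance - (2 - 2 * similarity))
```

Checking which metric a store uses matters when applying a score threshold, since the direction of "better" differs between the two.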