Questions tagged [llm]

A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMA, BLOOM, Claude, etc.).

A large language model is characterized by its large size. Its neural network, trained on AI accelerators, is able to process huge amounts of text data, usually scraped from the internet.

200 questions
5
votes
0 answers

How do I deploy a real-time Llama 2 endpoint on Azure?

I've been reading up a lot on open-source LLMs, and with the recent release of Llama 2 I have a question. Since Llama 2 is on Azure now, as a layman/newbie I want to know how I can actually deploy and use the model on Azure. I want to create a…
5
votes
0 answers

LangChain - Can't solve the dynamic filtering problem from the vectorstore

I am using LangChain version 0.218 and was wondering if anyone has been able to filter a seeded vectorstore dynamically at runtime, such as when it is run by an Agent. My motive is to put this dynamic filter in a Conversational Retrieval QA chain,…
5
votes
1 answer

In LangChain, how to save the verbose output to a variable?

I tried executing a LangChain agent. I want to save the verbose output into a variable, but all I can access from agent.run is the final answer. How can I save the verbose output to a variable so that I can use it later? My code: import…
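The pattern this question is after is a callback collector: instead of printing the verbose trace, a handler object receives each intermediate step and appends it to a list you can read afterwards. The sketch below is library-free and purely illustrative; `LogCaptureHandler` and `ToyAgent` are made-up stand-ins for a real LangChain callback handler and agent, not LangChain's actual API.

```python
# Illustrative sketch of the callback-collector pattern (NOT the real
# LangChain API): the handler captures "verbose" lines in a list instead
# of printing them, so they survive after run() returns.

class LogCaptureHandler:
    """Collects verbose log lines instead of printing them."""

    def __init__(self):
        self.lines = []

    def on_event(self, message: str) -> None:
        self.lines.append(message)


class ToyAgent:
    """Stand-in for an agent that emits intermediate steps."""

    def __init__(self, handler):
        self.handler = handler

    def run(self, question: str) -> str:
        self.handler.on_event(f"Entering chain with input: {question}")
        answer = question.upper()  # pretend reasoning step
        self.handler.on_event(f"Finished chain with output: {answer}")
        return answer


handler = LogCaptureHandler()
agent = ToyAgent(handler)
final = agent.run("hello")

print(final)          # the final answer, as before
print(handler.lines)  # the captured verbose output, now in a variable
```

In real LangChain the equivalent is passing a callback handler to the agent so the trace is delivered to your object rather than stdout; check the callbacks section of the docs for the exact class to subclass in your version.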
4
votes
1 answer

module 'chainlit' has no attribute 'langchain_factory'

I downloaded the repo https://github.com/menloparklab/falcon-langchain, created a virtualenv for it, installed requirements.txt, and ran the application using the following command: chainlit run app.py -w But…
4
votes
2 answers

Why does llama-index still require an OpenAI key when using Hugging Face local embedding model?

I am creating a very simple question-and-answer app based on documents using llama-index. Previously, I had it working with OpenAI. Now I want to try using no external APIs, so I'm trying the Hugging Face example in this link. It says in the example…
4
votes
0 answers

How to protect your GitHub code against being used for LLM training

The recent rise of LLMs trained on vast amounts of data, including open-source repositories, leads to a licensing question. Suppose you are OK with other human developers building on your code but do not want to allow the likes of OpenAI to use your code for…
3
votes
2 answers

Llama-2 7B-hf repeats context of question directly from input prompt, cuts off with newlines

Context: I am trying to query Llama-2 7B, taken from HuggingFace (meta-llama/Llama-2-7b-hf). I give it a question and context (I would guess anywhere from 200-1000 tokens), and ask it to answer the question based on the context (context is retrieved…
3
votes
0 answers

Why do I get an inconsistent memory error when loading Llama-2 from Hugging Face?

I'm playing around with the new Llama-2 7B model, running it on an M1 Pro Mac with 16 GB of RAM. If I load the model, Python crashes with a memory error - unless I load it via Hugging Face pipelines. I don't believe this to be a Hugging Face issue but rather something weird…
3
votes
0 answers

Dynamically add more embeddings of a new document in Chroma DB - LangChain

I have created a Retrieval QA Chain which uses Chroma DB as the vector DB for storing embeddings of the "abc.txt" file. What if I want to dynamically add document embeddings of, let's say, another file "def.txt"? How do I do that? I don't want to reload the…
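The general idea behind this question is that vector stores support incremental inserts: new embeddings are appended to the existing collection rather than rebuilding it. The sketch below is a toy, dependency-free illustration of that idea; `ToyVectorStore` and `toy_embed` are made-up stand-ins, though real Chroma (and LangChain's wrapper around it) exposes a similar add-texts/add-documents call for this.

```python
# Toy sketch of incremental vector-store updates (stand-in code, not the
# Chroma API): embeddings for "def.txt" are appended later without
# touching the embeddings already stored for "abc.txt".

class ToyVectorStore:
    def __init__(self, embed):
        self.embed = embed
        self.records = []  # (text, vector) pairs

    def add_texts(self, texts):
        """Append new documents without rebuilding the existing index."""
        for t in texts:
            self.records.append((t, self.embed(t)))

    def similarity_search(self, query, k=1):
        qv = self.embed(query)
        scored = sorted(
            self.records,
            key=lambda r: sum(a * b for a, b in zip(r[1], qv)),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]


def toy_embed(text):
    # Crude character-frequency "embedding", for demonstration only.
    return [text.count(c) for c in "abcdef"]


store = ToyVectorStore(toy_embed)
store.add_texts(["contents of abc.txt"])
store.add_texts(["contents of def.txt"])  # added later, no reload needed
print(len(store.records))  # 2
```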
3
votes
1 answer

How to improve/preprocess text (in special cases) so the embeddings and LLM will have better context?

I have been working on setting up local documents to be ingested into a vector DB and then used (as embeddings) as context for the LLM. The problem is that the local documents are very high level (see below for more details). After they are chunked with…
3
votes
1 answer

How does `enforce_stop_tokens` work in LangChain with Huggingface models?

When we look at HuggingFaceHub model usage in LangChain, there's a part where the author notes they don't know how to stop the generation, https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_pipeline.py#L182: class…
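What `enforce_stop_tokens` does, based on a reading of the linked source (verify against your installed version), is post-hoc truncation: the pipeline generates freely, and LangChain then splits the text at the first occurrence of any stop sequence and keeps only the prefix. A self-contained sketch of that behaviour, with `re.escape` added defensively in case a stop token contains regex metacharacters:

```python
import re

# Sketch of the enforce_stop_tokens behaviour: generation itself is not
# interrupted; the output text is cut at the first stop sequence after
# the fact. (Hedged reconstruction, not a verbatim copy of the LangChain
# source - check langchain/llms for the exact implementation.)

def enforce_stop_tokens(text: str, stop: list) -> str:
    """Truncate text at the first occurrence of any stop token."""
    # The stop tokens form a regex alternation, so escape them in case
    # they contain metacharacters like "." or "?".
    pattern = "|".join(re.escape(s) for s in stop)
    return re.split(pattern, text)[0]


out = enforce_stop_tokens("Answer: 42\nQuestion: what is", ["\nQuestion:"])
print(out)  # "Answer: 42"
```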
2
votes
1 answer

Backpropagation / minibatching in training large language models (LLMs)

I am struggling to understand how backprop works for transformer-based LLMs. Here is my guess of how this process works. Given a sequence of tokens with length 64, we process the sequence in parallel using teacher forcing (i.e., for each ACTUAL…
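The guess in the question is essentially the standard picture: with teacher forcing, every position predicts the next token from the ACTUAL ground-truth prefix, all positions are computed in one forward pass, the per-position cross-entropy losses are averaged, and a single backward pass propagates that averaged loss. A dependency-free sketch of the loss computation (the "model" is a toy that returns a uniform distribution, just to make the arithmetic concrete):

```python
import math

# Teacher-forcing loss for one sequence: at each position t the model
# predicts a distribution over the vocabulary given the true tokens
# 0..t, and the loss is the mean cross-entropy of the true next token.
# A real transformer computes all positions in one parallel forward
# pass; backprop then runs once on the averaged loss.

def toy_model(prefix, vocab_size=4):
    """Stand-in model: a uniform next-token distribution."""
    return [1.0 / vocab_size] * vocab_size


def teacher_forcing_loss(tokens, vocab_size=4):
    total = 0.0
    # Predict token t+1 from ground-truth tokens[0..t], for every t.
    for t in range(len(tokens) - 1):
        probs = toy_model(tokens[: t + 1], vocab_size)
        total += -math.log(probs[tokens[t + 1]])
    return total / (len(tokens) - 1)


seq = [0, 1, 2, 3, 1, 0]
loss = teacher_forcing_loss(seq)
print(round(loss, 4))  # uniform model: -log(1/4) ≈ 1.3863
```

Minibatching just stacks many such sequences: losses are averaged over all positions of all sequences in the batch before the single backward pass.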
2
votes
0 answers

GGML vs GPTQ vs bitsandbytes

What are the core differences between how GGML, GPTQ and bitsandbytes (NF4) do quantisation? Which will perform best on: a) Mac (I'm guessing ggml) b) Windows c) T4 GPU d) A100 GPU So far, I've run GPTQ and bitsandbytes NF4 on a T4 GPU and…
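Broadly, the three schemes differ in how scales and codebooks are chosen (GGML: blockwise quantisation aimed at CPU inference; GPTQ: one-shot, error-compensating GPU quantisation; bitsandbytes NF4: a 4-bit "NormalFloat" data type), but all compress weights to around 4 bits per value with per-block scaling. A minimal absmax round-trip sketch of that shared idea, not any one library's algorithm:

```python
# Blockwise absmax quantisation sketch: each block of weights is scaled
# by its largest absolute value and rounded to a small signed integer
# range. This is the shared core idea; GGML, GPTQ and NF4 each refine it
# differently (k-quant codebooks, error compensation, NormalFloat bins).

def quantize_block(weights, bits=4):
    """Map floats to signed ints in [-(2**(bits-1)-1), 2**(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0  # avoid scale=0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize_block(q, scale):
    return [v * scale for v in q]


block = [0.12, -0.40, 0.03, 0.25]
q, scale = quantize_block(block)
approx = dequantize_block(q, scale)
# Each reconstructed weight is within one quantisation step of the original.
print(q, [round(a, 3) for a in approx])
```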
2
votes
1 answer

How to connect an LLM to multiple SQL databases with LangChain SQLDatabaseChain

I want to connect an LLM to more than two SQL databases with SQLDatabaseChain. Please let me know if this is possible, or if there is another way. I have connected to one DB and it works, but in my case I need to connect to more than one DB and their tables.
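Since a chain like SQLDatabaseChain wraps a single database, a common workaround is to keep one connection (or one chain) per database and route each question to the right one. The sketch below is a toy router over two in-memory SQLite databases using only the standard library; `MultiDBRouter` is a made-up illustration, not a LangChain class. (With SQLite specifically, ATTACH DATABASE is another option for reaching several files through one connection.)

```python
import sqlite3

# Toy router pattern: hold one connection per database and dispatch each
# query by name. In a LangChain setup the analogous move is one
# SQLDatabaseChain per database plus routing logic in front of them.

class MultiDBRouter:
    def __init__(self):
        self.dbs = {}

    def add_database(self, name, conn):
        self.dbs[name] = conn

    def run(self, db_name, sql):
        """Execute sql against the named database and return all rows."""
        return self.dbs[db_name].execute(sql).fetchall()


sales = sqlite3.connect(":memory:")
sales.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
sales.execute("INSERT INTO orders VALUES (1, 9.5)")

hr = sqlite3.connect(":memory:")
hr.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
hr.execute("INSERT INTO employees VALUES (1, 'Ada')")

router = MultiDBRouter()
router.add_database("sales", sales)
router.add_database("hr", hr)

print(router.run("sales", "SELECT amount FROM orders"))  # [(9.5,)]
print(router.run("hr", "SELECT name FROM employees"))    # [('Ada',)]
```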
2
votes
0 answers

How do you add context to be passed along with agent.run in the ReAct LangChain framework?

I've previously built a PDF-searching tool in LangChain which uses the chain.run(input_documents=, question=) syntax to ask the model questions along with context from that PDF. I want to integrate this with the agents provided by LangChain. I am…