Questions tagged [large-language-model]

Use this tag for questions about large language models (LLMs): deep-learning models trained to interpret and generate natural-language text.

118 questions
5
votes
0 answers

Starcoder finetuning - How to select the GPU and how to estimate the time it will take to finetune

I'd like to finetune Starcoder (https://huggingface.co/bigcode/starcoder) on my dataset, on a GCP VM instance. The documentation says that for training the model they used 512 Tesla A100 GPUs and it took 24 days. I also saw the model…
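A back-of-the-envelope time estimate is usually the starting point: total training tokens divided by the throughput measured on the candidate GPU. A sketch, where every number is a placeholder to replace with your own measurements:

# Rough finetuning-time estimate; all numbers are assumptions to replace with your own.
dataset_tokens = 50_000_000        # tokens in the finetuning dataset
epochs = 3
tokens_per_sec_per_gpu = 3_000     # throughput measured on one GPU for your setup
num_gpus = 1

seconds = dataset_tokens * epochs / (tokens_per_sec_per_gpu * num_gpus)
print(f"~{seconds / 3600:.1f} hours (~{seconds / 86400:.2f} days)")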
5
votes
2 answers

Figuring out general specs for running LLM models

I have three questions: Given the number of LLM parameters in billions, how can you figure out how much GPU RAM you need to run the model? If you have enough CPU RAM (i.e. no GPU), can you run the model, even if it is slow? Can you run LLM models (like…
sten • 7,028 • 9 • 41 • 63
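A rough rule of thumb for the first question (a sketch, not an exact answer): the weights alone need about params × bytes-per-parameter, plus overhead for activations and the KV cache; running purely from CPU RAM works with the same arithmetic, only much slower.

# Illustrative RAM estimate for inference; the 20% overhead factor is an assumption.
def estimate_ram_gb(params_billions, bytes_per_param=2, overhead=1.2):
    # 2 bytes/param for fp16/bf16, 4 for fp32, ~0.5 for 4-bit quantization
    return params_billions * bytes_per_param * overhead

print(estimate_ram_gb(7))                       # ~16.8 GB for a 7B model in fp16
print(estimate_ram_gb(7, bytes_per_param=0.5))  # ~4.2 GB for the same model in 4-bit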
4
votes
1 answer

Difference between instruction tuning and non-instruction tuning of large language models

What is the difference between instruction tuning and normal fine-tuning for large language models? Also, the instruction tuning I'm referring to isn't the in-context/prompt one. All the recent papers about fine-tuning seem to be about instruction…
Flo • 51 • 1 • 4
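The practical difference is mostly in how training examples are formatted; a minimal illustration (the Alpaca-style template below is just one common convention, not the only one):

# Plain (causal) fine-tuning: the model simply continues raw domain text.
plain_example = "The refund policy allows returns within 30 days of purchase."

# Instruction tuning: each example is an (instruction, response) pair rendered into a prompt template.
instruction_example = (
    "### Instruction:\nSummarize our refund policy.\n\n"
    "### Response:\nReturns are accepted within 30 days of purchase."
)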
3
votes
4 answers

How to create a langchain doc from a str

I've searched all over the LangChain documentation on their official website, but I didn't find how to create a LangChain doc from a str variable in Python, so I searched their GitHub code and found this: doc=Document( …
Mohamed Amine • 340 • 1 • 4 • 16
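For reference, a minimal sketch of wrapping an in-memory string (the import path differs between LangChain versions):

from langchain.docstore.document import Document  # newer releases: from langchain_core.documents import Document

text = "Any string you already have in a Python variable."
doc = Document(page_content=text, metadata={"source": "in-memory-string"})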
3
votes
1 answer

Finetuning an LM vs prompt-engineering an LLM

Is it possible to finetune a much smaller language model like RoBERTa on, say, a customer service dataset and get results as good as one might get by prompting GPT-4 with parts of the dataset? Can a fine-tuned RoBERTa model learn to follow…
3
votes
1 answer

Comparing methods for a QA system on a 1,000-document Markdown dataset: Indexes and embeddings with GPT-4 vs. retraining GPT4ALL (or similar)

I am working on a project to build a question-answering system for a documentation portal containing over 1,000 Markdown documents, with each document consisting of approximately 2,000-4,000 tokens. I am considering the following two options: Using…
Vasil Remeniuk • 20,519 • 6 • 71 • 81
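Option 1 (indexes and embeddings with GPT-4) can be sketched with LangChain roughly as follows; the directory path, chunk sizes, and model choices are assumptions:

from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# Load and chunk the Markdown files, embed the chunks, then answer with retrieval + GPT-4.
docs = DirectoryLoader("docs/", glob="**/*.md", loader_cls=TextLoader).load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)
index = Chroma.from_documents(chunks, OpenAIEmbeddings())
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(model="gpt-4"), retriever=index.as_retriever())
print(qa.run("How do I configure feature X?"))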
3
votes
1 answer

How to compute sentence-level perplexity from Hugging Face language models?

I have a large collection of documents, each consisting of ~10 sentences. For each document, I wish to find the sentence that maximises perplexity, or equivalently the loss from a fine-tuned causal LM. I have decided to use Hugging Face and the…
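A minimal per-sentence sketch with a Hugging Face causal LM (plain GPT-2 stands in for the fine-tuned model): passing labels makes the model return the mean token-level cross-entropy, and exponentiating it gives perplexity.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_perplexity(sentence):
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss  # mean cross-entropy per token
    return torch.exp(loss).item()

doc = ["First sentence of the document.", "A second, more surprising sentence."]
print(max(doc, key=sentence_perplexity))  # sentence with the highest perplexity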
2
votes
1 answer

Backpropagation / minibatching in training large language models (LLMs)

I am struggling to understand how backprop works for transformer-based LLMs. Here is my guess of how this process works. Given a sequence of tokens with length 64, we process the sequence in parallel using teacher forcing (i.e., for each ACTUAL…
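That guess matches the usual setup; a small sketch of the shifted next-token loss in which all positions contribute to a single backward pass (random tensors stand in for a real model's logits):

import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 64, 50257
logits = torch.randn(batch, seq_len, vocab, requires_grad=True)  # one prediction per position
input_ids = torch.randint(0, vocab, (batch, seq_len))            # the ACTUAL tokens (teacher forcing)

# Position t predicts token t+1: drop the last logit and the first label.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
loss = F.cross_entropy(shift_logits.reshape(-1, vocab), shift_labels.reshape(-1))
loss.backward()  # one scalar; gradients flow through every position at once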
2
votes
0 answers

How to finetune an LLM on your own codebase?

I have 10 code repositories in JavaScript (VueJS); each repository corresponds to one theme. I want to train an LLM on these 10 code repositories so that I can generate new themes using prompts. The LLM should take the context of the 10 code…
Aadesh • 403 • 3 • 13
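A common first step (a sketch, assuming the repositories sit under a local repos/ directory) is to turn the source files into a text dataset that a causal code model can then be finetuned on, typically with a parameter-efficient method such as LoRA:

from pathlib import Path
from datasets import Dataset

records = []
for repo in Path("repos").iterdir():          # repos/theme-1, repos/theme-2, ...
    if repo.is_dir():
        for f in repo.rglob("*.vue"):
            records.append({"theme": repo.name, "text": f.read_text(encoding="utf-8")})

dataset = Dataset.from_list(records)
# tokenize this dataset and pass it to transformers' Trainer (or peft for LoRA) as usual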
2
votes
1 answer

How can I load scraped page content into langchain VectorstoreIndexCreator?

I have a function that goes to a URL and crawls its content (including subpages). Then I want to load the text content into langchain VectorstoreIndexCreator(). How can I do it via a loader? I could not find any suitable loader in langchain.document_loaders.…
PetrSevcik • 89 • 1 • 9
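If the text is already scraped, one option is to skip the loader entirely and wrap the crawled pages in Documents; crawl() below stands in for the asker's existing function and is assumed to return a {url: text} mapping:

from langchain.docstore.document import Document
from langchain.indexes import VectorstoreIndexCreator

pages = crawl("https://example.com")  # your own crawler
docs = [Document(page_content=text, metadata={"source": url}) for url, text in pages.items()]

index = VectorstoreIndexCreator().from_documents(docs)
print(index.query("What does the site say about pricing?"))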
2
votes
2 answers

In LangChain, why is ConversationalRetrievalChain not remembering the chat history and entering a new ConversationalRetrievalChain chain for each chat?

I am trying to create a customer support system using LangChain. I am using text documents as an external knowledge provider via TextLoader. In order to remember the chat, I am using ConversationalRetrievalChain with a list of chats. My problem is, each time…
RagAnt • 1,064 • 2 • 17 • 35
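The usual cause is rebuilding the chain on every request; a minimal sketch that keeps one memory-backed chain alive across turns (the knowledge-base file and models are assumptions):

from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

docs = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_documents(TextLoader("kb.txt").load())
vectorstore = Chroma.from_documents(docs, OpenAIEmbeddings())

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever(),
    memory=memory,
)

# Reuse the SAME chain object for every turn; recreating it per request discards the history.
print(qa({"question": "What is your refund policy?"})["answer"])
print(qa({"question": "Does that apply to sale items too?"})["answer"])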
2
votes
1 answer

How to use a cross-encoder with the Hugging Face transformers pipeline?

There's a set of models on the Hugging Face Hub that come from the sentence_transformers library, e.g. https://huggingface.co/cross-encoder/mmarco-mMiniLMv2-L12-H384-v1 The suggested usage examples are: # Using sentence_transformers from…
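Cross-encoder checkpoints are ordinary sequence-classification models, so they can also be scored with plain transformers by tokenizing (query, passage) pairs; a sketch:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cross-encoder/mmarco-mMiniLMv2-L12-H384-v1"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

query = "How many people live in Berlin?"
passages = ["Berlin has a population of about 3.7 million.", "Munich is in Bavaria."]

enc = tokenizer([query] * len(passages), passages, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    scores = model(**enc).logits.squeeze(-1)  # one relevance score per (query, passage) pair
print(scores.tolist())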
2
votes
1 answer

Further finetune a Peft/LoRA finetuned CausalLM Model

I am a bit unsure how to proceed regarding the mentioned topic. The baseline is a model created via Hugging Face's library as an AutoModelForCausalLM model, using PEFT and a LoRA approach with subsequent merging of the weights. I now want to further fine…
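One possible continuation (a sketch; the checkpoint path and hyperparameters are assumptions) is to load the merged model and attach a fresh LoRA adapter on top of it:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("path/to/model-with-merged-lora")  # previous adapter already merged in

config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adjust to the architecture's attention module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
# train this new adapter as usual, then merge again if desired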
2
votes
2 answers

Alpaca Large Language Model from Python script

I was able to install Alpaca under Linux and start and use it interactively via the corresponding ./chat command. However, I would like to run it not in interactive mode but from a Python (Jupyter) script with the prompt as a string parameter. Also,…
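This is not the ./chat binary itself, but a common route to the same weights from Python is the llama-cpp-python bindings, which take the prompt as a plain string (the model path below is an assumption):

from llama_cpp import Llama

llm = Llama(model_path="ggml-alpaca-7b-q4.bin")
out = llm("Explain the difference between a list and a tuple in Python.", max_tokens=200)
print(out["choices"][0]["text"])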
2
votes
0 answers

Training huggingface's GPT2 from scratch: how to implement the causal mask?

I am trying to train huggingface's implementation of the GPT2 model from scratch (meaning I am using their architecture but not the pre-trained weights), but I noticed by looking into the code here…
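For what it's worth, Hugging Face's GPT-2 blocks apply the causal mask internally, so a from-scratch run only needs a randomly initialised config; a quick sketch (tiny sizes chosen for illustration):

import torch
from transformers import GPT2Config, GPT2LMHeadModel

model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=128))  # random init, no pretrained weights
input_ids = torch.randint(0, model.config.vocab_size, (1, 16))

# labels=input_ids gives the usual shifted next-token loss;
# attention_mask is only needed for padding, not for causality.
loss = model(input_ids, labels=input_ids).loss
loss.backward()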