Questions tagged [text-generation]
27 questions
3
votes
1 answer
How does `enforce_stop_tokens` work in LangChain with Huggingface models?
When we look at the HuggingFaceHub model usage in LangChain, there's a part where the author doesn't know how to stop the generation, https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_pipeline.py#L182:
class…
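For context, the utility in question is a small post-hoc filter: generation runs to its normal length and the text is then cut at the first occurrence of any stop sequence. A minimal sketch of that idea (the `re.escape` here is a defensive addition, not necessarily what LangChain ships):

```python
import re
from typing import List

def enforce_stop_tokens(text: str, stop: List[str]) -> str:
    # Cut the generated text at the first occurrence of any stop string.
    # re.escape guards against regex metacharacters in the stop strings.
    return re.split("|".join(re.escape(s) for s in stop), text)[0]

print(enforce_stop_tokens("Answer: 42\nQuestion: what next?", ["\nQuestion:"]))
# -> "Answer: 42"
```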

alvas
- 115,346
- 109
- 446
- 738
3
votes
3 answers
How does GPT-like transformers utilize only the decoder to do sequence generation?
I want to code a GPT-like transformer for a specific text generation task. GPT-like models use only the decoder block (in stacks) [1]. I know how to code all sub-modules of the decoder block shown below (from the embedding to the softmax layer) in…
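A minimal PyTorch sketch of one such decoder-only block, assuming pre-norm residuals (all names and sizes are illustrative). Versus the decoder of a full encoder-decoder, the only differences are the causal mask and the absence of cross-attention:

```python
import torch
import torch.nn as nn

class DecoderOnlyBlock(nn.Module):
    """One GPT-style block: masked self-attention + feed-forward with
    pre-norm residuals. No cross-attention, since there is no encoder."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        T = x.size(1)
        # Causal mask: True entries are blocked, so position t can only
        # attend to positions <= t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device),
                          diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        return x + self.ff(self.ln2(x))

x = torch.randn(2, 10, 256)            # (batch, seq_len, d_model)
print(DecoderOnlyBlock()(x).shape)     # torch.Size([2, 10, 256])
```

Stacking several of these between the token/position embeddings and a final linear-plus-softmax over the vocabulary gives the GPT-style architecture the question describes.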

mac179
- 1,540
- 1
- 14
- 24
2
votes
1 answer
How to refine a trained GPT-2 model?
I'm currently working on text generation with my own text. I have trained a GPT-2 model on my own text, but it gives random answers; for some questions it gives relevant answers. Is there a way to fine-tune it further, or can…
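One common pattern is to resume training from the saved checkpoint rather than from the base "gpt2" weights, typically with a lower learning rate. A sketch, assuming the first run was saved with save_pretrained() to a hypothetical ./my-gpt2-finetuned directory:

```python
import torch
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          Trainer, TrainingArguments)

ckpt = "./my-gpt2-finetuned"            # hypothetical path to the first run
model = GPT2LMHeadModel.from_pretrained(ckpt)
tokenizer = GPT2TokenizerFast.from_pretrained(ckpt)

class TextDataset(torch.utils.data.Dataset):
    """Wrap raw strings as a causal-LM dataset (labels = input_ids)."""
    def __init__(self, texts):
        self.enc = tokenizer(texts, truncation=True, max_length=128)
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        ids = torch.tensor(self.enc["input_ids"][i])
        return {"input_ids": ids, "labels": ids.clone()}

args = TrainingArguments(
    output_dir="./my-gpt2-finetuned-v2",
    num_train_epochs=1,
    learning_rate=1e-5,                 # lower than the first pass
    per_device_train_batch_size=1,      # sidesteps padding/collation concerns
)
trainer = Trainer(model=model, args=args,
                  train_dataset=TextDataset(["your additional text here"]))
trainer.train()
```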

Bhavani Priya
- 39
- 4
2
votes
1 answer
Fine-tuning a pre-trained LLM for question-answering
Objective
My goal is to fine-tune a pre-trained LLM on a dataset about Manchester United's (MU's) 2021/22 season (they had a poor season). I want to be able to prompt the fine-tuned model with questions such as "How can MU improve?", or "What are…
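A sketch of one common first step: cast each QA pair into an instruction-style prompt string and fine-tune a causal LM on the result. The template and the example pair below are illustrative only:

```python
def format_example(question: str, answer: str) -> str:
    # Illustrative template; the section markers are a convention,
    # not required by any library.
    return f"### Question:\n{question}\n\n### Answer:\n{answer}"

pairs = [
    ("How can MU improve?", "<reference answer from your dataset>"),
]
train_texts = [format_example(q, a) for q, a in pairs]
print(train_texts[0])
```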

Tom Bomer
- 83
- 7
2
votes
1 answer
Further fine-tune a PEFT/LoRA fine-tuned CausalLM model
I am a bit unsure how to proceed on this topic.
The baseline is a model created via Hugging Face's library as an AutoModelForCausalLM, using PEFT and a LoRA approach with subsequent merging of the weights.
I now want to further fine…
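One approach that fits this setup: treat the merged model as the new base and attach a fresh LoRA adapter on top of it. A sketch, with a hypothetical path; target_modules depends on the architecture (q_proj/v_proj fits LLaMA-style models):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical path to the model whose first LoRA was already merged
# into the base weights via merge_and_unload() + save_pretrained().
base = AutoModelForCausalLM.from_pretrained("./merged-model")

# Attach a fresh adapter; only the new LoRA matrices train, while the
# previously merged weights stay frozen.
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                    target_modules=["q_proj", "v_proj"],
                    task_type="CAUSAL_LM")
model = get_peft_model(base, config)
model.print_trainable_parameters()
```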

Julian Gerhard
- 86
- 1
- 4
2
votes
1 answer
How to save the gpt-2-simple model after training?
I trained the gpt-2-simple chatbot model but I am unable to save it. It's important for me to download the trained model from Colab, because otherwise I have to download the 355M model each time (see the code below).
I tried various methods to save the…
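For reference, gpt-2-simple writes its weights under checkpoint/<run_name> and ships Colab helpers to copy that folder to Google Drive and back. A sketch, assuming a Colab runtime and the default run name:

```python
import gpt_2_simple as gpt2

# After training, copy the checkpoint folder (checkpoint/run1 by default)
# to Google Drive so it survives the Colab runtime.
gpt2.mount_gdrive()
gpt2.copy_checkpoint_to_gdrive(run_name="run1")

# In a later session: pull the checkpoint back and load it directly,
# instead of retraining from the 355M base model.
gpt2.copy_checkpoint_from_gdrive(run_name="run1")
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="run1")
print(gpt2.generate(sess, run_name="run1", return_as_list=True)[0])
```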

argo
- 21
- 2
1
vote
0 answers
Huggingface Translate Pipe with custom BeamScorer
I want to generate a sentence from a machine translation model with constrained decoding that requires a custom BeamScorer. Is there a way to replace the standard BeamSearchScorer while using a high-level API such as the Translate pipeline, or…
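One escape hatch, sketched under the caveat that model.beam_search was the public low-level entry point in 2023-era transformers and its signature has shifted across versions: skip the pipeline, run the encoder yourself, and hand your own scorer subclass to beam_search. Model name and constraint are placeholders:

```python
import torch
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          BeamSearchScorer, MaxLengthCriteria,
                          StoppingCriteriaList)

tok = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")
num_beams = 4

class MyScorer(BeamSearchScorer):
    # Override process()/finalize() here to impose the custom constraint.
    pass

enc = tok("Machine translation is fun.", return_tensors="pt")
encoder_outputs = model.get_encoder()(**enc)
# Replicate the encoder states once per beam.
encoder_outputs.last_hidden_state = \
    encoder_outputs.last_hidden_state.repeat_interleave(num_beams, dim=0)

input_ids = torch.full((num_beams, 1),
                       model.config.decoder_start_token_id, dtype=torch.long)
scorer = MyScorer(batch_size=1, num_beams=num_beams, device=model.device)
out = model.beam_search(
    input_ids, scorer,
    stopping_criteria=StoppingCriteriaList([MaxLengthCriteria(max_length=64)]),
    encoder_outputs=encoder_outputs,
    attention_mask=enc.attention_mask.repeat_interleave(num_beams, dim=0),
)
print(tok.batch_decode(out, skip_special_tokens=True))
```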

Jindřich
- 10,270
- 2
- 23
- 44
0
votes
0 answers
How to prompt-engineer/provide context for an LLM (code generation)
I am building a natural-language-to-SQL application, and for that I'm using:
https://teknium-replit-v2-codeinstruct-3b.hf.space/
Using its API, I'm generating SQL code, but I don't know how to provide context or how to style the prompt template so that…
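Illustrative only: one way to pack context into the prompt is to place the database schema and a few-shot example ahead of the user's question, so the model sees what tables exist and what a valid answer looks like. Schema and example here are made up:

```python
PROMPT_TEMPLATE = """You are an assistant that writes SQLite queries.

Schema:
{schema}

Q: How many users signed up in 2022?
SQL: SELECT COUNT(*) FROM users WHERE strftime('%Y', signup_date) = '2022';

Q: {question}
SQL:"""

prompt = PROMPT_TEMPLATE.format(
    schema="CREATE TABLE users (id INTEGER, name TEXT, signup_date DATE);",
    question="List the names of all users.",
)
print(prompt)
```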

Adi A
- 1
0
votes
0 answers
Tokenizing large text datasets
I am trying to work on a text generation project. I downloaded the WikiBooks dataset from Kaggle:
https://www.kaggle.com/datasets/dhruvildave/wikibooks-dataset
When I try to create a dataset to tokenize the texts, my kernel crashes because it…
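A sketch of one way to avoid loading the whole corpus into RAM: stream the files with the datasets library and tokenize in batches, so nothing is materialized until you iterate. The "wikibooks.csv" file name and "body_text" column are placeholders for the actual Kaggle files/columns:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# streaming=True yields examples lazily instead of building the dataset
# in memory.
ds = load_dataset("csv", data_files="wikibooks.csv", streaming=True)["train"]

def tokenize(batch):
    return tok(batch["body_text"], truncation=True, max_length=512)

tokenized = ds.map(tokenize, batched=True)
for ex in tokenized.take(2):      # take() works on streaming datasets
    print(len(ex["input_ids"]))
```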
0
votes
0 answers
Generating Sentences with TRL while Maintaining Sentiment - Issue with "AutoModelForCausalLMWithValueHead"
I am currently working on generating sentences with TRL (Transformers Reinforcement Learning) while preserving the same sentiment as the sample sentences. However, I've come across an issue with the TRL code that uses…
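For reference, a minimal sketch of how the value-head wrapper is typically used for generation; TRL's wrapper adds a scalar value head for PPO training but delegates .generate() to the underlying LM, so ordinary sampling still works:

```python
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")

# Sampling behaves like a plain causal LM; the value head only matters
# during PPO updates.
inputs = tok("The movie was", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_k=50)
print(tok.decode(out[0]))
```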

user11849691
- 41
- 4
0
votes
0 answers
fastchat-t5-3b-v1.0 gives truncated/incomplete answers
I have used the following embedding models:
sentence-transformers/all-mpnet-base-v2
hkunlp/instructor-xl
to get embeddings:
def getEmbedding():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return…
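A plausible completion of this helper, assuming LangChain's HuggingFaceEmbeddings wrapper (the original return value is elided, so this is a guess at the intent). Note that truncated answers are usually a symptom of the generation side, e.g. a small max_new_tokens/max_length on the LLM, not of the embedding model:

```python
import torch
from langchain.embeddings import HuggingFaceEmbeddings

def getEmbedding():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # For hkunlp/instructor-xl, swap in HuggingFaceInstructEmbeddings.
    return HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2",
        model_kwargs={"device": device},
    )
```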

Mukilan
- 1
- 1
0
votes
1 answer
RNN input and output Shape
I’m trying to build an RNN with tf.keras to generate text. Let’s say I have 100 poems from Shakespeare with a max length of 50 words and I’m using 10k English words as my vocab dictionary. Thus, my input shape would be [100, 50, 10k] (by padding all…
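A sketch of the usual shape fix: feed integer token ids of shape (batch, max_len) and let an Embedding layer stand in for the explicit one-hot input, so the (100, 50, 10000) tensor is never materialized. Layer sizes here are illustrative:

```python
import tensorflow as tf

vocab_size, max_len = 10_000, 50

model = tf.keras.Sequential([
    # Input: integer ids, shape (batch, max_len); the embedding replaces
    # one-hot vectors.
    tf.keras.layers.Embedding(vocab_size, 128, input_length=max_len),
    tf.keras.layers.LSTM(256, return_sequences=True),
    # One softmax over the vocabulary per time step:
    # output shape (batch, max_len, vocab_size).
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.summary()
```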
0
votes
0 answers
ImportError: cannot import name 'multi_gpu_model' from 'tensorflow.keras.utils' in textgenrnn
I am trying to train a textgenrnn model in Python and save the weights. I have a txt file with a list of titles that I want to use. This is my code:
from textgenrnn import textgenrnn
t = textgenrnn()
t.train_from_file(r"filepath goes here",…
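For reference, multi_gpu_model was removed from tf.keras.utils in TensorFlow 2.4, and textgenrnn (last released in 2020) still imports it, hence the ImportError. A common workaround, not verified against every version combination, is pinning an older TensorFlow in a fresh environment:

```python
# In a fresh environment/Colab cell, before importing textgenrnn:
#   pip install "tensorflow==2.3.1" textgenrnn
from textgenrnn import textgenrnn

t = textgenrnn()
t.train_from_file("titles.txt", num_epochs=5)  # "titles.txt" is a placeholder
```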

Omar Morales Rivera
- 195
- 1
- 8
0
votes
0 answers
Expected scalar type Float but found Half when using Text Generation WebUI with Vicuna & monkey-patch
I am trying to fine-tune a Vicuna model using the text-generation-webui.
I followed these steps for install as shown in the documentation:
# Install miniconda
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" >…

Martin Becuwe
- 117
- 2
- 9
0
votes
0 answers
InfoGAN reproduction for Cyrillic letters generation
I am currently trying to reproduce the idea from the InfoGAN paper (https://arxiv.org/abs/1606.03657). I use a model setup close to the one they proposed in the paper for MNIST conditional digit generation. My problem is that I am trying to use this…
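For context, a sketch of InfoGAN's core trick: the generator input is the concatenation of unstructured noise z and a structured latent code c, and an auxiliary Q network is trained to recover c from G(z, c), maximizing a lower bound on the mutual information I(c; G(z, c)). The 62 noise dims follow the paper's MNIST setup; one categorical code per Cyrillic letter is an assumption:

```python
import torch
import torch.nn.functional as F

z_dim = 62
n_classes = 33     # e.g. the 33 letters of the Russian alphabet (assumption)

z = torch.randn(16, z_dim)
c = F.one_hot(torch.randint(0, n_classes, (16,)), n_classes).float()
gen_input = torch.cat([z, c], dim=1)   # shape: (16, z_dim + n_classes)
print(gen_input.shape)
```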

adrian
- 1
- 1