
I need to build a question-answering system for a specific domain, finance. I have a corpus of documents containing all the information about the field.

Can I fine-tune the pre-trained T5 model (large) with unsupervised training on these documents, so that it can answer related questions based on my corpus?
The corpus is quite large, so I cannot simply pass it as the context in T5's existing QA setup.

I am open to your suggestions!

1 Answer


What I found is that it is not really feasible to fine-tune the word embeddings of the T5 LLM. You can either pass a context along with the question or fine-tune the model on a QA dataset, but you cannot simply retrain the model on a specific domain such as finance, which was my case.
I ended up building the QA system with Haystack, an open-source library that provides a project architecture for building NLP QA systems on top of transformer models that you specify:
https://github.com/deepset-ai/haystack
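
To illustrate the general idea behind this kind of setup (retrieve a few relevant passages from the large corpus first, then give only those to the QA model as context), here is a minimal sketch in plain Python. It is not Haystack's API: the function names (`build_index`, `retrieve`, `build_reader_input`) and the sample passages are hypothetical, and a simple TF-IDF score stands in for whatever retriever you would actually use. The `question: ... context: ...` layout is the prompt format commonly used with T5 for SQuAD-style QA; the string it produces would be passed to a reader model, which is not included here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(passages):
    """Return per-passage term counts and a smoothed IDF table."""
    term_counts = [Counter(tokenize(p)) for p in passages]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    n = len(passages)
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return term_counts, idf


def retrieve(question, passages, index, top_k=2):
    """Rank passages by a TF-IDF score against the question terms."""
    term_counts, idf = index
    q_terms = tokenize(question)
    scored = []
    for i, counts in enumerate(term_counts):
        length = sum(counts.values()) or 1
        score = sum(counts[t] / length * idf.get(t, 0.0) for t in q_terms)
        scored.append((score, i))
    # Highest score first; ties broken by original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [passages[i] for score, i in scored[:top_k] if score > 0]


def build_reader_input(question, contexts):
    """Format the question and retrieved passages for a T5-style QA reader."""
    return f"question: {question} context: {' '.join(contexts)}"


# Hypothetical sample corpus; in practice these would be chunks of your documents.
passages = [
    "A bond coupon is the periodic interest payment made to the bondholder.",
    "Equity represents ownership in a company, usually held as shares.",
    "The yield to maturity is the total return expected on a bond held until it matures.",
]
question = "What is a bond coupon?"

index = build_index(passages)
top_passages = retrieve(question, passages, index, top_k=1)
print(build_reader_input(question, top_passages))
```

The point of the design is that the reader only ever sees a handful of short passages, so the size of the full corpus no longer matters for the context length.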

  • Your post is very useful and also relevant to mine: https://stackoverflow.com/questions/76373220/fine-tuning-a-pre-trained-llm-for-question-answering. I'd be very grateful if you could take a look at my post when you have the time and offer any advice you can. Thank you, Tom – Tom Bomer May 31 '23 at 12:06