Questions tagged [bert-language-model]

BERT, or Bidirectional Encoder Representations from Transformers, is a method of pre-training language representations which obtains state-of-the-art results on a wide array of Natural Language Processing (NLP) tasks. BERT uses Transformers (an attention mechanism that learns contextual relations between words or sub words in a text) to generate a language model.

The academic paper and the original implementation of BERT by Google are linked in the references below.

References

BERT paper: https://arxiv.org/abs/1810.04805

BERT implementation (Google Research): https://github.com/google-research/bert

1808 questions
80 votes · 9 answers

How to use BERT for long text classification?

We know that BERT has a maximum input length of 512 tokens. If an article is much longer than that, say 10,000 tokens, how can BERT be used?
user1337896 · 1,081 reputation · 1 gold, 10 silver, 15 bronze badges
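A common workaround, sketched below under the assumption of a Hugging Face classifier checkpoint (bert-base-uncased stands in for a fine-tuned model), is to split the document into overlapping 512-token windows with the tokenizer's stride/return_overflowing_tokens options and aggregate the per-chunk logits:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "bert-base-uncased" is a placeholder; use your own fine-tuned classifier.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
model.eval()

long_text = "a very long article " * 2000   # far more than 512 tokens

# Split into overlapping 512-token windows; stride controls the overlap.
enc = tokenizer(
    long_text,
    max_length=512,
    truncation=True,
    stride=128,
    return_overflowing_tokens=True,
    padding="max_length",
    return_tensors="pt",
)

with torch.no_grad():
    out = model(input_ids=enc["input_ids"], attention_mask=enc["attention_mask"])

# Aggregate the chunk-level logits, here by a simple mean over all chunks.
doc_logits = out.logits.mean(dim=0)
print(doc_logits.argmax().item())
```

Other approaches discussed for this problem include simple head/tail truncation, hierarchical pooling over chunk embeddings, or long-input models such as Longformer.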
44 votes · 9 answers

CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)

I got the following error when I ran my PyTorch deep learning model in Google Colab /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias) 1370 ret = torch.addmm(bias, input, weight.t()) 1371 …
Mr. NLP · 891 reputation · 1 gold, 8 silver, 20 bronze badges
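This CUBLAS error is frequently a masked indexing problem (for example a label id that is >= num_labels, or a token id outside the vocabulary). The sketch below, with hypothetical model and batch objects, shows the usual debugging steps of re-running on CPU and checking the label range:

```python
import os
import torch

# Set before CUDA is first used so kernel errors are reported synchronously.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

def debug_batch_on_cpu(model, batch):
    """Re-run one batch on CPU; bad indices then raise a readable IndexError
    instead of an opaque CUBLAS_STATUS_ALLOC_FAILED on the GPU."""
    cpu_model = model.to("cpu")
    cpu_batch = {k: v.to("cpu") for k, v in batch.items()}
    return cpu_model(**cpu_batch)

def check_label_range(batch, num_labels):
    """Labels must lie in [0, num_labels); a 1-based label column is a
    classic cause of this error."""
    labels = batch["labels"]
    assert labels.min().item() >= 0 and labels.max().item() < num_labels, (
        f"labels span [{labels.min().item()}, {labels.max().item()}] "
        f"but the model has num_labels={num_labels}"
    )
```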
42 votes · 2 answers

Why does the BERT transformer use the [CLS] token for classification instead of an average over all tokens?

I am running experiments on the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation and then pass it to other models for the downstream task. BERT's last layer looks like this…
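For context, both representations are easy to pull out of the last hidden state. A minimal sketch with bert-base-uncased (an assumed checkpoint) showing the [CLS] vector that BertForSequenceClassification feeds to its classifier head, next to a masked mean over all tokens:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

enc = tokenizer("BERT sentence representations", return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

last_hidden = out.last_hidden_state              # (1, seq_len, 768)

# Option 1: the [CLS] token (position 0), which attends to every other token
# and is what BertForSequenceClassification passes to its classifier head.
cls_vec = last_hidden[:, 0, :]

# Option 2: a masked mean over all token vectors.
mask = enc["attention_mask"].unsqueeze(-1)        # (1, seq_len, 1)
mean_vec = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)
```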
39 votes · 3 answers

dropout(): argument 'input' (position 1) must be Tensor, not str when using Bert with Huggingface

My code was working fine, but when I tried to run it today without changing anything I got the following error: dropout(): argument 'input' (position 1) must be Tensor, not str. I would appreciate any help. It could be an issue with the…
Tashinga Musanhu · 401 reputation · 1 gold, 4 silver, 3 bronze badges
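A frequent cause, sketched here rather than taken from the thread, is an upgrade to transformers 4.x, whose models return dict-like ModelOutput objects: tuple-unpacking such an output yields string keys, and the next layer (e.g. dropout) then receives a str. Either request plain tuples again or use the named fields:

```python
import torch.nn as nn
from transformers import AutoTokenizer, BertModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
drop = nn.Dropout(0.1)

enc = tokenizer("hello world", return_tensors="pt")

# Fix 1: ask for plain tuples, so the old `_, pooled = bert(...)` pattern works.
_, pooled = bert(**enc, return_dict=False)
x = drop(pooled)

# Fix 2: keep the ModelOutput and read the named attribute instead of unpacking.
pooled = bert(**enc).pooler_output
x = drop(pooled)
```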
38 votes · 5 answers

ValueError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]] - Tokenizing BERT / Distilbert Error

def split_data(path): df = pd.read_csv(path) return train_test_split(df , test_size=0.1, random_state=100) train, test = split_data(DATA_DIR) train_texts, train_labels = train['text'].to_list(), train['sentiment'].to_list() test_texts,…
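This error usually means the fast tokenizer received something that is not a plain string (typically NaN/None rows coming out of pandas). A small sketch, with a made-up DataFrame standing in for the CSV, of cleaning the column before tokenizing:

```python
import pandas as pd
from transformers import DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")

# Hypothetical frame standing in for the CSV from the question.
df = pd.DataFrame({"text": ["great movie", None, "terrible"], "sentiment": [1, 0, 0]})

# The fast tokenizer raises TextEncodeInput errors when the list contains
# NaN/None or non-string values, so clean and cast the column first.
df = df.dropna(subset=["text"])
texts = df["text"].astype(str).tolist()

enc = tokenizer(texts, truncation=True, padding=True)
```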
30 votes · 6 answers

How to cluster similar sentences using BERT

For ELMo, FastText and Word2Vec, I'm averaging the word embeddings within a sentence and using HDBSCAN/KMeans clustering to group similar sentences. A good example of the implementation can be seen in this short article:…
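One common approach, sketched below assuming the sentence-transformers package and the all-MiniLM-L6-v2 checkpoint, is to use a Sentence-BERT model (whose whole-sentence embeddings are trained to be directly comparable) and then cluster exactly as before:

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

sentences = [
    "The cat sits on the mat.",
    "A cat is sitting on a rug.",
    "Stock markets fell sharply today.",
    "Shares dropped on Wall Street.",
]

# Sentence-BERT models produce embeddings meant for cosine/Euclidean comparison,
# unlike a raw average of BERT token vectors.
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint name
embeddings = model.encode(sentences)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
print(list(zip(sentences, labels)))
```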
22 votes · 1 answer

How do the max_length, padding and truncation arguments work in HuggingFace's BertTokenizerFast.from_pretrained('bert-base-uncased')?

I am working on a text classification problem where I want to use the BERT model as the base followed by dense layers. I want to know how the three arguments work. For example, if I have 3 sentences such as: 'My name is slim shade and I am an aspiring…
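A quick sketch of the three arguments on toy sentences (bert-base-uncased assumed): truncation cuts to max_length, padding='max_length' pads up to max_length, and padding=True pads only to the longest sequence in the batch:

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

sentences = [
    "short one",
    "a slightly longer sentence",
    "the longest sentence of the three examples here",
]

# Every sequence is truncated and/or padded to exactly max_length tokens.
enc = tokenizer(sentences, max_length=12, truncation=True, padding="max_length")
print([len(ids) for ids in enc["input_ids"]])   # all 12

# padding=True ("longest") pads only up to the longest sequence in the batch.
enc2 = tokenizer(sentences, truncation=True, padding=True)
print([len(ids) for ids in enc2["input_ids"]])  # all equal to the longest one
```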
21 votes · 1 answer

PyTorch: RuntimeError: Input, output and indices must be on the current device

I am running a BERT model on torch. It's a multi-class sentiment classification task with about 30,000 rows. I have already put everything on CUDA, but I'm not sure why I'm getting the following runtime error. Here is my code: for epoch in…
Roy · 924 reputation · 1 gold, 6 silver, 17 bronze badges
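This error usually means the model is on the GPU while some input tensors are still on the CPU (or vice versa). A minimal sketch, with assumed checkpoint names, of moving both consistently:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased").to(device)

batch = tokenizer(["a sample sentence"], return_tensors="pt", padding=True)

# Moving the model is not enough: every tensor in the batch must be moved too,
# otherwise the embedding lookup mixes CPU indices with GPU weights.
batch = {k: v.to(device) for k, v in batch.items()}
outputs = model(**batch)
```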
21 votes · 1 answer

PyTorch BERT TypeError: forward() got an unexpected keyword argument 'labels'

I am training a BERT model using PyTorch transformers (following the tutorial here). The following statement in the tutorial, loss = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=b_labels), leads to TypeError: forward() got an…
PinkBanter · 1,686 reputation · 5 gold, 17 silver, 38 bronze badges
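The usual explanation is that the plain encoder (BertModel, or an old pytorch-pretrained-bert class) has no classification head and therefore no labels argument. A sketch with BertForSequenceClassification, which does accept labels and returns the loss:

```python
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# The *ForSequenceClassification model adds a classifier head; passing labels
# makes forward() also return the cross-entropy loss.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

enc = tokenizer(["an example sentence"], return_tensors="pt", padding=True)
labels = torch.tensor([1])

out = model(**enc, labels=labels)
print(out.loss, out.logits.shape)
```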
20 votes · 3 answers

Cased vs. uncased BERT models in spaCy and training data

I want to use spaCy's pretrained BERT model for text classification, but I'm a little confused about cased/uncased models. I read somewhere that cased models should only be used when there is a chance that letter casing will be helpful for the task.…
Oleg Ivanytskyi · 959 reputation · 2 gold, 12 silver, 28 bronze badges
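As a quick illustration (not tied to spaCy), the two Hugging Face tokenizers show the practical difference: the uncased model lowercases the text before WordPiece, so casing information is simply gone:

```python
from transformers import AutoTokenizer

cased = AutoTokenizer.from_pretrained("bert-base-cased")
uncased = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Apple shares rose while I ate an apple"

# The uncased tokenizer lowercases (and strips accents) first, so "Apple" and
# "apple" map to the same token; the cased tokenizer keeps them distinct.
print(cased.tokenize(text))
print(uncased.tokenize(text))
```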
18 votes · 1 answer

BertForSequenceClassification vs. BertForMultipleChoice for sentence multi-class classification

I'm working on a text classification problem (e.g. sentiment analysis), where I need to classify a text string into one of five classes. I just started using the Huggingface Transformer package and BERT with PyTorch. What I need is a classifier with…
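For single-label classification into five classes, the head that fits is BertForSequenceClassification with num_labels=5; BertForMultipleChoice is meant for choosing one of N candidate texts per example (SWAG-style). A minimal sketch with an assumed checkpoint:

```python
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Five sentiment classes -> a single-sequence classifier with num_labels=5.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)

enc = tokenizer(["the film was fine, nothing special"], return_tensors="pt")
out = model(**enc, labels=torch.tensor([2]))
print(out.logits.shape)   # (1, 5)
```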
18 votes · 6 answers

AttributeError: module 'torch' has no attribute '_six'. BERT model in PyTorch

I tried to load a pre-trained model using the BertModel class in PyTorch. I have _six.py under torch, but it still shows module 'torch' has no attribute '_six'. import torch from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM #…
Ruitong LIU · 181 reputation · 1 gold, 1 silver, 3 bronze badges
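pytorch_pretrained_bert is the unmaintained predecessor of transformers and reaches into torch internals (torch._six) that newer PyTorch versions removed; the usual fix is migrating the imports, which map almost one-to-one:

```python
# Replacement for `from pytorch_pretrained_bert import ...`:
from transformers import BertTokenizer, BertModel, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased")
```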
17 votes · 5 answers

PyTorch: IndexError: index out of range in self. How to solve?

This training code is based on the run_glue.py script found here: # Set the seed value all over the place to make this reproducible. seed_val =…
sylvester · 203 reputation · 1 gold, 2 silver, 7 bronze badges
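This IndexError comes from an embedding lookup receiving an index outside its table, most often because a sequence exceeds BERT's 512 position embeddings or because the tokenizer does not match the checkpoint. A sketch of the usual guard (assumed checkpoint name):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)   # must match the checkpoint,
model = AutoModelForSequenceClassification.from_pretrained(name)  # or ids exceed the vocab

long_text = "word " * 5000

# Without truncation the sequence exceeds the 512 position embeddings and the
# embedding layer raises "index out of range in self".
enc = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
out = model(**enc)
```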
16 votes · 5 answers

Transformer: Error importing packages. "ImportError: cannot import name 'SAVE_STATE_WARNING' from 'torch.optim.lr_scheduler'"

I am working on a machine learning project on Google Colab; it seems there has recently been an issue when trying to import packages from transformers. The error message says: ImportError: cannot import name 'SAVE_STATE_WARNING' from…
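SAVE_STATE_WARNING was a private symbol inside torch.optim.lr_scheduler that newer PyTorch releases dropped, so older transformers versions fail to import it; checking the installed pair and upgrading transformers (or pinning an older torch) is the usual fix. A tiny sketch:

```python
# Print the installed versions to confirm the torch/transformers mismatch,
# then upgrade transformers (e.g. `pip install -U transformers`) or pin torch.
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
```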
16 votes · 2 answers

Difficulty in understanding the tokenizer used in the RoBERTa model

from transformers import AutoModel, AutoTokenizer tokenizer1 = AutoTokenizer.from_pretrained("roberta-base") tokenizer2 = AutoTokenizer.from_pretrained("bert-base-cased") sequence = "A Titan RTX has 24GB of…
Mr. NLP · 891 reputation · 1 gold, 8 silver, 20 bronze badges
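For context, the two tokenizers use different algorithms, which is easy to see side by side: RoBERTa's byte-level BPE marks a preceding space with 'Ġ', while BERT's WordPiece marks word continuations with '##'. A minimal sketch on an example sentence of my own:

```python
from transformers import AutoTokenizer

roberta_tok = AutoTokenizer.from_pretrained("roberta-base")
bert_tok = AutoTokenizer.from_pretrained("bert-base-cased")

sequence = "Understanding subword tokenizers is surprisingly tricky"

# RoBERTa: byte-level BPE, leading "Ġ" marks a preceding space.
print(roberta_tok.tokenize(sequence))
# BERT: WordPiece, "##" marks a continuation of the previous piece.
print(bert_tok.tokenize(sequence))
```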