The goal is to run `python -m spacy train` with FP16 mixed precision so that large transformers (`roberta-large`, `albert-large`, etc.) fit into limited VRAM (RTX 2080ti, 11 GB).
The new spaCy 3 `project.yml` approach to training directly uses Hugging Face transformers models loaded via spacy-transformers v1.0. Hugging Face models can be run with mixed precision simply by adding the `--fp16` flag (as described here).
The spaCy config was generated with `python -m spacy init config --lang en --pipeline ner --optimize efficiency --gpu -F default.cfg` and checked for completeness with `python -m spacy init fill-config default.cfg config.cfg --diff`. Yet no FP16 / mixed-precision option is to be found.
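As a quick sanity check, the generated config can be scanned programmatically for any FP16 / mixed-precision key. This is a minimal sketch, not spaCy functionality; the file name `default.cfg` and the search terms are assumptions:

```python
from spacy.util import load_config

# Load the config produced by `spacy init config` (path is an assumption).
config = load_config("default.cfg")

def find_keys(section, needles=("fp16", "mixed", "precision"), prefix=""):
    """Recursively collect config keys whose name mentions FP16 / precision."""
    hits = []
    for key, value in section.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            hits.extend(find_keys(value, needles, path))
        elif any(needle in key.lower() for needle in needles):
            hits.append((path, value))
    return hits

print(find_keys(config) or "no FP16 / mixed-precision setting found")
```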
To reproduce
Use the spaCy project "Named Entity Recognition (WikiNER)" with the `init-config` command in `project.yml` changed to use a GPU and a transformer (`roberta-base` by default):
```yaml
commands:
  - name: init-config
    help: "Generate a transformer English NER config"
    script:
      - "python -m spacy init config --lang en --pipeline ner --gpu -F --optimize efficiency -C configs/${vars.config}.cfg"
```
What was tested
- Added `--fp16` to `python -m spacy project run`
- Added `--fp16` to `python -m spacy train`
- Added `fp16 = true` to `default.cfg` in various sections (`[components.transformer]`, `[components.transformer.model]`, `[training]`, `[initialize]`)
The logic was that Hugging Face transformers models are run in FP16 simply as:

```python
from transformers import TrainingArguments

# other arguments elided
TrainingArguments(..., fp16=True)
```
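For reference, `fp16=True` in the transformers Trainer is built on native PyTorch automatic mixed precision (`torch.cuda.amp`, available since torch 1.6, which is in the stack below). A minimal sketch of that underlying mechanism, with a toy model standing in for the transformer (illustrative only, not spaCy or spacy-transformers API):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Toy stand-in for a transformer; requires a CUDA device.
model = torch.nn.Linear(768, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = GradScaler()  # loss scaling keeps FP16 gradients from underflowing

for step in range(10):
    inputs = torch.randn(8, 768, device="cuda")
    targets = torch.randint(0, 2, (8,), device="cuda")

    optimizer.zero_grad()
    with autocast():  # forward pass runs in mixed precision
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)

    scaler.scale(loss).backward()  # scale the loss before backprop
    scaler.step(optimizer)
    scaler.update()
```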
SW stack specifics
- spacy 3.0.3
- spacy-transformers 1.0.1
- transformers 4.2.2
- torch 1.6.0+cu101