The goal is to run `python -m spacy train` with FP16 mixed precision so that large transformers (`roberta-large`, `albert-large`, etc.) fit into limited VRAM (RTX 2080ti, 11 GB).
The new spaCy 3 `project.yml` approach to training directly uses Hugging Face transformers models loaded via spacy-transformers v1.0. Hugging Face models can be run with mixed precision simply by adding the `--fp16` flag (as described here).
The spaCy config was generated with `python -m spacy init config --lang en --pipeline ner --optimize efficiency --gpu -F default.cfg` and checked for completeness with `python -m spacy init fill-config default.cfg config.cfg --diff`. Yet no FP16 / mixed-precision option is to be found.
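As a quick sanity check, the generated config can be scanned programmatically for any FP16 / mixed-precision key. This is a minimal sketch, not spaCy functionality; the file name `default.cfg` and the search terms are assumptions:

```python
from spacy.util import load_config

# Load the config produced by `spacy init config` (path is an assumption).
config = load_config("default.cfg")

def find_keys(section, needles=("fp16", "mixed", "precision"), prefix=""):
    """Recursively collect config keys whose name mentions FP16 / precision."""
    hits = []
    for key, value in section.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            hits.extend(find_keys(value, needles, path))
        elif any(needle in key.lower() for needle in needles):
            hits.append((path, value))
    return hits

print(find_keys(config) or "no FP16 / mixed-precision setting found")
```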
To reproduce
Use the spaCy project "Named Entity Recognition (WikiNER)" with the `init-config` command in `project.yml` changed to use a GPU and a transformer (`roberta-base` by default):
```yaml
commands:
  - name: init-config
    help: "Generate a transformer English NER config"
    script:
      - "python -m spacy init config --lang en --pipeline ner --gpu -F --optimize efficiency -C configs/${vars.config}.cfg"
```
What was tested
- Added `--fp16` to `python -m spacy project run`
- Added `--fp16` to `python -m spacy train`
- Added `fp16 = true` to `default.cfg` in various sections (`[components.transformer]`, `[components.transformer.model]`, `[training]`, `[initialize]`)
The logic was that Hugging Face transformers models are run in FP16 simply as:

```python
from transformers import TrainingArguments

# other arguments elided
TrainingArguments(..., fp16=True)
```
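For reference, `fp16=True` in the transformers Trainer is built on native PyTorch automatic mixed precision (`torch.cuda.amp`, available since torch 1.6, which is in the stack below). A minimal sketch of that underlying mechanism, with a toy model standing in for the transformer (illustrative only, not spaCy or spacy-transformers API):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Toy stand-in for a transformer; requires a CUDA device.
model = torch.nn.Linear(768, 2).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = GradScaler()  # loss scaling keeps FP16 gradients from underflowing

for step in range(10):
    inputs = torch.randn(8, 768, device="cuda")
    targets = torch.randint(0, 2, (8,), device="cuda")

    optimizer.zero_grad()
    with autocast():  # forward pass runs in mixed precision
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)

    scaler.scale(loss).backward()  # scale the loss before backprop
    scaler.step(optimizer)
    scaler.update()
```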
SW stack specifics
- spacy 3.0.3
- spacy-transformers 1.0.1
- transformers 4.2.2
- torch 1.6.0+cu101