The point the guide makes is that multilingual models fine-tuned with the SetFit method generalize well even to languages they did not see during the SetFit fine-tuning process. This seems to be generally true of multilingual language models, but it is worth stating explicitly, particularly for SetFit, a method that typically works with very small datasets (i.e. datasets that may well not be multilingual).
The finding is supported by the paper mentioned in the guide, where the researchers show that a model fine-tuned on English data using SetFit performs well across a variety of languages (see Table 4).
My takeaway is this: if you take a multilingual checkpoint (e.g. sentence-transformers/paraphrase-multilingual-mpnet-base-v2) and fine-tune it on French, it will perform well on French, and it will probably also perform well on other languages. If you plan to use the fine-tuned model only on French texts, you can certainly try fine-tuning a French-specific model instead; however, it is certainly not true that you must do so.
That said, if a French-specific sentence transformer exists and you only intend to use the model on French texts, I would recommend the French model: not because you must use it, but because it may perform better than the multilingual one.