I am trying to train an SRL model for German text by translating the OntoNotes dataset and propagating the labels from the English sentences to the German sentences. When I train the model on this dataset, as well as on a manually annotated dataset, I seem to be stuck at a maximum F1 score of 0.62. I am using the deepset/gbert-large BERT model with a learning rate of 5e-5. I have updated the Ontonotes.py file to read the CoNLL-formatted files, and I checked the SRL frames to confirm the labels are being picked up correctly. Is there something else I need to take care of when training a model in a different language, or might it just be the low quality of the data that is causing the issue?
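For context, my label-propagation step is essentially the following sketch (simplified and illustrative, not my exact code; in the real pipeline the `alignment` dict comes from a word aligner run over the English/German sentence pairs):

```python
def project_labels(src_labels, alignment, tgt_len):
    """Project BIO-style SRL labels from English tokens onto German tokens.

    src_labels: list of BIO tags for the English sentence.
    alignment:  dict mapping English token index -> German token index
                (produced by a word aligner; one-to-one links only here).
    tgt_len:    number of tokens in the German sentence.
    """
    # German tokens with no aligned English token default to "O".
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment.items():
        tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels

# Example: English "She eats apples" aligned to German "Sie isst Äpfel",
# where the aligner swaps the first two tokens.
projected = project_labels(
    ["B-ARG0", "B-V", "B-ARG1"],  # English tags
    {0: 1, 1: 0, 2: 2},           # English idx -> German idx
    3,
)
print(projected)  # ['B-V', 'B-ARG0', 'B-ARG1']
```

As the sketch shows, unaligned German tokens silently become "O", and one-to-many alignments are not handled, which is one obvious place where projection noise could creep into my training data.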
I have also tried manually annotating German sentences for the SRL task, and even on such high-quality data the model does not perform as well as its equivalent BERT model does for English. Although the quality of the dataset created by translating and transferring labels might be low, does that alone explain a difference of 0.24 in F1 score?