I am trying to configure the SUTime annotator (part of "ner") to use my own date/time rule files instead of the out-of-the-box rule files located in "models/sutime/" in the distribution JAR for the Stanford CoreNLP models.
I want to do this because I need to slightly modify what the SUTime rules do.
According to the official SUTime documentation, all it takes is setting the "sutime.rules" property to a comma-separated list of file paths.
But after I did that, CoreNLP still appears to load the out-of-the-box rule files:
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
I tried both absolute paths and paths relative to my project root, with the same result.
It appears that, contrary to the documentation, the "sutime.rules" property is simply being ignored.
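For concreteness, here is a minimal sketch of the kind of configuration described above. The class name and the rule file paths are hypothetical placeholders, not my real ones; only the "sutime.rules" property name and the "ner" annotator come from the description.

```java
import java.util.Properties;

// Sketch only: class name and file paths are illustrative assumptions.
class SutimeRulesConfig {

    // Builds pipeline properties that point "sutime.rules" at custom rule files.
    static Properties build(String... ruleFiles) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        // Per the SUTime documentation: a comma-separated list of rule file paths.
        props.setProperty("sutime.rules", String.join(",", ruleFiles));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(
                "rules/defs.sutime.txt",
                "rules/english.sutime.txt",
                "rules/english.holidays.sutime.txt");
        // Shows the value that the pipeline would receive.
        System.out.println(props.getProperty("sutime.rules"));
    }
}
```

These properties would then be passed to `new StanfordCoreNLP(props)`. With a setup like this, the log output still lists the bundled rule files shown above.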
Any help will be greatly appreciated.
UPDATE:
I also tried the following workaround:
- turning off SUTime as a part of the "ner" step
- copying its rule files and modifying them as necessary
- creating a custom annotator based on the TimeAnnotator class and adding it to the pipeline
- setting the .rules properties to the modified rule files
This does not work either.
The pipeline runs, but the functionality is not the same. The TimeAnnotator constructor needs to be invoked with the "sutime" parameter in order for it to behave exactly as it does when called from the "ner" step.
It seems this cannot be done via properties.
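A rough sketch of the properties side of that workaround, under some assumptions: the class name and the paths are hypothetical, "ner.useSUTime" is assumed to be the switch that disables SUTime inside "ner", and the rules property is assumed to be looked up under the annotator's name followed by ".rules".

```java
import java.util.Properties;

// Sketch only: class name, paths and property prefixes are assumptions.
class StandaloneSutimeConfig {

    // Properties for running "ner" without SUTime and configuring a
    // separately added time annotator registered under annotatorName.
    static Properties build(String annotatorName, String... ruleFiles) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        // Assumed switch for turning SUTime off inside the "ner" step.
        props.setProperty("ner.useSUTime", "false");
        // Assumed naming: rule files are read from "<annotatorName>.rules".
        props.setProperty(annotatorName + ".rules", String.join(",", ruleFiles));
        return props;
    }
}
```

With the name "sutime", the annotator would be created in code as `new TimeAnnotator("sutime", props)` and attached with `pipeline.addAnnotator(...)`. That programmatic step is the part I have not found a way to express through properties alone.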