We would like to build a dictionary from the documentation of our company's products in order to establish a fixed terminology. To do this, we want to count how often specific words and phrases occur.
This could be solved in a couple of different ways, but what we would really like is an XSLT algorithm that recognizes phrases on its own, treating specific words that often occur together as a phrase (so that we don't have to specify all the phrases beforehand, along with all their variants with different conjugations, affixes, etc.).
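To make the idea concrete, here is a rough sketch of the kind of counting we have in mind. It is written in Python rather than XSLT, purely for illustration. The function name `count_ngrams`, the simple tokenizer, the `min_count` threshold, and the sample text are all placeholders we made up, and the sketch does not yet deal with conjugations or affixes (that would presumably need stemming or lemmatization on top).

```python
import re
from collections import Counter

def count_ngrams(text, n=2, min_count=2):
    """Count sequences of n adjacent words; keep those seen at least min_count times."""
    # Naive tokenizer (an assumption): lowercase, keep runs of letters, digits, apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    # Slide a window of n words over the token list.
    ngrams = zip(*(words[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in ngrams)
    # Only word sequences that repeat are phrase candidates.
    return {phrase: c for phrase, c in counts.items() if c >= min_count}

# Made-up sample text standing in for our product documentation.
sample = (
    "Open the control panel. The control panel shows the status. "
    "Close the control panel."
)
print(count_ngrams(sample))        # two-word candidates
print(count_ngrams(sample, n=3))   # three-word candidates
```

Note that this window slides across sentence boundaries and treats every repeated word sequence as a candidate, so a real solution would need some filtering (stop words, sentence splitting) before the results are usable as terminology.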
What do you think: can this task be done with XSLT, or should we look for other solutions?
If anyone has useful advice on how we should get started, I would be more than happy to hear your ideas and discuss them!