I have to create a training data set for a named-entity recognition project.
For example, given the text
"Last year, I was in London where I saw Tom"
the training data should be
"Last year, I was in <ENAMEX TYPE="LOCATION">London</ENAMEX> where I saw <ENAMEX TYPE="NAME">Tom</ENAMEX>"
This is easy to do by hand, but it takes a lot of time when there is a large amount of data. I cannot use an open data set. I have a small training data set, but I need to extend it.
How can I create a larger training data set by extending the small one? Are there any ready-made packages or open-source projects for this? Or would you suggest a different method?
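To illustrate the kind of automation I mean, here is a minimal Python sketch of dictionary-based pre-labeling. It assumes a hypothetical lookup table (GAZETTEER) built from the entities already tagged in the small training set; the names GAZETTEER and annotate are made up for this example and do not come from any package. It only tags exact whole-word matches, so the output would still need a manual check.

```python
import re

# Hypothetical gazetteer: entity strings collected from the small,
# hand-labeled training set, mapped to their ENAMEX types.
GAZETTEER = {"London": "LOCATION", "Tom": "NAME"}


def annotate(text, gazetteer=GAZETTEER):
    """Wrap every whole-word gazetteer match in an ENAMEX tag."""
    if not gazetteer:
        return text
    # Try longer entries first so multi-word names win over their prefixes.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b"
    )

    def tag(match):
        name = match.group(0)
        return '<ENAMEX TYPE="{}">{}</ENAMEX>'.format(gazetteer[name], name)

    return pattern.sub(tag, text)


print(annotate("Last year, I was in London where I saw Tom"))
```

This cannot find entities that are missing from the dictionary and cannot resolve ambiguous words, which is why I am asking whether better tools or methods exist.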