I am new to text mining. I have a CSV file, and I need to go through each line, extract some information, and write it to another CSV file. The information I am looking for is identified by keywords that I keep in a dictionary. Consider the sentence below:
"the application version is 1.8.2 and the variable skt.len passes the required information. file ReadMe.txt has the specifications."
My dictionary is: ["application version", "variable", "file"]
I need to extract:
- application version: 1.8.2
- variable: skt.len
- file: ReadMe.txt
What is the best way to extract such information from text? I have been experimenting with NLTK and StanfordCoreNLP features, but I have not been able to extract the information yet. I am thinking of using a regex to extract the application version. Any ideas?
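To make the regex idea concrete, here is a minimal sketch of what I have in mind. It assumes the value is always the first whitespace-delimited token after the keyword, optionally preceded by "is"; that assumption and the function name `extract` are only for illustration and will not hold for every sentence.

```python
import re

# Keywords from my dictionary.
KEYWORDS = ["application version", "variable", "file"]


def extract(sentence, keywords=KEYWORDS):
    """Return {keyword: value} for each keyword found in the sentence.

    Assumption: the value is the first non-whitespace token that follows
    the keyword, optionally after the word "is".
    """
    found = {}
    for kw in keywords:
        pattern = r"\b" + re.escape(kw) + r"\s+(?:is\s+)?(\S+)"
        match = re.search(pattern, sentence, flags=re.IGNORECASE)
        if match:
            # Drop sentence punctuation stuck to the end of the token.
            found[kw] = match.group(1).rstrip(".,;:")
    return found


sentence = (
    "the application version is 1.8.2 and the variable skt.len passes "
    "the required information. file ReadMe.txt has the specifications."
)
result = extract(sentence)
# result == {"application version": "1.8.2",
#            "variable": "skt.len",
#            "file": "ReadMe.txt"}
```

This works on the example sentence, but it would pick up the wrong token whenever other words sit between the keyword and its value.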
PS: I know this makes the task more complicated, but the sentences in each line of the CSV file may have different structures. For example, "application version" in one line may be "app version" in another, or "file" in one line may be "filename" in another.
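One way I could imagine handling the variations is to map each field to a list of alternative spellings and build one pattern per field. The sketch below is only an assumption about how that might look: the `ALIASES` table, the helper names, and the idea that the text sits in the first CSV column are all hypothetical, and it still assumes the value is the token right after the keyword.

```python
import csv
import re

# Hypothetical alias table: canonical field -> spellings seen in the data.
ALIASES = {
    "application version": ["application version", "app version"],
    "variable": ["variable"],
    "file": ["file", "filename"],
}


def build_pattern(names):
    # Try longer spellings first so "filename" is preferred over "file".
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in ordered)
    return re.compile(
        r"\b(?:" + alternatives + r")\b\s+(?:is\s+)?(\S+)", re.IGNORECASE
    )


PATTERNS = {field: build_pattern(names) for field, names in ALIASES.items()}


def extract_row(text):
    """Return one output row; fields that are not found stay empty."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        row[field] = match.group(1).rstrip(".,;:") if match else ""
    return row


def convert(in_file, out_file, text_column=0):
    """Read sentences from one CSV file object and write the fields to another."""
    writer = csv.DictWriter(out_file, fieldnames=list(ALIASES))
    writer.writeheader()
    for record in csv.reader(in_file):
        if record:  # skip blank lines
            writer.writerow(extract_row(record[text_column]))
```

It could be called as `convert(open("in.csv", newline=""), open("out.csv", "w", newline=""))`, where both file names are placeholders. Every new spelling still has to be added to the table by hand, which is the part I would like to avoid.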