I have a few sentences, like:
the film was nice.
leonardo is great.
it was academy award.
Now I want to tag them with a standard tag set, so that they look like:
the DT film NN was AV nice ADJ
leonardo NN is AV great ADJ
it PRP was AV academy NN award NN
I can do that, but my goal is to get the result as:
[[('the','DT'),('film','NN'),('was','AV'),('nice','ADJ')],
 [('leonardo','NN'),('is','AV'),('great','ADJ')],
 [('it','PRP'),('was','AV'),('academy','NN'),('award','NN')]]
That is, a list of lists, where each inner list holds the (word, tag) tuples for one sentence. I can get as far as a single list of tuples, but not the nested list of lists. I wrote the code below:
def entity_tag():
    a1=open("/python27/EntityString.txt","r")
    a2=a1.read().lower().split()
    print "The Original String in List form for Comparison:",a2
    a3=open("/python27/EntityDict1.txt","r")
    a4=a3.read().split()
    list1=[]
    list2=[]
    for word in a2:
        if word in a4:
            windex=a4.index(word)
            windex1=windex+1
            word1=a4[windex1]
            word2=word+" "+word1+"$"
            list1.append(word2)
        elif word not in a4:
            word3=word+" "+"NA"+"$"
            list1.append(word3)
        else:
            print "None"
    print list1
    string1=" ".join(list1)
    print string1
    stringw=string1.split("$")
    print stringw
    for subword in stringw:
        #print subword
        subword1=subword.split()
        #print subword1
        subwordt=tuple(subword1)
        #print subwordt
        list2.append(subwordt)
    print "The Tagged Word in list:",list2
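A note on the code above: it encodes each pair as a string ending in "$" and then splits it back apart. Because split("$") returns a trailing empty string, that round trip also leaves an empty tuple at the end of list2. A minimal sketch of building the tuples directly is shown here. The name `tag_words` and the sample data are hypothetical, and it assumes, as the `windex+1` lookup suggests, that the dictionary file alternates word and tag tokens.

```python
def tag_words(words, dict_tokens, default="NA"):
    """Return [(word, tag), ...] for a flat list of words."""
    tagged = []
    for word in words:
        if word in dict_tokens:
            # Assumption: the tag is the token right after the word.
            tag = dict_tokens[dict_tokens.index(word) + 1]
        else:
            # Unknown words get the fallback tag, as in the original code.
            tag = default
        tagged.append((word, tag))
    return tagged

# Made-up dictionary tokens in the assumed "word TAG word TAG" layout.
dict_tokens = "the DT film NN was AV nice ADJ".split()
flat = tag_words("the film was nice".split(), dict_tokens)
```

This still yields one flat list, but without the string joining and splitting, and without the stray empty tuple.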
Since this is PoS tagging, I have not been able to use zip. Could anyone please suggest an approach?
I am using Python 2.7.11 on MS Windows 10.
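For reference, here is a rough sketch of the nested structure I am after. It assumes one sentence per line in the input (an assumption, since calling read().split() on the whole file discards sentence boundaries) and a plain dict for the tags. The name `tag_sentences` and the sample data are made up for illustration. It avoids print statements, so it should run on Python 2.7 as well as 3.

```python
def tag_sentences(text, tag_dict, default="NA"):
    """Return one list of (word, tag) tuples per non-empty line of text."""
    tagged = []
    for line in text.lower().splitlines():
        # Strip surrounding punctuation from each word and drop empty leftovers.
        words = [w.strip(".,!?") for w in line.split()]
        words = [w for w in words if w]
        if words:
            # Each sentence gets its own inner list.
            tagged.append([(w, tag_dict.get(w, default)) for w in words])
    return tagged

text = "the film was nice.\nleonardo is great.\nit was academy award."
tag_dict = {"the": "DT", "film": "NN", "was": "AV", "nice": "ADJ",
            "leonardo": "NN", "is": "AV", "great": "ADJ",
            "it": "PRP", "academy": "NN", "award": "NN"}
result = tag_sentences(text, tag_dict)
```

Keeping the per-line loop on the outside is what produces the list of lists; tagging the whole file as one word list can only give a single flat list.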