for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print subtree3.leaves()

Using this code I am able to extract the leaves of the tree, which for one example are [('talking', 'VBG'), ('constantly', 'RB')]. That is correct. Now I want to convert these tree elements into a string or a list for further processing. How can I do that?

What I tried

for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print subtree3.leaves()
    fo.write(subtree3.leaves())
fo.close()

But it throws an error:

Traceback (most recent call last):
  File "C:\Python27\Association_verb_adverb.py", line 35, in <module>
    fo.write(subtree3.leaves())
TypeError: expected a character buffer object

I just want to store the leaves in a text file.

alvas
Salah
  • What is your input and what do you want as your output? Can you give an example? Also, there can be multiple layers in your top-most subtree, so the way you traverse the tree to print it depends on the output you need. – alvas Nov 24 '15 at 19:02

2 Answers

It depends on your version of NLTK and Python. I think you're referencing the Tree class in the nltk.tree module. If so, read on.

In your code:

  1. subtree3.leaves() returns a list of tuples, and
  2. fo is a Python file object, and fo.write() only accepts a str as its argument.

So you can write the tree leaves to the file with fo.write(str(subtree3.leaves())):

for subtree3 in tree.subtrees():
    if subtree3.label() == 'CLAUSE':
        print(subtree3)
        print subtree3.leaves()
        fo.write(str(subtree3.leaves()))
fo.flush()
fo.close()

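Calling close() already flushes the buffer, so the explicit flush() is optional; it only matters if you keep the file open and need the data on disk right away.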
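
For illustration, here is a minimal sketch of the same idea using a with block, which closes (and therefore flushes) the file on exit. The file name leaves.txt and the hard-coded leaves list are assumptions standing in for subtree3.leaves(); the sketch is written to run on Python 3 as well.

```python
import ast

# Stand-in for subtree3.leaves(); in the real code this comes from the NLTK tree.
leaves = [('talking', 'VBG'), ('constantly', 'RB')]

# 'leaves.txt' is a hypothetical output path.
with open('leaves.txt', 'w') as fo:
    # write() needs a string, so convert the list of tuples first.
    fo.write(str(leaves) + '\n')
# Leaving the with block closes the file, which also flushes the buffer.

# Read the line back and turn the text into a list of tuples again.
with open('leaves.txt') as fi:
    saved = fi.read()
restored = ast.literal_eval(saved.strip())
print(restored)
```

Storing the str() form is convenient mainly because ast.literal_eval can parse it back into a list; if another tool has to read the file, a plain delimited format is probably easier to work with.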

lguiel

The question may be more about writing a list of tuples to a file than about traversing the NLTK Tree object. See "NLTK: How do I traverse a noun phrase to return list of strings?" and "Unpacking a list / tuple of pairs into two lists / tuples".

To output a list of tuples of 2 strings, I find it useful to use this idiom:

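fout = open('outputfile', 'w')

listoftuples = [('talking', 'VBG'), ('constantly', 'RB')]
# Unzip the pairs into a tuple of words and a tuple of tags.
words, tags = zip(*listoftuples)

# One line: the words, a tab, then the tags.
fout.write(' '.join(words) + '\t' + ' '.join(tags) + '\n')
fout.close()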

But the zip(*list) code might not work if there are multiple levels in your subtrees.
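
If unpacking with zip is a concern, a sketch like the following writes one word and tag per line and never unpacks the whole list at once. It assumes every leaf is a (word, tag) tuple; the helper name format_leaves is made up for this example.

```python
def format_leaves(leaves):
    """Return one 'word<TAB>tag' line per (word, tag) leaf."""
    return ''.join('{}\t{}\n'.format(word, tag) for word, tag in leaves)


listoftuples = [('talking', 'VBG'), ('constantly', 'RB')]
text = format_leaves(listoftuples)

# The with block closes the file when it ends.
with open('outputfile', 'w') as fout:
    fout.write(text)
```

This version also copes with an empty list, where words, tags = zip(*listoftuples) raises a ValueError because there is nothing to unpack.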

Community
alvas
  • I'm glad the answer helps =) – alvas Nov 25 '15 at 13:23
  • @alvas If we have multiple levels in our subtrees, how could we write the tree into a txt file? Because, as you said, it's not possible to use `zip(*list)` in this case – joasa Aug 09 '17 at 12:55