I am currently learning about NLP in Python and I am having trouble with some Python syntax.

cfd = nltk.ConditionalFreqDist( #create conditional freq dist
    (target, fileid[:4]) #create target (Y) and years (X)
    for fileid in inaugural.fileids() #loop through all fileids
    for w in inaugural.words(fileid) #loop through each word of each fileids 
    for target in ['america','citizen'] #loop through target
    if w.lower().startswith(target)) #if w.lower() starts with target words
cfd.plot() # plot it

I do not understand the purpose of line 2. I also do not understand why the loops don't end with ":" like other loops in Python.

Can someone explain this code to me? It works, but I do not fully understand its syntax.

Thank you

ClemHlrdt

1 Answer

The argument of nltk.ConditionalFreqDist is a generator expression.

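The syntax is the same as that of a list comprehension, apart from the enclosing brackets: the leading expression, (target, fileid[:4]), is the value produced on each iteration, and the for and if clauses that follow take no colons because they are parts of a single expression, not separate statements. We could have created a list with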

[(target, fileid[:4])  for fileid in inaugural.fileids()
                       for w in inaugural.words(fileid) 
                       for target in ['america','citizen'] 
                       if w.lower().startswith(target) ]

and passed it to the function, but a generator expression is more memory efficient, because the whole list does not have to be built before iterating over it. Instead, the (target, ...) tuples are generated one at a time as the function iterates over the generator object.
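
To make the correspondence with ordinary loops concrete, here is a minimal sketch that runs without NLTK. The toy_corpus dictionary is a made-up stand-in for the inaugural corpus, and pairs_loop is a hypothetical helper name used only for illustration; neither comes from the original code.

```python
# Hypothetical stand-in for the inaugural corpus: file id -> list of words.
toy_corpus = {
    "1789-Washington.txt": ["Fellow", "Citizens", "of", "America"],
    "1793-Washington.txt": ["American", "people", "and", "citizen"],
}
targets = ["america", "citizen"]

# Generator expression: the first line is the value produced each time,
# the remaining lines are for/if clauses (no colons, no indentation rules).
pairs_gen = (
    (target, fileid[:4])
    for fileid in toy_corpus
    for w in toy_corpus[fileid]
    for target in targets
    if w.lower().startswith(target)
)

# The same logic written as ordinary nested loops in a generator function.
def pairs_loop():
    for fileid in toy_corpus:
        for w in toy_corpus[fileid]:
            for target in targets:
                if w.lower().startswith(target):
                    yield (target, fileid[:4])

# Nothing is computed until the generators are consumed, here by list().
gen_result = list(pairs_gen)
loop_result = list(pairs_loop())

print(gen_result)
print(gen_result == loop_result)
```

The clauses of the generator expression read in the same order as the nested loops, from outermost to innermost, and the leading tuple plays the role of the yield statement.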

You can also have a look at "How exactly does a generator comprehension work?" for more information on generator expressions.

Thierry Lathuille