I have got this line of code from https://stevenloria.com/tf-idf/

scores = {word: tfidf(word, blob, bloblist) for word in blob.words}

Where the tfidf function is:

import math
from textblob import TextBlob as tb

def tf(word, blob):
    return blob.words.count(word) / len(blob.words)

def n_containing(word, bloblist):
    return sum(1 for blob in bloblist if word in blob.words)

def idf(word, bloblist):
    return math.log(len(bloblist) / (1 + n_containing(word, bloblist)))

def tfidf(word, blob, bloblist):
    return tf(word, blob) * idf(word, bloblist)

In order to better understand the process, I'd like to turn the shorthand loop into a regular-looking "for" loop.

Would this be correct?

scores = []
for word in blob.words:
    scores.append(tfidf(word, blob, bloblist))

Also, what's the advantage of writing short-form for loops?

Lucien S.
    The original line of code is a dictionary comprehension, not a list comprehension; you can tell by the curly braces and the `key:value` pair – NateTheGrate Sep 06 '18 at 13:22

2 Answers

Would this be correct?

That code will give you a list, but your original code produces a dictionary. Instead, try:

scores = {}
for word in blob.words:
    scores[word] = tfidf(word, blob, bloblist)
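
A quick way to see the difference is to run both loops on toy data. The sketch below is self-contained: `words` and `fake_score` are hypothetical stand-ins for `blob.words` and `tfidf(word, blob, bloblist)`, so it runs without textblob.

```python
# Hypothetical stand-ins so the example runs without textblob.
words = ["a", "cat", "sleeps", "a"]

def fake_score(word):
    # Placeholder for tfidf(word, blob, bloblist); it just returns the word length.
    return len(word)

# List version: one entry per occurrence, and the words themselves are not stored.
scores_list = []
for word in words:
    scores_list.append(fake_score(word))

# Dict version: one entry per distinct word, keyed by that word.
scores_dict = {}
for word in words:
    scores_dict[word] = fake_score(word)

print(scores_list)  # [1, 3, 6, 1]
print(scores_dict)  # {'a': 1, 'cat': 3, 'sleeps': 6}
```

With the list you would have to keep track of which score belongs to which word yourself; the dictionary keeps that mapping for you.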

Also, what's the advantage of writing short-form for loops?

The primary advantage of list comprehensions (and dict comprehensions, etc.) is that they are shorter than their long-form equivalents.
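
As a small illustration, the sketch below builds the same dictionary both ways and checks that the results match. As before, `words` and `fake_score` are made-up stand-ins for `blob.words` and `tfidf`, not part of the original code.

```python
# Hypothetical stand-ins so the example runs without textblob.
words = ["a", "cat", "sleeps", "a"]

def fake_score(word):
    # Placeholder for tfidf(word, blob, bloblist).
    return len(word)

# Short form: a dict comprehension, one expression on one line.
short = {word: fake_score(word) for word in words}

# Long form: create an empty dict, then fill it key by key.
long_form = {}
for word in words:
    long_form[word] = fake_score(word)

# Both approaches produce the same dictionary.
assert short == long_form
print(short)
```

Because a comprehension is a single expression, it can also be used directly where a value is expected, for example as a function argument, which the long form cannot do without a temporary variable.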

Kevin

These are called comprehensions. List comprehensions are the most common kind, although this one is a dictionary comprehension.

They are often simpler and more readable, but if you find them confusing, you can write the loop out the way you did, with one difference: `scores` must start as a dictionary, and each score is stored under its word:

scores = {}
for word in blob.words:
    scores[word] = tfidf(word, blob, bloblist)

Josh Friedlander