I am trying to use max to output the key with the highest value in a Python CFD dictionary. I was led to believe by this website (https://www.hallada.net/2017/07/11/generating-random-poems-with-python.html) that max could be used to find the most frequent item in the CFD. However, I have found that it does not give correct results when the frequencies of the items in the CFD dictionary change.

I'm new to Python, and I think I may just be confused about how to access the data. I tried sorting the dictionary, believing I could get the values under each key sorted, but I don't think I quite understand how to do that either.

from collections import defaultdict

words = ('The quick brown fox jumped over the '
         'lazy the lazy the lazy dog and the quick cat').split(' ')

# cfd[w1][w2] counts how often w2 directly follows w1
cfd = defaultdict(lambda: defaultdict(lambda: 0))
for i in range(len(words) - 1):  # loop to the next-to-last word
    cfd[words[i].lower()][words[i+1].lower()] += 1

{k: dict(v) for k, v in dict(cfd).items()}
max(cfd['the'])

The most common word following "the" is "lazy". However, Python outputs the last word in the CFD dictionary, which is "quick".

wpercy
This question would be improved a lot if you defined the acronym CFD somewhere. For anyone else who's confused like me: CFD = Conditional Frequency Distribution – Blckknght Nov 06 '19 at 20:20

1 Answer

Your issue is that cfd['the'] is a dict, and when max iterates over it directly, it iterates over just the keys, not the counts. In that case, "quick" is greater than "lazy" because strings are compared lexicographically, and "q" sorts after "l".
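To see the key-only comparison in isolation, here is a minimal sketch. The dict following_the is a hypothetical stand-in for cfd['the'], with the counts hard-coded for illustration:

```python
# Hypothetical stand-in for cfd['the'] (counts hard-coded for illustration).
following_the = {'quick': 2, 'lazy': 3}

# Iterating a dict yields its keys, so max() never looks at the counts.
keys_seen = list(following_the)

# Strings compare lexicographically: 'q' sorts after 'l'.
string_comparison = 'quick' > 'lazy'

# The "largest" key is therefore the alphabetically last word.
keys_only_max = max(following_the)

print(keys_seen)
print(string_comparison)
print(keys_only_max)
```

So the result depends only on the spelling of the words, which is why changing the frequencies has no effect on it.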

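Change your max call so that it compares by count: max(cfd['the'].items(), key=lambda x: x[1]). This returns the (word, count) pair with the highest count, which here is ('lazy', 3).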
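Putting it together, a sketch of the whole thing might look like the following. It builds the CFD with zip instead of an index loop, which is my own substitution rather than the code from the question, and the names followers and most_common_word are illustrative:

```python
from collections import defaultdict

words = ('The quick brown fox jumped over the '
         'lazy the lazy the lazy dog and the quick cat').split(' ')

# cfd[w1][w2] counts how often w2 directly follows w1.
cfd = defaultdict(lambda: defaultdict(int))
for first, second in zip(words, words[1:]):  # every adjacent word pair
    cfd[first.lower()][second.lower()] += 1

followers = cfd['the']

# Compare (word, count) pairs by their count.
word, count = max(followers.items(), key=lambda x: x[1])

# Alternative if only the word is needed: compare keys by their counts.
most_common_word = max(followers, key=followers.get)

print(word, count)
print(most_common_word)
```

If two words are tied for the highest count, max returns whichever it encounters first, so the choice between them follows the dict's insertion order.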