0

What am I missing to me they are essentially the same thing. One is my solution the other is Google's which both solve the same problem. However I keep noticing that the Google version is somehow twice as fast as mine. What am I missing?

#mimic dictionary

"""
    match each word in a string to a list of words that will follow it

    example:
        "hi my name is chris, hi chris, hi how are you"
        {hi: ["my","chris","how"], my: ["name"],...}

"""

def mimic_dict(str):
    list1 = str.split()
    dict = {}

    index, end = 0,  len(list1) - 1
    while index < end:
        current, next = list1[index], list1[index + 1]
        if not current in dict:
            dict[current] = [next]
        else:
            dict[current].append(next)
        index += 1 
    return dict

#google    
def mimic_dict1(text):
  mimic_dict = {}
  words = text.split()
  prev = ''
  for word in words:
    if not prev in mimic_dict:
      mimic_dict[prev] = [word]
    else:
      mimic_dict[prev].append(word)
    prev = word
  return mimic_dict
ivan_pozdeev
  • 33,874
  • 19
  • 107
  • 152
  • 4
    Just a quick guess - your version always looks up the element in the list, whereas the Google version stores the actual value, cutting out the need for looking in the list. – Will Richardson Oct 10 '18 at 06:03
  • Possible duplicate of [How can I profile Python code line-by-line?](https://stackoverflow.com/questions/3927628/how-can-i-profile-python-code-line-by-line) – ivan_pozdeev Oct 10 '18 at 06:03
  • you make more operation. try use google solution, but reverse words = words[::-1] before while – Evgeniy Belov Oct 10 '18 at 06:04
  • 2
    As an aside, you shouldn't name a variable `dict` since it is the name of a built-in function. If you try to call that function later, it will attempt to 'call' your variable. – Alex Taylor Oct 10 '18 at 06:05
  • 2
    Define fast. How many iterations have you tried. What was the characteristic of your input data? You are making two extra read calls but that shouldn't matter unless your strings are large (large enough to bring cache coherence in picture) and the data size is large – bashrc Oct 10 '18 at 06:06
  • 3
    Looping directly over a list is *much* faster than using a while-loop with an explicit increment (`i += 1`) and with indexing into the list. This is a well-known Python performance tip. The for-loop / list-iterator does this all implicitly in C-code. I wouldn't have expected twice as fast, though. Also, I'm pretty sure these solutions aren't equivalent, yours will never have a mapping from the `''` empty string to the first word, as far as I can tell – juanpa.arrivillaga Oct 10 '18 at 06:20
  • My own timings show a 3:2 ratio, not a 2:1 – juanpa.arrivillaga Oct 10 '18 at 06:24

0 Answers0