I've been looking for a simple way to create an ordered dictionary and landed on OrderedDict and its update method. I implemented this successfully once, but now the result is not sorted across the added terms. For example, given these items:

      Doc1:  Alpha, zebra, top 
      Doc2:  Andres, tell, exta
      Output: Alpha, top, zebra, Andres, exta, tell
      Goal:   Alpha, Andres, ..., top, zebra

This is the code:

    import collections

    # io is presumably the asker's own helper module (openTempDic/saveTempDic),
    # not the standard-library io
    finalindex = collections.OrderedDict()
    ctr = 0
    while ctr < docCtr:
        filename = 'dictemp%d.csv' % (ctr,)
        ctr += 1
        dicTempList = io.openTempDic(filename)
        print(filename)
        for key in dicTempList:
            if key in finalindex:
                print(key)
                # append this document's value to the one already stored for key
                newvalue = finalindex[key] + "," + dicTempList.get(key)
                finalindex.update([(key, newvalue)])
            else:
                finalindex.update([(key, dicTempList.get(key))])
    io.saveTempDic(filename, finalindex)

Can someone please assist me?

Russia Must Remove Putin
KSM

2 Answers

An OrderedDict remembers the order in which its keys were inserted; it does not sort them. If you want it sorted, you need to sort the items before you create it. Here's how to build a sorted OrderedDict, an example taken from the docs:

    from collections import OrderedDict

    d = {'banana': 3, 'apple': 4, 'pear': 1, 'orange': 2}
    sorted_dict = OrderedDict(sorted(d.items(), key=lambda t: t[0]))

This also works when the source is another OrderedDict. The next example imports the module and references the class through it, which I find clearer for the reader, but the point is the same: to get a sorted result, sort the items before creating a new OrderedDict:

    import collections

    ordered_dict = collections.OrderedDict()
    ordered_dict['foo'] = 1
    ordered_dict['bar'] = 2
    ordered_dict['baz'] = 3
    sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                                 key=lambda t: t[0]))

and sorted_dict is now:

    OrderedDict([('bar', 2), ('baz', 3), ('foo', 1)])

If lambdas are confusing, you can use operator.itemgetter instead:

    import operator

    get_first = operator.itemgetter(0)
    sorted_dict = collections.OrderedDict(sorted(ordered_dict.items(),
                                                 key=get_first))
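The same key mechanism can order by value instead of by key. Here is a small sketch of that, assuming the foo/bar/baz data from above; the names get_value and by_value_desc are made up for illustration:

```python
import collections
import operator

ordered_dict = collections.OrderedDict([('foo', 1), ('bar', 2), ('baz', 3)])

# itemgetter(1) picks the value out of each (key, value) pair
get_value = operator.itemgetter(1)

# reverse=True puts the largest value first
by_value_desc = collections.OrderedDict(
    sorted(ordered_dict.items(), key=get_value, reverse=True))

print(list(by_value_desc.items()))
```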

I'm passing key arguments to show how they work in case you want to sort by something else, such as the values. Python already sorts tuples by first element, then second, and so on, and dict.items() yields (key, value) tuples (as a list in Python 2 and a view in Python 3). Since dictionary keys are unique, the comparison never gets past the key, so you can drop the key argument and get the same result:

    sorted_dict = collections.OrderedDict(sorted(ordered_dict.items()))
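Applied to the question, one possible approach is to merge everything into a plain dict first and sort only once at the end. This is a rough sketch, not the asker's actual code: doc1 and doc2 are hypothetical stand-ins for whatever io.openTempDic returns, and the values are invented.

```python
import collections

# Hypothetical stand-ins for the per-document dictionaries in the question
doc1 = {'Alpha': 'a1', 'zebra': 'z1', 'top': 't1'}
doc2 = {'Andres': 'a2', 'tell': 't2', 'exta': 'e2', 'top': 't9'}

merged = {}
for doc in (doc1, doc2):
    for key, value in doc.items():
        if key in merged:
            # key seen in an earlier document: append the new value
            merged[key] = merged[key] + "," + value
        else:
            merged[key] = value

# Sort once, after all documents are merged, then freeze that order
final_index = collections.OrderedDict(sorted(merged.items()))

print(list(final_index.keys()))
```

Note that the default string comparison puts uppercase letters before lowercase ones; passing something like key=lambda t: t[0].lower() to sorted would make the ordering case-insensitive if that matters for your data.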
Russia Must Remove Putin

An ordered dictionary is not a sorted dictionary.

From the documentation 8.3. collections — High-performance container datatypes:

OrderedDict: dict subclass that remembers the order entries were added

(emphasis mine)

The ordered dictionary is a hash-table-backed structure that also maintains a linked list alongside it, recording the order in which items were inserted. When the dictionary is iterated over, it follows that linked list.
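A quick way to see this behaviour, using made-up keys similar to the ones in the question: updating an existing key changes its value but not its position, and a new key always lands at the end.

```python
import collections

d = collections.OrderedDict()
d['zebra'] = 1
d['Alpha'] = 2
d['top'] = 3

d.update([('zebra', 10)])  # existing key: value changes, position stays
d['Andres'] = 4            # new key: appended at the end

print(list(d.items()))
```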

This type of structure is very useful for LRU caches, where one wants to keep only the N most recently requested items and evict the oldest one when a new item would push the cache over capacity.
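To make the LRU idea concrete, here is a minimal sketch built on OrderedDict. The LRUCache class and its method names are hypothetical, written only for illustration; it is not a standard-library class and it ignores concerns like thread safety.

```python
import collections


class LRUCache(object):
    """Hypothetical minimal LRU cache backed by an OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = collections.OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Remove and re-insert so the key becomes the most recent entry
        value = self._data.pop(key)
        self._data[key] = value
        return value

    def put(self, key, value):
        # Drop any existing entry first so re-insertion moves it to the end
        self._data.pop(key, None)
        self._data[key] = value
        if len(self._data) > self.capacity:
            # popitem(last=False) removes the oldest (first inserted) entry
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)


cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')     # 'a' is now the most recently used key
cache.put('c', 3)  # over capacity: the oldest key, 'b', is evicted
print(cache.keys())
```

The pop-and-reinsert step is used here so the sketch runs on both Python 2.7 and Python 3; on Python 3.2 and later, OrderedDict.move_to_end does the same job more directly.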

The code is working correctly: it preserves insertion order, which is all that OrderedDict promises.

Some explanation of the design philosophy behind this can be found at Why are there no containers sorted by insertion order in Python's standard libraries? It suggests that adding sorted structures would muddy the "one obvious way to do it" when choosing a container. Compare this with the many classes implementing Map, Set and List in Java: do you use a LinkedHashMap, a ConcurrentSkipListMap, a TreeMap, or a WeakHashMap?

Community