I have the following three strings (they exist independently but are displayed here together for convenience):

from mx2.x.org (mx2.x.org. [198.186.238.144])
            by mx.google.com with ESMTPS id g34si6312040qgg.122.2015.04.22.14.49.15
            (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
            Wed, 22 Apr 2015 14:49:16 -0700 (PDT)

from HQPAMAIL08.x.org (10.64.17.33) by HQPAMAIL13.x.x.org
 (10.34.25.11) with Microsoft SMTP Server (TLS) id 14.2.347.0; Wed, 22 Apr
 2015 17:49:13 -0400

from HQPAMAIL13.x.org ([fe80::7844:1f34:e8b2:e526]) by
 HQPAMAIL08.iadb.org ([fe80::20b5:b1cb:9c01:aa86%18]) with mapi id
 14.02.0387.000; Wed, 22 Apr 2015 17:49:12 -0400

I'm looking to populate a dict with some values based on the reversed (bottom to top) order of the strings. Specifically, for each string, I'm extracting the IP address as an index of sorts, and then the full string as the value.

Given that order is important, I decided to go with lists, and initially did something like this (pseudocode, with the above bunch of text):

IPs = []
fullstrings = []
for string in strings:
    IPs.append($theIpAddressFoundInTheString)  # placeholder: the IP extracted from this string
    fullstrings.append(string)                 # the whole string, unchanged

resulting in the following two lists (again, just an illustration):

IPs ['198.186.238.144', '10.64.17.33', 'fe80::7844:1f34:e8b2:e526']

fullstrings ['from mx2.x.org (mx2.x.org. [198.186.238.144])
                by mx.google.com with ESMTPS id g34si6312040qgg.122.2015.04.22.14.49.15
                (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
                Wed, 22 Apr 2015 14:49:16 -0700 (PDT)', 'from HQPAMAIL08.x.org (10.64.17.33) by HQPAMAIL13.x.x.org
     (10.34.25.11) with Microsoft SMTP Server (TLS) id 14.2.347.0; Wed, 22 Apr
     2015 17:49:13 -0400', 'from HQPAMAIL13.x.org ([fe80::7844:1f34:e8b2:e526]) by
     HQPAMAIL08.x.org ([fe80::20b5:b1cb:9c01:aa86%18]) with mapi id
     14.02.0387.000; Wed, 22 Apr 2015 17:49:12 -0400']

This worked fine up to a point, but now that I've begun populating a dict with values from these lists (at hardcoded indices), comparing against values in other lists (again at hardcoded indices), and so on, debugging has become a pain and the code has become unmaintainable.

I'm beginning to rewrite using a dict (returning a dict where the IP addresses are the keys and the full strings are the values). Then I will perform operations like:

for k, v in myDictOfIpsAndStrings.items():
    anotherDict[$someHardcodedText] = k
    anotherDict[$otherHardcodedText] = v

Here's my concern: can I be sure that the dict, any time it is iterated over, will always be done in the order in which the dict was created? If not, is my only option to revert to lists (with the tedious and brittle length comparisons and index assignments that entails)?

I know that a dict is, by its very nature, unsorted. And I know of the sorted function, but I'm not looking to sort the keys in ascending or descending order; rather, it's about maintaining (somehow) the order in which the dict was created.

Pyderman
  • Use a `collections.OrderedDict`! – Ry- Jun 11 '15 at 17:08
  • Your question is unclear. What order do you expect to preserve? The order in which you inserted elements? Or the order in which you iterate to not change when you iterate more than once? There is no preservation of insert order, but once you have a dictionary, the order does remain stable until you insert more keys (or delete keys). – Martijn Pieters Jun 11 '15 at 17:14
  • @minitech Thanks. The documentation https://docs.python.org/2/library/collections.html#collections.OrderedDict contains the line "New in version 2.7", but it's buried in the middle of the description. Should I read this as OrderedDict is new in 2.7, or only the popitem() method is new? – Pyderman Jun 11 '15 at 17:15
  • @Pyderman: OrderedDict is new in 2.7. – Ry- Jun 11 '15 at 17:15
  • @MartijnPieters Yes by order, essentially I mean the order in which elements were inserted, which was intentionally done in a certain order. A little confused though: when you say that insert order is not changed, are you only referring to additional inserts post-creation? Probably also relevant: I'll be populating the list from scratch element by element - is the preservation you refer to not a given in this case (i.e. it only applies if all elements are assigned in the same declaration?) – Pyderman Jun 11 '15 at 17:23
  • @Pyderman: see [Why is the order in Python dictionaries and sets arbitrary?](https://stackoverflow.com/q/15479928); dictionaries do not preserve the order in which you created them or added keys to them. So entering `{'foo': 1, 'bar': 2, 'baz': 3}` into Python gives you `{'baz': 3, 'bar': 2, 'foo': 1}`, and iteration over that dictionary will always give you `'baz'` first, `'bar'` second and `'foo'` third, until you insert more keys or delete keys from it. The order in the sample is specific to 2.7 without hash seed randomisation. – Martijn Pieters Jun 11 '15 at 17:26

2 Answers

can I be sure that the dict, any time it is iterated over, will always be done in the order in which the dict was created?

No, a dict is unordered, and will lay out its ordering however the particular implementation decides to.

>>> d = {3: 'c', 2: 'b', 1: 'a'}
>>> d
{1: 'a', 2: 'b', 3: 'c'}

See, immediately after I created the dict the order changed.
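A caveat on the output above: it comes from a Python 2.7 interpreter. From Python 3.7 onward, the language guarantees that a plain dict preserves insertion order, so the same literal would print in the order it was written. Independently of version, iterating over a dict repeatedly without inserting or deleting keys gives the same order each time. A minimal sketch of both points, assuming nothing beyond the standard library:

```python
import sys

d = {3: 'c', 2: 'b', 1: 'a'}

# Iterating twice without modifying the dict yields the same order both times.
first_pass = list(d)
second_pass = list(d)
print(first_pass == second_pass)

# On Python 3.7+ insertion order is part of the language guarantee,
# so the keys come back exactly as written in the literal.
if sys.version_info >= (3, 7):
    print(list(d) == [3, 2, 1])
```

On older interpreters the second check is skipped, and the approach below still applies.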

If you want to ensure you have a deterministic, controllable order, you can use a collections.OrderedDict:

>>> from collections import OrderedDict
>>> d = OrderedDict([(3, 'c'), (2, 'b'), (1, 'a')])
>>> d
OrderedDict([(3, 'c'), (2, 'b'), (1, 'a')])

You can still access the OrderedDict using the conventions you are used to:

>>> d[3]
'c'
>>> d.get(3)
'c'

Note that you do not have to insert all of the elements upon creation. You can insert them one at a time if you want.

>>> d = OrderedDict()
>>> d[3] = 'c'
>>> d[2] = 'b'
>>> d[1] = 'a'
>>> d[4] = 'd'
>>> d
OrderedDict([(3, 'c'), (2, 'b'), (1, 'a'), (4, 'd')])
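Applied to the question's data, this could look roughly like the sketch below. The `received` list holds shortened, hypothetical versions of the three headers, and `first_ip` with its regular expression is an illustrative helper, not something from the question: it assumes the address of interest is the first IPv4- or IPv6-looking token right after a `(` or `[`, and it does not validate addresses.

```python
import re
from collections import OrderedDict

# Hypothetical sample data: shortened versions of the three headers in the question.
received = [
    "from mx2.x.org (mx2.x.org. [198.186.238.144]) by mx.google.com",
    "from HQPAMAIL08.x.org (10.64.17.33) by HQPAMAIL13.x.x.org (10.34.25.11)",
    "from HQPAMAIL13.x.org ([fe80::7844:1f34:e8b2:e526]) by HQPAMAIL08.x.org",
]

# Assumption: a rough pattern for an IPv4 or IPv6 address that directly
# follows "(" or "[". It is a sketch, not a validating parser.
IP_PATTERN = re.compile(
    r"[\[(]((?:\d{1,3}\.){3}\d{1,3}|[0-9a-fA-F:]*:[0-9a-fA-F:]+)"
)

def first_ip(text):
    """Return the first address-like token in text, or None if there is none."""
    match = IP_PATTERN.search(text)
    return match.group(1) if match else None

# Insert bottom to top, as the question asks; OrderedDict remembers that order.
ips_to_headers = OrderedDict()
for header in reversed(received):
    ips_to_headers[first_ip(header)] = header

for ip, header in ips_to_headers.items():
    print("{} -> {}".format(ip, header))
```

Because the keys are inserted one at a time in reversed order, iterating with `.items()` later walks them in that same order, with no hardcoded list indices needed.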
Cory Kramer

You should not rely on the iteration order of a dict. The only way you can get any stable and repeatable ordering is by doing something like:

for key in sorted(yourdict):
    value = yourdict[key]
    # more code here

That will give you a stable ordering, but probably not the one you want.

You probably want to use an OrderedDict.
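To make the difference concrete, here is a small sketch contrasting the two orderings. The keys are the IP addresses from the question; the values are hypothetical stand-ins for the full header strings.

```python
from collections import OrderedDict

# Keys inserted in the bottom-to-top order described in the question.
d = OrderedDict()
d["fe80::7844:1f34:e8b2:e526"] = "third header"
d["10.64.17.33"] = "second header"
d["198.186.238.144"] = "first header"

insertion_order = list(d)   # the order the keys were added
sorted_order = sorted(d)    # plain string comparison of the keys

print(insertion_order)
print(sorted_order)
print(insertion_order == sorted_order)
```

Both orderings are repeatable, but only the OrderedDict iteration reflects the order in which the entries were added; `sorted` compares the keys as strings, which has nothing to do with when they were inserted.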

Vatine
  • You **can** rely on the iteration order of a dict, provided you don't insert or delete anything. What you cannot rely on is the insertion order being preserved. – Martijn Pieters Jun 11 '15 at 17:12
  • @MartijnPieters Ok. See my question above. I guess I will need to populate the dict in one go if I want to take advantage of this aspect. – Pyderman Jun 11 '15 at 17:27
  • @MartijnPieters Thanks - this is an important point. Until now, I thought it was the iteration of dicts that caused the change in order. – Jesuisme Jun 11 '15 at 17:38