I have the following snippet of code, which I ran in both Python 2.7.12 and Python 3.5.2:

    f = open(file_name, 'r')
    file_data = f.read()
    f.close()
    char_list = list(set(file_data))
    c = {char: i for i, char in enumerate(char_list)}
    x = {i: char for i, char in enumerate(char_list)}

When run in Python 2.7.12, I get the expected result:

    {'a': 0, ' ': 1, 'e': 2, 'i': 3, 'h': 4, '\n': 5, 'o': 6, 'r': 7, 'u': 8, 'w': 9, 'y': 10, '?': 11}
    {0: 'a', 1: ' ', 2: 'e', 3: 'i', 4: 'h', 5: '\n', 6: 'o', 7: 'r', 8: 'u', 9: 'w', 10: 'y', 11: '?'}

In Python 3.5.2, something strange happens. I sometimes get results such as:

    {'h': 1, 'e': 4, 'r': 2, 'i': 3, '?': 0, '\n': 5, ' ': 6, 'u': 7, 'a': 8, 'y': 9, 'o': 10, 'w': 11}
    {0: '?', 1: 'h', 2: 'r', 3: 'i', 4: 'e', 5: '\n', 6: ' ', 7: 'u', 8: 'a', 9: 'y', 10: 'o', 11: 'w'}

In addition, in Python 3.5.2 (but not in Python 2.7.12), `char_list` is in a different order each time the program is run. In Python 2.7.12 it is in the same order every time.

In both versions of Python, `enumerate` returns an iterable object.

Why would this strange behavior be happening?

P.S. this also happens when I make a copy of char_list and pass the copy into the second enumerate instead of char_list

dylan7
  • Python dicts are not ordered; sort their items if you want easily readable output. Python 3 randomizes dict order to help mitigate certain attacks, and you can turn the same thing on in Python 2 by passing the `-R` flag or setting `PYTHONHASHSEED=random` in the environment. – Ry- May 29 '17 at 02:41
  • @Ryan Well _technically_ the CPython interpreter for Python 3.6 does have ordered dictionaries. Although that is an implementation detail, and should not be relied upon. – Christian Dean May 29 '17 at 02:44
  • Is this also true for the return of `list(set)`? That seems to be randomized too. – dylan7 May 29 '17 at 02:44
  • @dylan7: Yes, it is. – Ry- May 29 '17 at 02:46
  • I think it's important to note that the ordering is getting shuffled twice: once when you put the letters into a `set`, and then again when you put the enumerated values into dictionaries. The dictionary shuffling is what determines the order of the keys when you print the dictionaries. The set ordering is what determines the correspondence between the letters and the numbers (since you're `enumerate`ing a `list` built from the `set`). Even in Python 3.6 (where `dict`s now preserve order as an implementation detail), `set`s are still ordered arbitrarily. – Blckknght May 29 '17 at 04:39
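
To illustrate the point made in the comments, here is a minimal sketch of one way to get the same letter-to-number mapping on every run: sort the set before enumerating it. The sample string is a hypothetical stand-in for the file contents, chosen only because it contains the same characters as the output in the question.

```python
# Hypothetical sample text standing in for the contents of the file.
file_data = "hi how are you?\n"

# sorted() returns a list in a fixed order, no matter how the set
# happens to iterate on this particular run.
char_list = sorted(set(file_data))

c = {char: i for i, char in enumerate(char_list)}
x = {i: char for i, char in enumerate(char_list)}

# Sort the items when printing so the display does not depend on dict order.
print(sorted(c.items(), key=lambda kv: kv[1]))
print(sorted(x.items()))
```

With this change the index assigned to each character no longer depends on hash randomization, so the two dictionaries stay consistent from one run to the next.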

2 Answers

`enumerate` is working fine, but when you save the data in a dictionary, Python does not maintain the order. By default, Python dictionaries are unordered and are not guaranteed to keep key/value pairs in the order they were added. `OrderedDict` (from the `collections` module) is a good solution if you need that. Also note that in CPython 3.6 dictionaries will maintain order, but this is not guaranteed in the future.

If you want the order preserved, consider using lists or tuples.
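
As a rough sketch of the `OrderedDict` approach (the sample string is a hypothetical stand-in for the file contents, and sorting the set first is my own assumption so that the insertion order itself is repeatable):

```python
from collections import OrderedDict

# Hypothetical sample text standing in for the contents of the file.
file_data = "hi how are you?\n"

# A set has no defined order, so sort it to get a repeatable sequence.
char_list = sorted(set(file_data))

# OrderedDict remembers the order in which the keys were inserted.
c = OrderedDict((char, i) for i, char in enumerate(char_list))
x = OrderedDict((i, char) for i, char in enumerate(char_list))

print(list(c.items())[:3])
print(list(x.items())[:3])
```

Iterating over `c` or `x` then yields the pairs in insertion order on both Python 2.7 and Python 3.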

KyleV

I have been able to replicate your problem and, by making the following changes, get the expected output.

There are two changes to make. Firstly, writing `char_list = list(set(file_data))` passes the data through a set, which is an unordered data type; that is, it will not necessarily retain the order in which the items were added. Therefore, removing the `set` call will solve the order problem (note that the list will then contain repeated characters).

As for the newline character appearing in the result, append `.strip('\n')` to your `file_data = f.read()` line and it will remove any leading and trailing newlines.

After making the changes and confirming they work, your code would look something like this:

    f = open(file_name, 'r')
    file_data = f.read().strip('\n')
    f.close()
    char_list = list(file_data)
    c = {char: i for i, char in enumerate(char_list)}
    x = {i: char for i, char in enumerate(char_list)}
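
If each character should still appear only once, as it did with `set`, a possible variation is to drop duplicates while keeping the order of first appearance. This is only a sketch: the sample string is a hypothetical stand-in for the file contents, and the `seen` helper set is my own addition rather than part of the original code.

```python
# Hypothetical sample text standing in for the contents of the file.
file_data = "hi how are you?\n".strip('\n')

# Keep only the first occurrence of each character, in file order.
seen = set()
char_list = []
for char in file_data:
    if char not in seen:
        seen.add(char)
        char_list.append(char)

c = {char: i for i, char in enumerate(char_list)}
x = {i: char for i, char in enumerate(char_list)}
```

Here `seen` is only used for membership tests, so its arbitrary iteration order never affects the result.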

Hope this helps!

HunterM267