
I noticed a difference when iterating over a DataFrame in Python 3.5.6 and 3.7.3, and I can't decide whether it is good or bad.


import pandas as pd


class Something(object):
    def __init__(self, data):
        self.data = data

data = {
    'col1': [1, 2, 4],
    'col2': [3, 4, None],
    'col3': ['-', None, 2.4],
    'Time': [0, 0.1, 0.5],
}

df = pd.DataFrame(data=data, index=['a', 'b', 'c'])

something = Something(df)

# Iterating over a DataFrame yields its column labels.
ret = []
for i in something.data:
    ret.append(i)

print(ret)

3.5.6: always returns ['Time', 'col1', 'col2', 'col3'] (it looks like pandas sorts the keys)

3.7.3: always returns ['col1', 'col2', 'col3', 'Time'] (it looks like pandas preserves the original order)

I read that since Python 3.6, dicts preserve insertion order. (See: Are dictionaries ordered in Python 3.6+?)
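For reference, the plain-dict side of this can be checked without pandas. This is a minimal sketch, assuming a Python 3.7+ interpreter, where insertion order is a language guarantee; on 3.5 the printed order is arbitrary.

```python
data = {
    'col1': [1, 2, 4],
    'col2': [3, 4, None],
    'col3': ['-', None, 2.4],
    'Time': [0, 0.1, 0.5],
}

# Iterating over a dict yields its keys; on Python 3.7+ they come back
# in the order they were inserted.
keys = list(data)
print(keys)
```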

It looks like with 3.5.6 pandas sorted the keys, but with 3.7.3 it somehow respects the original order of the dict (as Python 3.7.3 itself does). After the update some of our tests failed, but I can't decide whether this behavior is acceptable or not.
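One possible way to make the tests independent of this difference is to avoid relying on the dict's key order at all. The sketch below is only an illustration under my own assumptions (the names `expected_order`, `df_explicit` and `df_sorted` are not from our real code): either pass the column order explicitly via `columns=`, or sort the column labels before comparing.

```python
import pandas as pd

data = {
    'col1': [1, 2, 4],
    'col2': [3, 4, None],
    'col3': ['-', None, 2.4],
    'Time': [0, 0.1, 0.5],
}

# Option 1: state the column order explicitly, so it does not depend
# on how the dict keys happen to be ordered.
expected_order = ['col1', 'col2', 'col3', 'Time']  # hypothetical name
df_explicit = pd.DataFrame(data=data, index=['a', 'b', 'c'],
                           columns=expected_order)

# Option 2: sort the column labels, which gives the same result
# regardless of the order the DataFrame was built with.
df_sorted = pd.DataFrame(data=data, index=['a', 'b', 'c']).sort_index(axis=1)

print(list(df_explicit))
print(list(df_sorted))
```

With either option the column order no longer depends on the Python version, so the same assertion should hold on both interpreters.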

What do you think, is this an issue or not?

Is there anyone who could explain this behavior?

kfr
  • As I mentioned (https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6/39980744), py3.7 dicts preserve insertion order – BENY May 21 '19 at 14:19
  • Yes, that's clear. My question is why this causes different behavior in pandas. – kfr May 21 '19 at 14:59
  • Finally I got the answer here: https://github.com/pandas-dev/pandas/issues/26481 It is not an issue. – kfr May 22 '19 at 06:58

0 Answers