So, df['date'] returns:
0 2018-03-01
1 2018-03-01
2 2018-03-01
3 2018-03-01
4 2018-03-01
...
469796 2018-06-20
469797 2018-06-20
469798 2018-06-27
469799 2018-06-27
469800 2018-12-06
Name: date, Length: 469801, dtype: datetime64[ns]
And df['date'].sort_values() returns:
137241 2018-01-01
378320 2018-01-01
247339 2018-01-01
34333 2018-01-01
387971 2018-01-01
...
109278 2018-12-06
384324 2018-12-06
384325 2018-12-06
109282 2018-12-06
469800 2018-12-06
Name: date, Length: 469801, dtype: datetime64[ns]
Now, df['date'].sort_values()[0] "ignores sorting" and returns:
Timestamp('2018-03-01 00:00:00')
Whereas df['date'].sort_values()[0:1] actually returns:
137241 2018-01-01
Name: date, dtype: datetime64[ns]
Why the apparently inconsistent behaviour? As @cs95 accurately pointed out, they return a scalar and a Series respectively, which is fine. What confuses me is the value: the first one is 2018-03-01, while the second one is 2018-01-01.
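For anyone who wants to reproduce this without my data, here is a minimal sketch on a toy Series. The three dates are made up for illustration and are not my real dataset. As far as I can tell, the difference is consistent with an integer key such as [0] being looked up as an index label on an integer index, while an integer slice such as [0:1] is taken by position, so I have added the explicit .loc and .iloc calls for comparison.

```python
import pandas as pd

# Toy data (hypothetical values): default integer index 0, 1, 2
s = pd.Series(
    pd.to_datetime(["2018-03-01", "2018-01-01", "2018-06-20"]),
    name="date",
)

# Sorting reorders the rows but keeps the original index labels: 1, 0, 2
sorted_s = s.sort_values()

# Integer key on an integer index: treated as the label 0, not position 0
by_label = sorted_s[0]

# Integer slice: treated as positions, so this is the first row after sorting
by_slice = sorted_s[0:1]

# Explicit accessors make the intent unambiguous
first_by_position = sorted_s.iloc[0]  # first row in sorted order
first_by_label = sorted_s.loc[0]      # row whose index label is 0

print(by_label)
print(by_slice)
print(first_by_position)
print(first_by_label)
```

If that reading is right, .iloc[0] is the way to get the earliest date after sorting, and sort_values().reset_index(drop=True) would make the labels match the positions again.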
Thanks in advance.
Warning
Somewhat similar to: why sort_values() is diifferent form sort_values().values