
I am trying to do the following in Python and am seeing strange behavior. Say I have the following list:

x = [5, 4, 3, 2, 1]

Now, I am doing something like:

x[x >= 3] = 3

This gives:

x = [5, 3, 3, 2, 1]

Why does only the second element get changed? I was expecting:

[3, 3, 3, 2, 1]
Tomerikoo
Luca

3 Answers


Because Python 2 evaluates x >= 3 as True, and True is equal to 1, the statement becomes x[1] = 3, so only the second element of x is set to 3.
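A minimal sketch of the indexing step, which can be reproduced on any Python version. The variable mask is only an illustration: it stands in for the value that Python 2 computed for the list-versus-int comparison.

```python
x = [5, 4, 3, 2, 1]

# bool is a subclass of int, so True is accepted as the index 1.
print(isinstance(True, int), True == 1)  # True True

# Assumption for illustration: this is what `x >= 3` evaluated to in Python 2.
mask = True

x[mask] = 3  # exactly the same as x[1] = 3
print(x)     # [5, 3, 3, 2, 1]
```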

To cap every element at 3, use a list comprehension instead:

>>> [3 if i >=3 else i for i in x]
[3, 3, 3, 2, 1]
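The comprehension above builds a new list and leaves x untouched. If the goal is to modify x itself, as the original assignment tried to do, one possible approach is slice assignment. This is only a sketch; the name alias is illustrative.

```python
x = [5, 4, 3, 2, 1]
alias = x  # a second reference to the same list object

# Slice assignment replaces the contents of the existing list in place.
# min(value, 3) is equivalent to "3 if value >= 3 else value".
x[:] = [min(value, 3) for value in x]

print(x)           # [3, 3, 3, 2, 1]
print(alias is x)  # True: every reference sees the change
```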

If you want to know why x >= 3 evaluates to True, see the following note in the documentation:

CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.

In Python 2.x (in the CPython implementation, at least), a list always compares greater than an integer. Similarly, a string compares greater than a list:

>>> ''>[]
True

In Python 3.x, however, unorderable types cannot be compared with each other, and the attempt raises a TypeError:

In [17]: '' > []
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-052e7eb2f6e9> in <module>()
----> 1 '' > []

TypeError: unorderable types: str() > list()
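The same applies to the comparison from the question. The sketch below assumes Python 3 and catches the error instead of printing the traceback. The exact wording of the message differs between Python 3 releases, so the code does not rely on it.

```python
x = [5, 4, 3, 2, 1]

result = None
message = ""
try:
    # In Python 3 a list cannot be ordered against an int.
    result = x >= 3
except TypeError as exc:
    message = str(exc)

print(result)   # None, because the comparison never produced a value
print(message)  # the interpreter's explanation, naming both types
```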
Mazdak

You can use this syntax with NumPy:

>>> import numpy as np
>>> x = np.array([5, 4, 3, 2, 1])
>>> x[x>3]=3
>>> x
array([3, 3, 3, 2, 1])

You can also do this with pandas:

>>> import pandas as pd
>>> x = pd.Series([5, 4, 3, 2, 1])
>>> x
0    5
1    4
2    3
3    2
4    1
dtype: int64
>>> x[x>3]=3
>>> x
0    3
1    3
2    3
3    2
4    1
dtype: int64
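This works in NumPy and pandas because their types define their own element-wise comparison and their own handling of boolean masks inside the square brackets. Built-in lists do neither. As a rough illustration only, the hypothetical MaskableList class below imitates the idea in plain Python. It is not how either library is implemented.

```python
class MaskableList(list):
    """Hypothetical list subclass that imitates boolean-mask assignment."""

    def __ge__(self, threshold):
        # Element-wise comparison: one bool per element instead of a single bool.
        return [value >= threshold for value in self]

    def __setitem__(self, key, new_value):
        is_mask = (
            isinstance(key, list)
            and len(key) == len(self)
            and all(isinstance(flag, bool) for flag in key)
        )
        if is_mask:
            # Assign only at the positions where the mask is True.
            for index, selected in enumerate(key):
                if selected:
                    super().__setitem__(index, new_value)
        else:
            # Fall back to normal list behavior for ints and slices.
            super().__setitem__(key, new_value)


x = MaskableList([5, 4, 3, 2, 1])
x[x >= 3] = 3
print(x)  # [3, 3, 3, 2, 1]
```

For real numeric work the libraries remain the better choice; the class only shows why the syntax needs cooperation from the container type.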
dawg
  • This actually solved my doubt, because I saw this syntax in pandas code and wasn't sure if it was due to python, or pandas. Any tip on how I could have known by looking at pandas code and/or documentation without the help of stackoverflow? – BeMyGuestPlease Sep 22 '19 at 08:12
  • My own answer, in case anyone else is interested: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html - Pandas and Numpy override the [] operator with __getitem__ as per https://stackoverflow.com/questions/1957780/how-to-override-the-operator-in-python – BeMyGuestPlease Sep 22 '19 at 08:23

You're using Python lists. In Python 2.x, comparing a list with an int compares the types, not the values. So your comparison results in True, which is equivalent to 1. In other words, your statement is equivalent to:

x[1] = 3  # x[1] == x[True] == x[x >= 3]

Note that Python 3.x disallows this type of comparison (because it's almost certainly not what you meant). If you want to do this sort of operation, you almost certainly got the idea from the NumPy documentation, as the NumPy API has been designed specifically to support this sort of thing:

import numpy as np
array = np.arange(5)
array[array > 3] = 3
mgilson
  • And note that numbers are special cased to always sort before other types. – Martijn Pieters Sep 21 '15 at 16:03
  • @MartijnPieters -- Well... That's CPython. IIRC, other implementations can define any sort order they want -- Just so long as it is consistent. – mgilson Sep 21 '15 at 16:04
  • Honestly prefer list comprehension to this. One less library to import :) – akalikin Sep 21 '15 at 16:05
  • @mgilson: sure, which could even mean they sort by `id(type(obj))`. But most implementations want to stay consistent with Python 2.x and do reimplement that behaviour. – Martijn Pieters Sep 21 '15 at 16:07
  • @akalikin -- That's fair. If this is the _only_ place in the code where you're going to do something like this, it's probably not worth the overhead of importing/learning numpy. However, it seems to me that it would be rare to have this sort of operation in the code only one time -- At which point using `numpy` can be a huge advantage. – mgilson Sep 21 '15 at 16:07
  • @akalikin: mgilson is explaining why the OP might have gotten confused and thought the standard Python list type might support the same kinds of operations. – Martijn Pieters Sep 21 '15 at 16:09