
This may be a simple question, but I am having trouble finding a solution. I have a variable named T_wall that holds a pandas Series of numbers. Whenever a value in the Series is over 2,000, I would like it to be replaced with 2,000.

I have tried an if statement but I continue to get errors. Any ideas? Thanks!

import pandas as pd
T_wall = pd.Series([1999.0, 2000.0, 2001.0, 2002.0, 2003.0])

if T_wall > 2000.0:
    T_wall = 2000.0

I am getting the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

de1
  • What error are you getting? It should work. – Moinuddin Quadri Jan 05 '21 at 21:54
  • I am getting this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). –  Jan 05 '21 at 21:55
  • So `T_wall` is a pandas Series. – Jan Christoph Terasa Jan 05 '21 at 21:57
  • It looks like `T_wall` is not an int or float. Please post a full example that can be copied into a Python interpreter and reproduces the error. I googled the error message and the first hit points to Stack Overflow: [https://stackoverflow.com/questions/36921951/truth-value-of-a-series-is-ambiguous-use-a-empty-a-bool-a-item-a-any-o] – Jens Jan 05 '21 at 21:57
  • The error message you quote [in the above comment](https://stackoverflow.com/questions/65587124/using-a-conditional-statement-to-change-the-value-of-a-variable#comment115960418_65587124) tells you _exactly_ what the problem is, and _exactly_ how to fix it. Be sure you include the message in your question itself, going forward; also, a few words about how you understand that message and how you tried to apply its advice would help folks understand where you're coming from and how to better tailor answers to that perspective. – Charles Duffy Jan 05 '21 at 21:59
  • Please post a working example. You say you have a variable with an assigned value... actually assign that variable a value in the question. Your error message (which should be in the question itself) hints that this is a pandas series. Is that the type you expect? What do you want the output to be? A series with values > 2000.0 set to 2000? All values in the series set to 2000? T_wall replaced with a single float if any values are > 2000? – tdelaney Jan 05 '21 at 22:32

4 Answers


Taking everything from the comments: avoid using for loops in pandas as much as possible. Here you can use a boolean mask, which modifies T_wall in place:

T_wall[T_wall > 2000] = 2000

apply would also work, but it returns a new Series, so assign the result back:

T_wall = T_wall.apply(lambda x: 2000 if x > 2000 else x)
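
As a small sketch of the difference between the two approaches, using the sample values from the question (the names `capped` and `unchanged_max` are only for illustration): apply leaves the original Series alone, while mask assignment changes it.

```python
import pandas as pd

T_wall = pd.Series([1999.0, 2000.0, 2001.0, 2002.0, 2003.0])

# apply builds a new Series; T_wall itself is untouched at this point.
capped = T_wall.apply(lambda x: 2000.0 if x > 2000.0 else x)
unchanged_max = T_wall.max()  # the original still contains 2003.0

# Boolean-mask assignment modifies T_wall in place.
T_wall[T_wall > 2000.0] = 2000.0

print(capped.tolist())
print(T_wall.tolist())
```

Either way the capped values end up the same; the choice is mainly about whether you want to keep the original data around.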
Voodu

T_wall is a pandas Series, so the comparison applies to every element at once. Instead, you can loop through the Series and run the if statement on each element:

for i in range(0, len(T_wall)):
    if T_wall[i] > 2000.0:
        T_wall[i] = 2000.0
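
One caveat, sketched below under an assumption that is not in the question: if the Series does not have the default 0..n-1 index, `T_wall[i]` is treated as a label lookup and can fail. The index labels 10..50 here are hypothetical; `.iloc` addresses elements by position, so the loop works regardless of the index.

```python
import pandas as pd

# Hypothetical data: same values, but with index labels 10..50 instead of 0..4.
T_wall = pd.Series([1999.0, 2000.0, 2001.0, 2002.0, 2003.0],
                   index=[10, 20, 30, 40, 50])

try:
    T_wall[0]  # label lookup: there is no label 0 in this index
    label_lookup_ok = True
except KeyError:
    label_lookup_ok = False

# .iloc always addresses by position, so the loop works for any index.
for i in range(len(T_wall)):
    if T_wall.iloc[i] > 2000.0:
        T_wall.iloc[i] = 2000.0

print(label_lookup_ok, T_wall.tolist())
```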
nsh1998

To complete the question with an example.

>>> import pandas as pd
>>> ser = pd.Series([1999, 2000, 2001, 2002, 2003])
>>> ser
0    1999
1    2000
2    2001
3    2002
4    2003
dtype: int64

Meaning of ser > 2000

>>> ser > 2000
0    False
1    False
2     True
3     True
4     True
dtype: bool

As you can see ser > 2000 returns a series itself, with True or False values, depending on whether the condition matched.
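
This is also why the original if statement fails. A minimal sketch (the variable names are only for illustration): `if` needs a single True or False, and pandas refuses to guess one from a whole boolean Series, whereas an explicit reduction with any() or all() is allowed.

```python
import pandas as pd

ser = pd.Series([1999, 2000, 2001, 2002, 2003])
cond = ser > 2000  # element-wise comparison, yields a boolean Series

# `if cond:` asks for a single truth value, which pandas refuses to guess.
try:
    bool(cond)
    raised = False
except ValueError:
    raised = True

# Reducing the boolean Series to one value is explicit and allowed.
any_over = bool(cond.any())  # True if at least one value exceeds 2000
all_over = bool(cond.all())  # True only if every value exceeds 2000

print(raised, any_over, all_over)
```

Note that any() and all() answer a different question ("is anything over 2000?") than the one asked here, which is to replace individual values.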

There are several ways to then use that condition.

The mask function

mask accepts the condition and returns a new Series in which the values where the condition is True are replaced with the provided value (the original series won't change unless you pass inplace=True). (See also Mask User Guide section)

>>> ser.mask(ser > 2000, 2000)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That is somewhat equivalent to:

>>> [(2000 if x > 2000 else x) for x in ser]
[1999, 2000, 2000, 2000, 2000]

The where function

where is the inverse of mask, therefore you'd want to invert the condition to achieve the same effect. Here the second argument is other, providing the replacement value where the condition is False. (See also Where User Guide section)

>>> ser.where(ser <= 2000, 2000)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That is somewhat equivalent to:

>>> [(x if x <= 2000 else 2000) for x in ser]
[1999, 2000, 2000, 2000, 2000]

Assignment via boolean indexing

You can also change the series directly via boolean indexing as indicated in other answers (adding for completeness):

>>> ser
0    1999
1    2000
2    2001
3    2002
4    2003
dtype: int64
>>> ser[ser > 2000] = 2000
>>> ser
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

(That would then be equivalent to ser.mask(ser > 2000, 2000, inplace=True))

The apply function

You could also use apply. Note that apply has no inplace parameter; it always returns a new Series, so assign the result if you want to keep it:

>>> ser = pd.Series([1999, 2000, 2001, 2002, 2003])
>>> ser.apply(lambda x: 2000 if x > 2000 else x)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That allows you to use a regular Python function or expression. But it won't be as efficient for large series as the other examples, as it will call the Python expression for each value rather than doing everything within Pandas (vectorized).
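
As a quick sanity check, the sketch below runs the non-mutating approaches from this answer side by side and confirms they produce the same values. It also includes `Series.clip(upper=...)`, which is not covered above but is a built-in pandas method for capping values at a bound; the variable names are only for illustration.

```python
import pandas as pd

ser = pd.Series([1999, 2000, 2001, 2002, 2003])

via_mask = ser.mask(ser > 2000, 2000)
via_where = ser.where(ser <= 2000, 2000)
via_apply = ser.apply(lambda x: 2000 if x > 2000 else x)
via_clip = ser.clip(upper=2000)  # not covered above: caps values at an upper bound

expected = [1999, 2000, 2000, 2000, 2000]
results = [via_mask, via_where, via_apply, via_clip]
all_equal = all(r.tolist() == expected for r in results)

print(all_equal)
```

None of these calls modify `ser`; each returns a new Series.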

de1

The most idiomatic pandas answer would be:

T_wall[T_wall > 2000.0] = 2000.0

Example:

import pandas
data = pandas.Series([1, 2, 3, 4, 5])
data[data > 2] = 5  # replace every value greater than 2 with 5

Resulting data:

0    1
1    2
2    5
3    5
4    5

Pandas docs: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#the-where-method-and-masking

Beniamin H