I have a dataset with several columns, and I would like to iterate over every value in one specific column called "date", and update the value if the value meets a condition. This is what I have right now:

for element in df['date']:
    if element > 2000.0:
        element = element - 2400.0
    elif element < -2000.0:
        element = element + 2400.0

This obviously doesn't work, but what can I do to fix this?

Edit: Thanks for all the replies!

Adnos
  • Does this answer your question? [How to select rows from a DataFrame based on column values](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) – ti7 Dec 05 '20 at 21:37
  • Can you post a sample dataframe to work with? – tdelaney Dec 05 '20 at 22:04

4 Answers


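The loop in the question only rebinds the local name element, so the DataFrame itself never changes. If you really want to use a loop, iterate over the row labels and write each new value back with df.loc:

import pandas as pd

df = pd.DataFrame()
df['date'] = [-3000, 3000, 1000, 5000]

# Assign through df.loc; chained indexing such as df['date'][i] = ... may write to a copy
for i in df.index:
    if df.loc[i, 'date'] > 2000:
        df.loc[i, 'date'] -= 2400
    elif df.loc[i, 'date'] < -2000:
        df.loc[i, 'date'] += 2400

df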

But I would use a simpler, vectorized method with .loc:

df.loc[df['date'] > 2000, 'date'] -= 2400
df.loc[df['date'] < -2000, 'date'] += 2400

df
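
A minimal, self-contained sketch of the .loc approach, assuming the sample values above. The helper name wrap_dates and its limit and period parameters are hypothetical, introduced only for illustration. It builds both masks before changing anything, so each condition is tested against the original values:

```python
import pandas as pd


def wrap_dates(df, column="date", limit=2000, period=2400):
    """Return a copy of df with out-of-range values in `column` shifted by `period`.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    out = df.copy()
    # Build both boolean masks first so they reflect the original values.
    too_high = out[column] > limit
    too_low = out[column] < -limit
    # Assign through .loc with a row mask and a column label (no chained indexing).
    out.loc[too_high, column] -= period
    out.loc[too_low, column] += period
    return out


df = pd.DataFrame({"date": [-3000, 3000, 1000, 5000]})
result = wrap_dates(df)
print(result["date"].tolist())  # [-600, 600, 1000, 2600]
```

Working on a copy leaves the input frame untouched; drop the copy() call if updating in place is preferred.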
Stefan
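# Write each new value back by row label instead of overwriting the whole column
for i, element in df['date'].items():
    if element > 2000.0:
        df.loc[i, 'date'] = element - 2400.0
    elif element < -2000.0:
        df.loc[i, 'date'] = element + 2400.0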

Use NumPy and pandas; Series.apply runs the rule on each value:

import numpy as np
import pandas as pd

np.random.seed(4)
df = pd.DataFrame({'Orignal' : pd.Series(list(np.random.randint(1990,2010, 50)) + list(np.random.randint(-2010, -1990  , 50))).sample(15).to_list()})
df['Modefied'] = df.Orignal.apply(lambda x: x - 2400 if x > 2000 else x + 2400 if x < -2000 else x)

df:

    Orignal  Modefied
0      2008      -392
1     -1992     -1992
2      2004      -396
3      1996      1996
4     -2010       390
5      1995      1995
6     -2002       398
7     -2006       394
8     -1997     -1997
9     -2007       393
10    -2004       396
11     2008      -392
12    -2002       398
13     2006      -394
14    -2002       398
Amir saleem
import numpy as np   

df['date'] = np.select([df['date'] > 2000, df['date'] < -2000], 
                       [df['date'] - 2400, df['date'] + 2400], 
                       default=df['date'])
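
For reference, a self-contained sketch of the same np.select call on made-up sample values (the values are an assumption, not the asker's data). np.select walks the condition list in order and takes the choice belonging to the first condition that is true, falling back to default elsewhere:

```python
import numpy as np
import pandas as pd

# Sample values are illustrative only.
df = pd.DataFrame({"date": [-3000.0, 3000.0, 1000.0, 5000.0]})

conditions = [df["date"] > 2000, df["date"] < -2000]
choices = [df["date"] - 2400, df["date"] + 2400]

# Rows matching neither condition keep their original value via `default`.
df["date"] = np.select(conditions, choices, default=df["date"])
print(df["date"].tolist())  # [-600.0, 600.0, 1000.0, 2600.0]
```

Both choices are computed from the original column before anything is assigned, so the two conditions cannot interfere with each other.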
Arkadiusz