I am trying to conditionally change the value of the current row based on the value of the Nth row below it. For example, say I have a CSV file that looks like this:

    trial
    ''
    ''
    ''
    ''
    ''
    ''
    ''
    'a'
    ''
    ''
    ''
    ''
    ''
    ''
    ''
    'a'
    ''
    ''
    ''
    ''
    ''
    ''
    ''
    'a'

Now, if the value of the 3rd row below the current row is 'a', then the empty values from the current row down to that 3rd row should be converted to 'a', like so:

trial
''
''
''
''
'a'
'a'
'a'
'a'
''
''
''
''
'a'
'a'
'a'
'a'
''
''
''
''
'a'
'a'
'a'
'a'

My code is as follows:

data =csv.reader(data)
next(data)

def convert(param):
    if param=='':
        value='a'
    else:
        value=''
    return value

for row in data:
    i=0
    for line in islice(data, i+3, None):
        print i
        print line
        print row
        if line==['a']:
            convert(row)
        print row
        i = i+1

However, the output is:

0
[]
[]
[]
1
[]
[]
[]
2
[]
[]
[]
3
['a']
[]
[]
4
[]
[]
[]
5
[]
[]
[]
6
[]
[]
[]
7
[]
[]
[]
8
[]
[]
[]
9
[]
[]
[]
10
[]
[]
[]
11
[]
[]
[]
12
[]
[]
[]
13
['a']
[]
[]
14
[]
[]
[]
15
[]
[]
[]
16
[]
[]
[]
17
[]
[]
[]
18
[]
[]
[]
19
[]
[]
[]
20
[]
[]
[]
21
[]
[]
[]
22
[]
[]
[]
23
['a']
[]
[]

Any idea on how to do this?

Gerard
    can you please make this more readable? Check https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – pythonic833 Mar 27 '18 at 15:25

2 Answers

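You want fillna with backfill, which pandas also exposes as bfill.

First, make sure your empty values are actual null values that pandas recognizes. Replace the empty strings with NaN, back-fill with a limit of 3, then turn the remaining NaN back into empty strings:

import pandas as pd
import numpy as np

# '' -> NaN, back-fill at most 3 rows above each value, then NaN -> ''
df = df.replace('', np.nan).bfill(limit=3).replace(np.nan, '')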

   trial
0       
1       
2       
3       
4      a
5      a
6      a
7      a
8       
9       
10      
11      
12     a
13     a
14     a
15     a
16      
17      
18      
19      
20     a
21     a
22     a
23     a
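
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The sample frame is an assumption built to mirror the question (seven blanks followed by 'a', three times), and the name `filled` is only illustrative. It uses `bfill(limit=3)`, the method form of a backward fill, since recent pandas releases deprecate passing `method=` to `fillna`.

```python
import numpy as np
import pandas as pd

# Assumed sample data mirroring the question: seven blanks, then 'a', repeated three times
df = pd.DataFrame({"trial": ([""] * 7 + ["a"]) * 3})

# Treat empty strings as missing, back-fill at most 3 rows above each value,
# then turn whatever is still missing back into empty strings
filled = df.replace("", np.nan).bfill(limit=3).fillna("")

print(filled["trial"].tolist())
```

The `limit=3` argument is what stops the fill from running all the way up to the previous 'a': only the three rows directly above each value are filled, so rows 0 to 3, 8 to 11 and 16 to 19 stay empty.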
ALollz

You can iterate over rows using iterrows() to achieve the result:

# if the values have quotes, you can remove the quotes first
df1['trial'] = df1['trial'].str.replace("'",'')

for index, row in df1.iterrows():
    # .loc slices by label and includes both endpoints
    if row['trial'] == 'a':
        df1.loc[index-3:index, 'trial'] = 'a'

# output

    trial
0   
1   
2   
3   
4   a
5   a
6   a
7   a
8   
9   
10  
11  
12  a
13  a
14  a
15  a
16  
17  
18  
19  
20  a
21  a
22  a
23  a
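
A slightly more defensive variant of the same idea is sketched below. It assumes a default integer index (0, 1, 2, ...) and sample data shaped like the question's, and the name `hits` is only illustrative. It collects the positions of 'a' before writing anything, so the loop never reads rows it has just modified, and it clamps the start of the slice so an 'a' within the first three rows cannot produce a negative label.

```python
import pandas as pd

# Assumed sample data mirroring the question: seven blanks, then 'a', repeated three times
df1 = pd.DataFrame({"trial": ([""] * 7 + ["a"]) * 3})

# Record where the markers are before modifying the frame
hits = df1.index[df1["trial"] == "a"]

for index in hits:
    # .loc slices by label and includes both endpoints;
    # max() keeps the start of the slice at 0 or above
    df1.loc[max(index - 3, 0):index, "trial"] = "a"

print(df1["trial"].tolist())
```

For large frames the backfill approach is likely to be faster, since this version still loops in Python, once per 'a'.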
YOLO