Introduce a check for or ignore NaN values

Question

I have a folder with a number of CSV files. Each file follows the same format with the header:

date,total_cost,total_pnl_pre,total_pnl_pos,total_pnl_per_pre,total_pnl_per_pos

A typical CSV file will look like:

date,total_cost,total_pnl_pre,total_pnl_pos,total_pnl_per_pre,total_pnl_per_pos
2015-07-27,-0.0,0.0,0.0,0.0,0.0
2015-07-28,-0.0,0.0,0.0,0.0,0.0
2015-07-29,-0.6738699251792465,0.0,-0.6738699251792465,-0.0,-0.027000000000000003
2015-07-30,-0.0,-123.88294424426506,-123.88294424426506,-4.961880089696313,-4.961880089696313
2015-07-31,-0.0,1.9275568497366795,1.9275568497366795,0.09627642044988116,0.09627642044988116

However, some files contain NaN values (see below):

date,total_cost,total_pnl_pre,total_pnl_pos,total_pnl_per_pre,total_pnl_per_pos
2015-07-27,-0.0,0.0,0.0,0.0,0.0
2015-07-28,-0.0,0.0,0.0,0.0,0.0
2015-07-29,NaN,NaN,NaN,0.0,0.0
2015-07-30,NaN,NaN,NaN,0.0,0.0
2015-07-31,NaN,NaN,NaN,0.0,0.0
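
As a minimal sketch of the "check" part, assuming the files are loaded with pandas (the question does not say how they are read), the NaN cells can be detected per column right after loading and then either zeroed or dropped. The names csv_text, has_nan, df_zeroed and df_dropped are illustrative only; the sample rows are the ones shown above.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice a file path
# would be passed to read_csv instead of an in-memory buffer.
csv_text = """date,total_cost,total_pnl_pre,total_pnl_pos,total_pnl_per_pre,total_pnl_per_pos
2015-07-27,-0.0,0.0,0.0,0.0,0.0
2015-07-28,-0.0,0.0,0.0,0.0,0.0
2015-07-29,NaN,NaN,NaN,0.0,0.0
2015-07-30,NaN,NaN,NaN,0.0,0.0
2015-07-31,NaN,NaN,NaN,0.0,0.0
"""

# read_csv parses the literal string "NaN" as a missing float by default.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Check: one boolean per column, True if the column holds any NaN.
has_nan = df.isna().any()

# Option 1 (assumed policy): set NaN values to zero.
df_zeroed = df.fillna(0.0)

# Option 2 (assumed policy): ignore rows that contain any NaN.
df_dropped = df.dropna()
```

Which option is appropriate depends on what a missing value means in the data: zeroing keeps the row count intact, while dropping avoids inventing a value that was never recorded.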

I have two functions, hit_rate and max_drawdown, that I use to process these files:

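import numpy as np
import pandas as pd

def hit_rate(array_like):
    # Ratio of positive entries to the remaining entries, ignoring exact zeros.
    seq = np.array(array_like)
    seq = seq[np.nonzero(seq)]
    total_num = len(seq)
    if total_num == 0:
        return -float('inf')
    pos_num = len(seq[seq > 0.0])
    neg_sum = total_num - pos_num  # count of non-positive entries
    if neg_sum == 0:
        return float('inf')
    return pos_num / neg_sum

def max_drawdown(ser):
    # Largest drop below the running maximum (0 if the series never falls).
    # pd.expanding_max was deprecated in pandas 0.18 and later removed;
    # Series.expanding().max() is the equivalent call.
    running_max = ser.expanding().max()
    cur_dd = ser - running_max
    return min(0, cur_dd.min())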

The CSV data is read into the script as the variables array_like and ser. The script falls over when it encounters a NaN value. Is there a way to either set the NaN values to zero or ignore them when processing the CSV file?
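
One possible approach, sketched here under the assumption that "ignore" means dropping NaN entries before any counting or running-maximum step, is to filter inside the two functions themselves. The NaN handling lines are additions for illustration, not part of the original functions, and neg_num is a renamed local.

```python
import numpy as np
import pandas as pd


def hit_rate(array_like):
    # Convert to a float array so np.isnan works on every element.
    seq = np.asarray(array_like, dtype=float)
    # Assumed policy: ignore NaN entries entirely.
    seq = seq[~np.isnan(seq)]
    # Original behaviour: exact zeros do not count as hits or misses.
    seq = seq[np.nonzero(seq)]
    total_num = len(seq)
    if total_num == 0:
        return -float('inf')
    pos_num = int((seq > 0.0).sum())
    neg_num = total_num - pos_num
    if neg_num == 0:
        return float('inf')
    return pos_num / neg_num


def max_drawdown(ser):
    # Assumed policy: drop NaN entries before computing the running maximum.
    ser = pd.Series(ser, dtype=float).dropna()
    running_max = ser.expanding().max()
    cur_dd = ser - running_max
    # For an empty series cur_dd.min() is NaN and min() falls back to 0.0.
    return min(0.0, cur_dd.min())


# Usage with made-up values (not taken from the CSV files):
rate = hit_rate([1.0, np.nan, -2.0, 0.0, 3.0])
drawdown = max_drawdown(pd.Series([1.0, np.nan, 3.0, 2.0, np.nan, 0.5]))
```

If NaN should count as zero instead, replacing the filtering lines with np.nan_to_num(seq) and ser.fillna(0.0) would be the equivalent change, though for a cumulative series a zero can register as a drawdown that never happened.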

Are you using pandas, the csv module or anything else to read the csv files? — Serge Ballesta, Jul 31 '18 at 09:42
@SergeBallesta - But no problem: if the OP needs something else, then the question should be reopened. — jezrael, Jul 31 '18 at 09:46
