We prepared the following Python script (Python 2.7) to make histograms.
histogram.py
#!/usr/bin/env python
import sys
import numpy as np
import matplotlib as mpl
import matplotlib.mlab as mlab
mpl.use('Agg')
import matplotlib.pyplot as plt
sys.argv[1] # Define input name
sys.argv[2] # Define output name
sys.argv[3] # Define title
# Open the file named by "input_file"
input_file=sys.argv[1]
inp = open (input_file,"r")
lines = inp.readlines()
if len(lines) >= 20:
    x = []
    #numpoints = []
    for line in lines:
        # if int(line) > -10000: # Activate this line if you would like to filter any data (filter out values smaller than -10000 here)
        x.append(float(line))
    # the histogram of the data
    n, bins, patches = plt.hist(x, 50, normed=False, facecolor='gray')
    plt.xlabel('Differences')
    numpoints = len(lines)
    plt.ylabel('Frequency ( n =' + str(numpoints) + ' ) ' )
    title=sys.argv[3]
    plt.title(title)
    plt.grid(True)
    save_file=sys.argv[2]
    plt.savefig(save_file+".png")
    plt.clf()
inp.close()
Example input file (named "input"):
1
2
3
The script is run as follows:
python histogram.py input ${output_file_name}.png ${title_name}
We added the line "if len(lines) >= 20:" so that no plot is made when there are fewer than 20 data points.
However, if the file is empty, the Python script freezes.
We added a bash line to remove any empty files before running "python histogram.py input ${output_file_name}.png ${title_name}":
find . -size 0 -delete
For some reason, the "find" command always works in small-scale tests but not in real production runs, where it is called inside several loops. So we would like to make "histogram.py" itself ignore empty files, if possible.
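To make the goal concrete, here is a minimal sketch of the kind of check we have in mind. It is untested in our production setup. The helper name load_values and its min_points parameter are hypothetical and not part of histogram.py. It uses os.path.getsize to detect a zero-byte file before any parsing, and it also treats a file containing only blank lines as empty.

```python
import os


def load_values(path, min_points=20):
    """Return the floats found in `path`, or None if the file should be skipped.

    Hypothetical helper: a file is skipped when it is missing, empty, or
    holds fewer than `min_points` values (20 mirrors the threshold used
    in histogram.py).
    """
    # os.path.getsize returns the size in bytes; 0 means an empty file.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return None
    with open(path, "r") as inp:
        # Skip blank lines so a file holding only newlines is also treated as empty.
        values = [float(line) for line in inp if line.strip()]
    if len(values) < min_points:
        return None
    return values
```

In histogram.py this could replace the readlines() call: x = load_values(sys.argv[1]), followed by sys.exit(0) when x is None, so that no plotting code runs for empty or too-short files.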
Searching only turned up this link, which doesn't seem very helpful :(
Ignoring empty files from coverage report
Could anyone kindly offer some comments? Thanks!