
My program can take up to four hours to work through all of the satellite data I am trying to read. I have confirmed that the code works correctly when I read in a full year of data and save only a few areas' worth of it. However, when I run my function on multiple years of data, the IPython kernel is eventually killed towards the end of the loop with the message "The kernel has died. Please restart the kernel". Below is the function the program keeps dying in (MODIS), preceded by the two smaller functions it calls:

from mpl_toolkits.basemap import Basemap
import numpy as np
from pyhdf.SD import SD, SDC

def int_to_bin(data):
    # Rewrite each QC value as the integer spelled by its binary digits, e.g. 40 -> 101000
    int_bin = []
    x = 0
    stop = len(data)
    while (x!=stop):
        binn = bin(data[x])[2:]
        integer = int(binn)
        int_bin.append(integer)
        x = x + 1

    # Zero out the bit patterns treated as acceptable; anything left > 0 is masked later
    flags = np.array(int_bin)
    stop = len(flags)
    x = 0
    while (x!=stop):
        if (flags[x]==100000 or flags[x]==101000 or flags[x]==110000 or flags[x]==1000001 or flags[x]==1001001 or flags[x]==1010001):
            flags[x] = 0
        x = x + 1

    return flags

def map():
    # Global cylindrical-projection basemap used to transform the lon/lat grids
    m = Basemap(projection='cyl',llcrnrlat=-90,urcrnrlat=90,\
            llcrnrlon=-180,urcrnrlon=180,resolution='c')

    m.drawmapboundary(fill_color='white', zorder=-1)
    m.fillcontinents(color='0.8', lake_color='white', zorder=0)

    m.drawcoastlines(color='0.6', linewidth=0.5)
    m.drawcountries(color='0.6', linewidth=0.5)
    m.drawparallels(np.arange(-90.,91.,30.), labels=[1,0,0,1], dashes=[1,1], linewidth=0.25, color='0.5')
    m.drawmeridians(np.arange(-180.,181., 60.), labels=[1,0,0,1], dashes=[1,1], linewidth=0.25, color='0.5')

    return m



def MODIS(final,filenameyear,fileyear):
    m = map()
    harddrive = '/Volumes/Jasons_EXT/'
    files = harddrive + filenameyear
    file = open(files,mode='r')

    try:

        filenames = file.readlines()
        file.close()
    except OSError:
        filenames = file.readlines()
        file.close()

    final_lon = []
    final_lat = []
    final_ET = []

    loop = 0

    while(loop != final):
        if ( 30<= loop <= 60 or 316<= loop <= 346 or 602<= loop <= 632 or 888<= loop <= 918 or 1174<= loop <= 1204 or 1460<= loop <= 1490 or 1746<= loop <= 1776 or 2032<= loop <= 2062 or 2318<= loop <= 2348 or 2604<= loop <= 2634 or 2890<= loop <= 2920 or 3176<= loop <= 3206 or 3462<= loop <= 3492 or 3748<= loop <= 3778 or 4034<= loop <= 4064 or 4320<= loop <= 4350 or 4606<= loop <= 4636 or 4892<= loop <= 4922 or 5178<= loop <= 5208 or 5464<= loop <= 5494 or 5750<= loop <= 5780 or 6036<= loop <= 6066 or 6322<= loop <= 6352):
            filename = filenames[loop].rstrip('\n')
            final_file = harddrive + fileyear + filename
            hdf = SD(final_file, SDC.READ)
            attribute = hdf.attr(3)
            bounds = attribute.get()
            k = bounds.find('NORTHBOUNDINGCOORDINATE')
            try:   
                start = k + 83
                end = k + 100
### bounds.find('NORTHBOUNDINGCOORDINATE')#### add start,end with 83,100
                nstring = bounds[start:end]
                north = float(nstring)
            except ValueError:
                start = k + 83
                end = k + 87
                nstring = bounds[start:end]
                north = float(nstring)       
### bounds.find('SOUTHBOUNDINGCOORDINATE')#### add start,end with 83,100
            k = bounds.find('SOUTHBOUNDINGCOORDINATE')
            start = k + 83
            end = k + 100
            sstring = bounds[start:end]
            south = float(sstring)
### bounds.find('EASTBOUNDINGCOORDINATE')#### add start,end with 83,100 
            k = bounds.find('EASTBOUNDINGCOORDINATE')
            start = k + 83
            end = k + 100
            estring = bounds[start:end]
            east = float(estring)
### bounds.find('WESTBOUNDINGCOORDINATE')#### add start,end with 83,99
            k = bounds.find('WESTBOUNDINGCOORDINATE')

            try:
                start = k + 83
                end = k + 99
                wstring = bounds[start:end]
                west = float(wstring)
            except ValueError:
                start = k + 83
                end = k + 102
                wstring = bounds[start:end]
                west = float(wstring)



            i = 0
            stop = 1200
            longitude = []


            while(i != stop):
                lon = west + (i*(east-west)/stop)
                longitude.append(lon)
                i = i + 1

            j = 0
            stop = 1200
            latitude = []


            while(j != stop):
                lat = south + (j*(north-south)/stop)
                latitude.append(lat)
                j = j + 1

            lat = np.array(latitude)
            lon = np.array(longitude)

            lon,lat = m(*np.meshgrid(lon,lat))

            lon = lon.ravel()
            lat = lat.ravel()

            final_lat.append(lat)
            final_lon.append(lon)




            DATAFIELD_NAME= 'ET_1km'
            data = hdf.select(DATAFIELD_NAME)    ### units are kg/(m^2*8days)
            ET = data[:,:]
            ET_sf = data.scale_factor    
            ET = ET * ET_sf

            ET = ET.ravel()


            DATAFIELD_NAME= 'ET_QC_1km'
            data = hdf.select(DATAFIELD_NAME)    ### units are kg/(m^2*8days)
            ET_QC = data[:,:]

            ET_QC = ET_QC.ravel()
            QC_flags = int_to_bin(ET_QC)

            ET[np.where(QC_flags > 0)] = np.nan


            final_ET.append(ET)




        loop = loop + 1

    return final_lon, final_lat, final_ET
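
The two `while` loops in `int_to_bin` visit every pixel of a tile in pure Python, which likely contributes to the long runtime and creates a large temporary list per file. Below is a minimal sketch of a vectorized mask that should give the same result, assuming `ET_QC` holds non-negative integers and NumPy 1.13 or later is installed (for `np.isin`). The names `ACCEPTED_QC` and `mask_bad_pixels` are hypothetical, and the accepted values are simply the decimal forms of the bit patterns tested in `int_to_bin`, plus 0.

```python
import numpy as np

# Decimal values of the binary patterns that int_to_bin resets to 0
# (100000, 101000, 110000, 1000001, 1001001, 1010001), plus 0 itself,
# which int_to_bin also leaves unmasked.
ACCEPTED_QC = np.array([0, 32, 40, 48, 65, 73, 81], dtype=np.uint8)

def mask_bad_pixels(et, qc):
    """Return a float copy of et with NaN wherever qc is not an accepted value."""
    et = np.asarray(et, dtype=np.float64).copy()
    bad = ~np.isin(qc, ACCEPTED_QC)
    et[bad] = np.nan
    return et

# Small demonstration with made-up values (not real MODIS data).
masked = mask_bad_pixels([1.0, 2.0, 3.0, 4.0],
                         np.array([0, 32, 33, 81], dtype=np.uint8))
```

In `MODIS()` this would stand in for the `int_to_bin(ET_QC)` call and the `np.where` assignment that follows it.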

The amount of data I am running through is 15-20 GB per year (7 years in all). Is my program crashing because it cannot handle the amount of computation and RAM it requires? It would be great to know why the console crashes in the middle or at the end of my program's routines.
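
A rough way to check the RAM side of this is to estimate how much `final_lon`, `final_lat` and `final_ET` hold by the time `MODIS()` returns. The sketch below assumes every appended array is a 1200 x 1200 tile stored as float64 (8 bytes per value), and that each index window in the `if` condition covers 31 files; the constant and function names are hypothetical. Temporaries such as the `latitude` list or `int_bin` are not counted, so actual peak usage would be higher.

```python
TILE_SIDE = 1200        # grid size built for each file in MODIS()
ARRAYS_PER_FILE = 3     # lon, lat and ET are each appended once per file
BYTES_PER_VALUE = 8     # assumption: all three arrays are float64

def estimate_result_bytes(n_files):
    """Rough size in bytes of the three result lists after n_files tiles."""
    return n_files * ARRAYS_PER_FILE * TILE_SIDE * TILE_SIDE * BYTES_PER_VALUE

one_window = estimate_result_bytes(31)        # one window, e.g. indices 30..60
all_windows = estimate_result_bytes(23 * 31)  # all 23 windows in the if condition

print(f"one window:  {one_window / 1e9:.2f} GB")   # one window:  1.07 GB
print(f"all windows: {all_windows / 1e9:.2f} GB")  # all windows: 24.64 GB
```

Under these assumptions a single window already keeps about 1 GB alive, and every further window adds the same amount, so a multi-year call could exceed the available RAM well before the loop finishes.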

Jason
  • (*Spyder dev here*) Question: could you try to run your program in an IPython or Jupyter qtconsole and see what happens? You can start one by running `ipython qtconsole` or (if that doesn't work) `jupyter qtconsole`. The thing is, Spyder uses an extra layer of communication with IPython kernels, and that could be causing them to crash. – Carlos Cordoba Nov 25 '15 at 14:00
  • So here is what I found out with my code. I tried your suggestion and it still crashed in the Jupyter qtconsole and when I ran it through my bash terminal. However, when I just copied the code and repeated it line by line instead of calling my MODIS function, the program worked just fine. Is there a case in which Python crashes because the computations are so extensive and long within a function call? – Jason Nov 26 '15 at 15:34
  • I am very interested in my internal error because the function ran through one year of data when I called it the first time, but the second time I called it, the program crashes towards the end of reading all the data files and performing the computations. – Jason Nov 26 '15 at 15:35
  • I think you should optimize how you're reading your file. I mean, instead of using `filenames = file.readlines()`, you should use something like what's mentioned in [this answer](http://stackoverflow.com/a/2111801/438386) or in this [blog post](http://www.peterbe.com/plog/blogitem-040312-1) – Carlos Cordoba Nov 26 '15 at 21:26
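
The line-by-line reading suggested in the last comment could look like the sketch below. It assumes the list file is plain text with one filename per line; `iter_selected_filenames` and `wanted_indices` are hypothetical names, and the temporary file only stands in for the real list on the external drive.

```python
import os
import tempfile

def iter_selected_filenames(list_path, wanted_indices):
    """Yield only the filenames whose line index is in wanted_indices."""
    wanted = set(wanted_indices)
    with open(list_path, mode='r') as handle:
        # Iterating over the handle reads one line at a time
        # instead of loading the whole file with readlines().
        for index, line in enumerate(handle):
            if index in wanted:
                yield line.rstrip('\n')

# Demonstration with a throwaway list file (made-up filenames).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('a.hdf\nb.hdf\nc.hdf\nd.hdf\n')
selected = list(iter_selected_filenames(tmp.name, range(1, 3)))
os.remove(tmp.name)
```

For the question's code, `wanted_indices` could be built from the same windows as the long `if` condition, for example `range(30, 61)`.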
