
Problem: I have a list of files in a directory, and I want to retrieve the first file that matches all three substring criteria. I based my case 1 solution on the example from "find first sequence item that matches a criterion", which finds the first item that matches a criterion.

Issue: However, when I extend the number of if checks in the generator expression to three, Python aborts with: python: Python/compile.c:3437: stackdepth_walk: Assertion `depth >= 0' failed. Abort (core dumped)

Question: This seems peculiar to me. I am only testing a few conditions, which doesn't seem like something that should trigger a stack assertion. Can someone explain why this happens?

Case 1 below reproduces the error.

Case 2 shows that this method still works with two if checks.

Case 3 shows that this method works if I separate the list comprehension from the next call.

Case 4 is an alternate solution that performs the same check as a regex, and that works.

#! /usr/bin/python
import re

files_in_dir = ['dp2_syouo_2013-05-16_0000.csv', 
                'dp1_torishima_2013-05-21_0000.csv', 
                'dp2_torishima_2013-05-22_0000.csv', 
                'dp1_hirokawa_2013-05-21_0000.csv', 
                'dp2_hirokawa_2013-05-22_0000.csv', 
                'dp2_syouo_2013-05-17_0000.csv', 
                'dp2_syouo_2013-05-18_0000.csv']

dp_string = "dp2"
date_string = "2013-05-22"
location_string = "torishima"

# case 1: Three filter checks, stackdepth_walk: Assertion
#python: Python/compile.c:3437: stackdepth_walk: Assertion `depth >= 0' failed.
# Abort (core dumped)
file_matched_1 = next( (file_in_dir for file_in_dir 
                        in files_in_dir 
                        if dp_string in file_in_dir                                             
                        if location_string in file_in_dir
                        if date_string in file_in_dir), None)
print "case 1: " + file_matched_1


# case 2: Two filter checks, works fine
file_matched_2 = next( (file_in_dir for file_in_dir 
                        in files_in_dir 
                        if dp_string in file_in_dir                                             
                        if location_string in file_in_dir
                        ), None)
print "case 2: " + file_matched_2

# case 3: Generate the list first with three filters, then get the first item
files_matched_3 = [file_in_dir for file_in_dir 
                        in files_in_dir 
                        if dp_string in file_in_dir                                             
                        if location_string in file_in_dir
                        if date_string in file_in_dir]
file_matched_3 = next(iter(files_matched_3))
print "case 3: " + file_matched_3

# case 4: Put the three checks into a regex
# '.*' allows any run of characters between the three substrings
date_location_regex = dp_string + '.*' + location_string + '.*' + date_string
file_matched_4 = next( (file_in_dir for file_in_dir 
                        in files_in_dir 
                        if re.search(date_location_regex, file_in_dir)), None)
print "case 4: " + file_matched_4
frank
  • What is your Python's Version and OS? I just ran your snippet with Py 2.7 on Win 7 and it works fine. – Abhijit Sep 26 '13 at 07:30
  • @Abhijit I am running python 2.6.6, an answer below has pointed out the known bug on this version. – frank Sep 27 '13 at 00:42

1 Answer


You are using too many if clauses. It has never happened to me, but after googling I found some posts about it saying that you need to reduce the number of if clauses you use (and your two other tests corroborate this).

Frankly, I can't understand why you're doing it this way. The better way is to assemble the filename from those three substrings.

dp_string = "dp2"
date_string = "2013-05-22"
location_string = "torishima"
file_string = '{0}_{1}_{2}_0000.csv'.format(dp_string, location_string, date_string)

file_matched_1 = next( (file_in_dir for file_in_dir 
                        in files_in_dir 
                        if file_string in file_in_dir
                       ), None)
print "case 1: " + file_matched_1

EDIT

Looks like a known bug in Python 2.6.6 on CentOS? (Link)
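
If the match has to stay flexible about the order and surrounding text of the substrings, one possible workaround is to fold the three checks into a single if clause with all(), so the generator expression no longer chains several if clauses. The following is only a minimal sketch: the helper name first_match and the tuple required are hypothetical, not part of the original code, and it is an untested assumption that this form avoids the assertion on the affected 2.6.6 builds. It is written with print() so that it also runs on Python 3.

```python
files_in_dir = ['dp2_syouo_2013-05-16_0000.csv',
                'dp1_torishima_2013-05-21_0000.csv',
                'dp2_torishima_2013-05-22_0000.csv',
                'dp1_hirokawa_2013-05-21_0000.csv',
                'dp2_hirokawa_2013-05-22_0000.csv',
                'dp2_syouo_2013-05-17_0000.csv',
                'dp2_syouo_2013-05-18_0000.csv']

# Hypothetical name: the substrings every matching file name must contain
required = ("dp2", "torishima", "2013-05-22")


def first_match(names, substrings):
    """Return the first name containing every substring, or None."""
    # A single if clause: all() checks each substring in turn and
    # stops at the first one that is missing.
    return next(
        (name for name in names
         if all(s in name for s in substrings)),
        None)


file_matched = first_match(files_in_dir, required)
# %s formatting also copes with None, unlike string concatenation
print("match: %s" % file_matched)
# prints: match: dp2_torishima_2013-05-22_0000.csv
```

Because the substrings are passed as a sequence, adding a fourth criterion only means extending the tuple, not adding another if clause.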

Ofir Israel
  • @OfirIsarael I agree that a stricter match would be easier, but I wanted to try out a flexible match in case the surrounding string content and order changes. Thanks for pointing out the known bug. – frank Sep 27 '13 at 00:43