0

I am trying to loop over a file and check whether an entry exists. I need to search across a date range and am not sure whether to loop over the lines in the file first and then over each date, or to loop over the dates and then search each line.

I have tried both options, but the code below seems more 'logical'. My question is: how does one reason this out like a programmer? And why does the code below not try every single_date, but instead iterate only once through the lines in the file?

with open(r'reportLog.txt','r') as logFile:
    for single_date in daterange(start_date, end_date):
        for line in logFile:
            if all(var in line for var in (reportName, str(single_date), 'R')):
                print('found')
                break
            else:
                print('not found')

reportLog.txt:

Digital_Incomplete_Leads,2019-05-10,12:15:29,12:15:29,Y
Digital_Incomplete_Leads,2019-05-09,12:15:43,12:15:43,Y
Account Movement Report,2019-05-06,13:54:07,13:54:12,Y
Account Movement Report,2019-05-07,13:54:07,13:54:12,Y
Account Movement Report,2019-05-08,13:54:07,13:54:12,Y
Account Movement Report,2019-05-09,13:53:38,13:53:38,R
Account Movement Report,2019-05-09,13:54:07,13:54:12,Y

I want the code to loop over the text file and exit when it finds the following line:

Account Movement Report,2019-05-09,13:53:38,13:53:38,R
saranya
JMac
    [Python : The second for loop is not running](https://stackoverflow.com/q/36572993/953482) answers the "why does the code [...] only iterate once through all the lines in the file" half of this question. TLDR: files can only be iterated over once unless you manually rewind them. (this may also partially answer the other half of your question -- a good reason to make your `for thing in file` loop the outermost one is so you don't have to rewind it later) – Kevin May 09 '19 at 13:26
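To illustrate the point in the comment above, here is a minimal sketch of a file-like object being exhausted after one pass and then rewound. It uses `io.StringIO` as a stand-in for an open text file (an assumption made so the example runs without a file on disk); the sample lines are made up.

```python
import io

# io.StringIO stands in for an open text file in this sketch.
log_file = io.StringIO("line 1\nline 2\nline 3\n")

# The first pass consumes every line.
first_pass = [line.strip() for line in log_file]

# The second pass finds nothing: the read position is already at the end.
second_pass = [line.strip() for line in log_file]

# Rewinding moves the read position back to the start.
log_file.seek(0)
third_pass = [line.strip() for line in log_file]

print(first_pass)   # ['line 1', 'line 2', 'line 3']
print(second_pass)  # []
print(third_pass)   # ['line 1', 'line 2', 'line 3']
```

In the question's code, the inner `for line in logFile:` reaches the end of the file during the first date (unless a match is found), so for every later date the inner loop has nothing left to read.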

3 Answers

0

For speed, the bottleneck will almost certainly be reading from the file, so that should be minimised.

You can minimise line reads by making 'for line in logFile:' the outer loop.
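A minimal sketch of that approach, under some assumptions: the dates are collected into a set up front so the file is read only once, the range is treated as inclusive of the end date, and each line has the five comma-separated fields shown in the question. The function name `find_entry` and its parameters are hypothetical, not part of the original code.

```python
from datetime import date, timedelta


def find_entry(lines, report_name, start, end, status="R"):
    """Return the first line with the given report name, a date in [start, end], and status."""
    # Build the set of wanted dates once; membership tests are then cheap.
    wanted_dates = {
        (start + timedelta(days=n)).isoformat()
        for n in range((end - start).days + 1)
    }
    for line in lines:  # single pass over the file (or any iterable of lines)
        fields = line.strip().split(",")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        if fields[0] == report_name and fields[1] in wanted_dates and fields[-1] == status:
            return line.strip()
    return None


sample = [
    "Account Movement Report,2019-05-08,13:54:07,13:54:12,Y\n",
    "Account Movement Report,2019-05-09,13:53:38,13:53:38,R\n",
    "Account Movement Report,2019-05-09,13:54:07,13:54:12,Y\n",
]
match = find_entry(sample, "Account Movement Report", date(2019, 5, 6), date(2019, 5, 9))
print(match)  # Account Movement Report,2019-05-09,13:53:38,13:53:38,R
```

With a real file, `lines` would be the open file object, for example `find_entry(logFile, ...)` inside the `with open(...)` block.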

ziggy jones
0

About which loop to use: the truth is that there is usually more than one way to solve a problem, and almost every programming problem that needs a loop can be solved with more than one type of loop. Where efficiency is a concern, each type has pros and cons. This article will help you decide: https://www.harrisgeospatial.com/Learn/Blogs/Blog-Details/ArtMID/10198/ArticleID/15332/What-Type-of-Loop-Should-I-Use

0

File:

Digital_Incomplete_Leads,2019-05-10,12:15:29,12:15:29,Y
Digital_Incomplete_Leads,2019-05-09,12:15:43,12:15:43,Y
Account Movement Report,2019-05-06,13:54:07,13:54:12,Y
Account Movement Report,2019-05-07,13:54:07,13:54:12,Y
Account Movement Report,2019-05-08,13:54:07,13:54:12,Y
Account Movement Report,2019-05-09,13:53:38,13:53:38,R
Account Movement Report,2019-05-09,13:54:07,13:54:12,Y

Code:

import csv
from datetime import datetime    

start_date = datetime.strptime("2019-05-06", "%Y-%m-%d")
end_date = datetime.strptime("2019-05-08", "%Y-%m-%d")
with open('main_data.csv') as f:
    csv_reader = csv.reader(f)
    for idx, line in enumerate(csv_reader):
        try:
            d = datetime.strptime(line[1], "%Y-%m-%d")
            if start_date <= d <= end_date:
                print(f"Found \"{line[1]}\" in {idx} row.")
                break
        except ValueError:
            print(f"Second column in {idx} row contains no date.")
        except IndexError:
            print(f"There's no second column in {idx} row.")
        except Exception:  # avoid a bare except, which would also catch KeyboardInterrupt
            print(f"Something unexpected in {idx} row.")
    else:
        print("Nothing has been found.")

Output:

Found "2019-05-06" in 2 row.

If you want to use daterange(), which is not defined in the question (so I don't know what it returns):

import csv


def daterange(start, end):
    # unknown method #
    pass


d_range = daterange("2019-05-06", "2019-05-08")
with open('main_data.csv') as f:
    csv_reader = csv.reader(f)
    for idx, line in enumerate(csv_reader):
        try:
            if line[1] in d_range:
                print(f"Found \"{line[1]}\" in {idx} row.")
                break
        except ValueError:
            print(f"Second column in {idx} row contains no date.")
        except IndexError:
            print(f"There's no second column in {idx} row.")
        except Exception:  # avoid a bare except, which would also catch KeyboardInterrupt
            print(f"Something unexpected in {idx} row.")
    else:
        print("Nothing has been found.")
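
The daterange() above is only a placeholder, so `line[1] in d_range` cannot match anything until it returns a real collection. Here is a minimal sketch of one possible stand-in, assuming it takes two ISO-formatted date strings and should return every date between them (inclusive) as strings. The question's own daterange() may well behave differently, for example by yielding date objects or excluding the end date.

```python
from datetime import date, timedelta


def daterange(start, end):
    """Hypothetical stand-in: ISO date strings from start to end, inclusive."""
    first = date.fromisoformat(start)  # requires Python 3.7+
    last = date.fromisoformat(end)
    # A set makes the `line[1] in d_range` test a constant-time lookup.
    return {
        (first + timedelta(days=n)).isoformat()
        for n in range((last - first).days + 1)
    }


d_range = daterange("2019-05-06", "2019-05-08")
print(sorted(d_range))  # ['2019-05-06', '2019-05-07', '2019-05-08']
```

Because the result is a set of strings, it can be compared directly against the second CSV column without parsing each row's date.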
Olvin Roght