I tried to write my own version of the string.find() method in Python for a computer science class I'm taking.
Basically, the program opens a text file, asks the user for the text they want to search for, and outputs the number of the line on which the string resides, or outputs 'not found' if the string doesn't exist in the file.
However, it takes about 34 seconds to get through 250,000 lines of XML.
Where is the bottleneck in my code? I wrote the same program in C# and C++ as well, and those versions run in about 0.3 seconds for 4.5 million lines. I also performed the same search using Python's built-in string.find(), and that takes around 4 seconds for 250,000 lines of XML. So I'm trying to understand why my version is so slow.
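For comparison, a line-by-line search built on str.find() could be sketched roughly as follows. This is only an illustration: the function name find_line_builtin and the in-memory list of lines are assumptions made for the example, not the code that produced the timings above.

```python
def find_line_builtin(lines, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    for line_number, line in enumerate(lines, start=1):
        # str.find returns -1 when the substring is not present in the line.
        if line.find(needle) != -1:
            return line_number
    return None


# Hypothetical sample data standing in for the XML file.
sample = ["<root>", "  <item>alpha</item>", "  <item>beta</item>", "</root>"]
print(find_line_builtin(sample, "beta"))
print(find_line_builtin(sample, "missing"))
```

An open file object can be passed in place of the list, since iterating over a file also yields one line at a time.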
https://github.com/zach323/Python/blob/master/XML_Finder.py
import time

fhand = open('C:\\Users\\User\\filename')
str = input('Enter string you would like to locate: ') #string to be located in file
start = time.time()
delta_time = 0

def find(str):
    time.sleep(0.01)
    found_str = '' #initialize placeholder for found string
    next_index = 0 #index for comparison checking
    line_count = 1
    for line in fhand: #each line in file
        line_count = line_count + 1
        for letter in line: #each letter in line
            if letter == str[next_index]: #compare current letter to the next expected character of the string you want to find
                found_str += letter #if a match, concatenate to string placeholder
                #print(found_str) #print for visualization of inline search per iteration
                next_index = next_index + 1
                if found_str == str: #if complete match is found, break out of loop.
                    print('Result is: ', found_str, ' on line %s ' % (line_count))
                    print(line)
                    return found_str #return string to function caller
                    break
            else:
                #if a match was found but the next_index match was False, reset the indexes and try again.
                next_index = 0 #reset index back to zero
                found_str = '' #reset string back to empty
        if found_str == str:
            print(line)

if str != "":
    result = find(str)
    delta_time = time.time() - start
    print(result)
    print('Seconds elapsed: ', delta_time)
else:
    print('sorry, empty string')
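One thing worth noting about the character-by-character approach above: when a partial match fails, the reset throws away the current letter, so a search for 'aab' in 'aaab' comes back empty. A possible alternative is to try every starting position in the line. The sketch below is only one way to do that; the names find_in_line and find_line_manual are hypothetical, and it is not tuned for speed.

```python
def find_in_line(line, needle):
    """Return the index of the first occurrence of needle in line, or -1."""
    n, m = len(line), len(needle)
    for start in range(n - m + 1):
        offset = 0
        # Advance while the characters at this starting position keep matching.
        while offset < m and line[start + offset] == needle[offset]:
            offset += 1
        if offset == m:
            return start  # every character of needle matched
    return -1


def find_line_manual(lines, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    for line_number, line in enumerate(lines, start=1):
        if find_in_line(line, needle) != -1:
            return line_number
    return None


print(find_in_line("aaab", "aab"))
print(find_line_manual(["<a>", "<b>text</b>"], "text"))
```

Because each line is searched separately, a match can no longer span two lines, which the running found_str in the original version allows.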