I'm writing a program that has to download a bunch of files from the web before it can even run, so I created a function called init_program that downloads all the files and "initializes" the program. It works by running through a couple of dicts that hold URLs to gist files on GitHub. It pulls the URLs and uses urllib2 to download them. I can't add all the files here, but you can try it out by cloning the repo here. Here's the function that creates the files from the gists:
import sys
import urllib2


def init_program():
    """ Initialize the program and allow all the files to be downloaded
    This will take awhile to process, but I'm working on the processing
    speed """
    downloaded_wordlists = []  # Used to count the amount of items downloaded
    downloaded_rainbow_tables = []
    print("\n")
    banner("Initializing program and downloading files, this may take awhile..")
    print("\n")
    # INIT_FILE is a file that will contain "false" if the program is not initialized
    # And "true" if the program is initialized
    with open(INIT_FILE) as data:
        # strip() so a trailing newline in the file does not break the comparison
        needs_init = data.read().strip() == "false"
    if needs_init:
        for item in GIST_DICT_LINKS.keys():
            sys.stdout.write("\rDownloading {} out of {} wordlists.. ".format(len(downloaded_wordlists) + 1,
                                                                              len(GIST_DICT_LINKS.keys())))
            sys.stdout.flush()
            # Download the wordlists and save them into a file
            wordlist_data = urllib2.urlopen(GIST_DICT_LINKS[item])
            # "wb" creates the file, or overwrites a partial file from an earlier run
            with open("dicts/included_dicts/wordlists/{}.txt".format(item), "wb") as new_wordlist:
                new_wordlist.write(wordlist_data.read())
            wordlist_data.close()
            downloaded_wordlists.append(item + ".txt")
        print("\n")
        banner("Done with wordlists, moving to rainbow tables..")
        print("\n")
        for table in GIST_RAINBOW_LINKS.keys():
            sys.stdout.write("\rDownloading {} out of {} rainbow tables".format(len(downloaded_rainbow_tables) + 1,
                                                                                len(GIST_RAINBOW_LINKS.keys())))
            sys.stdout.flush()
            # Download the rainbow tables and save them into a file
            rainbow_data = urllib2.urlopen(GIST_RAINBOW_LINKS[table])
            # The file has to be opened for writing; open() defaults to read-only
            with open("dicts/included_dicts/rainbow_tables/{}.rtc".format(table), "wb") as new_rainbowtable:
                new_rainbowtable.write(rainbow_data.read())
            rainbow_data.close()
            downloaded_rainbow_tables.append(table + ".rtc")
        # Write "true" to INIT_FILE (the path, not the file object) so the
        # program will never be initialized again
        with open(INIT_FILE, "w") as init_file:
            init_file.write("true")
    return downloaded_wordlists, downloaded_rainbow_tables
This works, but it's extremely slow because of the size of the files: each file has at least 100,000 lines in it. How can I speed up this function and make it more user friendly?
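
For reference, here is a minimal sketch of one direction that might help, assuming the time is spent mostly waiting on the network rather than on disk or CPU. It streams each response to disk in chunks instead of reading the whole file into memory, and fetches several URLs at once with a small thread pool. The names `save_stream`, `download_all`, `opener` and `workers` are hypothetical and not part of the program above, and the worker count of 4 is an arbitrary guess rather than a measured value.

```python
import os
import shutil
from multiprocessing.pool import ThreadPool

try:
    from urllib2 import urlopen  # Python 2, as used in init_program
except ImportError:
    from urllib.request import urlopen  # Python 3 equivalent


def save_stream(response, path, chunk_size=64 * 1024):
    """Copy a file-like response to path in fixed-size chunks."""
    with open(path, "wb") as out_file:
        shutil.copyfileobj(response, out_file, chunk_size)
    return path


def download_all(links, directory, extension, opener=urlopen, workers=4):
    """Download every {name: url} entry in links using a thread pool.

    Returns the file names that were written, in completion order.
    """
    def fetch(entry):
        name, url = entry
        filename = name + extension
        response = opener(url)
        try:
            save_stream(response, os.path.join(directory, filename))
        finally:
            response.close()
        return filename

    pool = ThreadPool(workers)
    try:
        # imap_unordered yields each result as soon as its download finishes
        return list(pool.imap_unordered(fetch, list(links.items())))
    finally:
        pool.close()
        pool.join()
```

Under these assumptions, the two loops in init_program could be replaced by calls such as `download_all(GIST_DICT_LINKS, "dicts/included_dicts/wordlists", ".txt")`. Because results arrive as each download completes, the progress counter could be updated by iterating over `imap_unordered` directly instead of wrapping it in `list`.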