
I have a Python script below that attempts to connect to a SQL database. The database was having some connection issues, so I added a try / except to make the script reconnect. However, the script still does not reconnect automatically: the connection gets killed after 149 queries and, from what I can see on screen, the script never tries to reconnect. If anyone can help me get my script to reconnect to the db automatically I would greatly appreciate it!

from __future__ import print_function

try:
    import psycopg2
except ImportError:
    raise ImportError('\n\033[33mpsycopg2 library missing. pip install psycopg2\033[1;m\n')

import re
import sys
import json
import pprint
import time

outfilepath = "crtsh_output/crtsh_flat_file"

DB_HOST = 'crt.sh'
DB_NAME = 'certwatch'
DB_USER = 'guest'

DELAY = 5

#conn = psycopg2.connect("dbname={0} user={1} host={2}".format(DB_NAME, DB_USER, DB_HOST))
#cursor = conn.cursor()

def connect_to_db(tries=0):
    filepath = 'forager.txt'
#    conn = psycopg2.connect("dbname={0} user={1} host={2}".format(DB_NAME, DB_USER, DB_HOST))
#    cursor = conn.cursor()
    with open(filepath) as fp:
        unique_domains = ''
        try:
            conn = psycopg2.connect("dbname={0} user={1} host={2}".format(DB_NAME, DB_USER, DB_HOST))
            cursor = conn.cursor()
            for cnt, domain_name in enumerate(fp):
                print("Line {}: {}".format(cnt, domain_name))
                print(domain_name)
                domain_name = domain_name.rstrip()

                cursor.execute('''SELECT c.id, x509_commonName(c.certificate), x509_issuerName(c.certificate), x509_notBefore(c.certificate), x509_notAfter(c.certificate), x509_issuerName(c.certificate)
                FROM certificate c, certificate_identity ci WHERE
                c.id= ci.certificate_id AND ci.name_type = 'dNSName' AND lower(ci.name_value) =
                lower(%s) AND x509_notAfter(c.certificate) > statement_timestamp()''', (domain_name,))


                unique_domains = cursor.fetchall()

                pprint.pprint(unique_domains)

                outfilepath = "crtsh1" + ".json"
                with open(outfilepath, 'a') as outfile:
                    outfile.write(json.dumps(unique_domains, sort_keys=True, indent=4, default=str, ensure_ascii=False))
                time.sleep(DELAY)

        except Exception as error:
        #    print("\n\033[1;31m[!] Unable to connect to the database\n\033[1;m")
            if tries < 3:
                time.sleep(1) # give the DB a bit to recover if you want
                connect_to_db(tries+1)
            else:
                raise error

if __name__ == "__main__":
    connect_to_db()
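For reference, one way to keep the reconnect logic separate from the query loop is a small retry helper. This is a hedged sketch, not part of the original script: `with_retries` is a hypothetical helper name, and the psycopg2 usage shown in the comment is an assumption about how it would be wired in.

```python
import time

def with_retries(make_conn, tries=3, delay=1, exceptions=(Exception,)):
    """Call make_conn(), retrying up to `tries` times on the given exceptions."""
    for attempt in range(1, tries + 1):
        try:
            return make_conn()
        except exceptions:
            if attempt == tries:
                raise          # out of retries: surface the original error
            time.sleep(delay)  # give the DB a moment to recover

# Hypothetical usage with psycopg2 (OperationalError covers dropped connections):
# conn = with_retries(
#     lambda: psycopg2.connect("dbname={0} user={1} host={2}".format(DB_NAME, DB_USER, DB_HOST)),
#     exceptions=(psycopg2.OperationalError,),
# )
```

Retrying only the connect call (rather than re-entering the whole function recursively) avoids re-reading the input file from the top on every reconnect attempt.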
bedford
  • What exactly **do** you see on the screen? Seems to be doing pretty much what you want from what I see, but I had to make some minor changes due to the lack of an available PostgreSQL ... – shmee Aug 07 '18 at 12:55
  • @shmee Thank you for your response! I get the output on the screen I want and then after it completes query 149 I get a message that says Killed and the script dies. Do you know how I could implement it to do 100 queries and then take a short break, and then continue working? – bedford Aug 07 '18 at 13:04
  • You surely could close and reopen your cursor and connection during your iteration over the file (e.g. `if not cnt % 100 and cnt`), but I thought your question was concerned with your `try-except` construct not working as you expect it to (i.e. not reconnecting three times), which I think it does. And the connection being killed reproducibly after 149 queries seems a bit too consistent to be caused by an unstable connection tbh. So at least the error information/traceback from your screen might help troubleshooting ... – shmee Aug 07 '18 at 13:12
  • @shmee Here is my error message. Line 148: 1stand2ndmortgage.com 1stand2ndmortgage.com [] Line 149: 1xirsp15.com 1xirsp15.com Killed – bedford Aug 07 '18 at 13:22
  • "Killed"? Is this running on a Linux by chance? Because _that_ very much looks like the kernel killing the process, presumably due to resource starvation. – shmee Aug 07 '18 at 13:33
  • Yes, it is in Linux – bedford Aug 07 '18 at 13:36
  • @shmee That is why I want to reconnect to the db to continue the process each time. Is that possible? – bedford Aug 07 '18 at 13:37
  • In that case your `try-except` can't help, because it is not the DB connection that gets closed, your whole script gets killed. There's nothing you can do inside your script to recover from that. Does the query for line 149 return a particularly big result set? Can you check the memory consumption of your script over time (e.g. using `top` on another console)? Does the memory consumption steadily rise or does it jump significantly when executing the Line 149 query? Maybe adding `conn.rollback()` after your `time.sleep(DELAY)` already helps. – shmee Aug 07 '18 at 13:44
  • @shmee Line 149 returns no result so it's surprising that it kills the script. – bedford Aug 07 '18 at 13:49
  • @shmee After using top the script gets up to 75% memory utilization with the process running at Line 149 query. I actually went and queried the database myself and saw that the Line 149 item is going to return a lot of results. Is there any work around? – bedford Aug 07 '18 at 14:07
  • [Server side cursors](http://initd.org/psycopg/docs/usage.html#server-side-cursors) as outlined by brav in his answer to [this question](https://stackoverflow.com/questions/28343240/psycopg2-uses-up-memory-on-large-select-query) might help with that. – shmee Aug 07 '18 at 14:49
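
Following shmee's pointer, a server-side (named) cursor keeps the result set on the PostgreSQL server and lets the client fetch it in batches, which caps memory use instead of pulling everything into RAM with `fetchall()`. A minimal sketch; `stream_rows` and the cursor name `crtsh_stream` are assumed names for illustration, not from the original script:

```python
def stream_rows(cursor, batch_size=500):
    """Yield rows one at a time, fetching them in batches of `batch_size`
    so the whole result set never sits in client memory at once."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield row

# With psycopg2, pass a *named* cursor so rows stay on the server:
#   cur = conn.cursor(name='crtsh_stream')  # server-side cursor
#   cur.itersize = 500
#   cur.execute(query, (domain_name,))
#   for row in stream_rows(cur):
#       ...  # process one row at a time
```

Note that a named cursor must be re-created for each query, so it would be opened inside the loop over domain names rather than once up front.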
