How can I get the certificate expiry date and status into a DataFrame for multiple domains, e.g. google.com and nyu.edu?

Expected output:

name            expiry date                                   status
google.com      Wednesday, 12 August 2020 at 17:30:48         active
nyu.edu         expired/No Cert                               expired/No Cert

I tried this:

import ssl, socket
import pandas as pd

hostname = ['google.com', 'nyu.edu']
datas = []

for i in hostname:
    ctx = ssl.create_default_context()
    with ctx.wrap_socket(socket.socket(), server_hostname=i) as s:
        s.connect((i, 443))
        cert = s.getpeercert()
        # cert['subject'] is a tuple of RDNs; flatten it into a dict
        subject = dict(x[0] for x in cert['subject'])
        issued_to = subject['commonName']
        notAfter = cert['notAfter']

        # one dict per host, so each host becomes one DataFrame row
        datas.append({"name": issued_to, "expiry": notAfter})

df = pd.DataFrame(datas)

I tried this, but from the result I can't tell which certificates have expired and which have not.

1 Answer

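If you use getpeercert(), you won't be able to retrieve the certificate when verification fails: with the default context the handshake raises an error, and with verification disabled getpeercert() returns an empty dict (you can test with https://expired.badssl.com/, for example).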
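For hosts whose certificate does verify, the dict returned by getpeercert() holds notAfter as a string such as 'Aug 12 12:00:48 2020 GMT'. As a minimal sketch, the standard library's ssl.cert_time_to_seconds can turn that string into a datetime. The helper name parse_peercert_time and the sample string are only illustrative:

```python
import ssl
from datetime import datetime, timezone

def parse_peercert_time(cert_time):
    """Convert a getpeercert()-style date string to an aware UTC datetime."""
    seconds = ssl.cert_time_to_seconds(cert_time)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# sample value in the format getpeercert() uses (illustrative, not fetched live)
not_after = parse_peercert_time("Aug 12 12:00:48 2020 GMT")
print(not_after.isoformat())
```

This only helps for certificates that pass verification; expired ones never reach this point.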

Following this answer, an alternative is to call getpeercert(True) to get the binary (DER-encoded) form of the certificate and parse it with pyOpenSSL.

In the following example, I've added an expired column that flags a certificate when the current date falls outside its validity period (notBefore to notAfter):

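import ssl, socket
import pandas as pd
from datetime import datetime, timezone
import OpenSSL.crypto as crypto  # provided by the pyOpenSSL package

hostname = ['google.com','nyu.edu','expired.badssl.com']
datas = []
# certificate dates are in UTC, so compare against a naive UTC "now"
now = datetime.now(timezone.utc).replace(tzinfo=None)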

for i in hostname:
    ctx = ssl.create_default_context()
    # disable verification so expired or otherwise invalid certificates can still be read
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with ctx.wrap_socket(socket.socket(), server_hostname = i) as s:
        s.connect((i, 443))
        cert = s.getpeercert(True)
        x509 = crypto.load_certificate(crypto.FILETYPE_ASN1,cert)
        commonName = x509.get_subject().CN
        notAfter = datetime.strptime(x509.get_notAfter().decode('ascii'), '%Y%m%d%H%M%SZ')
        notBefore = datetime.strptime(x509.get_notBefore().decode('ascii'), '%Y%m%d%H%M%SZ')
        datas.append({
            "name": commonName, 
            "notAfter": notAfter,
            "notBefore": notBefore,
            "expired": (notAfter < now) or (notBefore > now)
        })

df = pd.DataFrame(datas) 
print(df)

Output :

           name            notAfter           notBefore  expired
0  *.google.com 2020-08-12 12:00:48 2020-05-20 12:00:48    False
1       nyu.edu 2021-01-06 23:59:59 2019-01-07 00:00:00    False
2  *.badssl.com 2015-04-12 23:59:59 2015-04-09 00:00:00     True

Note that I've used this post to convert the timestamp-formatted dates to datetime.
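The conversion itself can be sketched as follows. pyOpenSSL's get_notAfter() and get_notBefore() return the ASN.1 time as bytes such as b'20200812120048Z', where the trailing Z means UTC. The helper name parse_asn1_time is hypothetical, and the sketch assumes the Z-suffixed form rather than a numeric offset:

```python
from datetime import datetime, timezone

def parse_asn1_time(raw):
    """Parse bytes like b'20200812120048Z' into a timezone-aware UTC datetime."""
    naive = datetime.strptime(raw.decode("ascii"), "%Y%m%d%H%M%SZ")
    return naive.replace(tzinfo=timezone.utc)

# validity dates taken from the badssl.com row in the output above
not_before = parse_asn1_time(b"20150409000000Z")
not_after = parse_asn1_time(b"20150412235959Z")

# comparing aware datetimes avoids mixing local time with UTC
now = datetime.now(timezone.utc)
expired = not (not_before <= now <= not_after)
print(not_after, expired)
```

Keeping the values timezone-aware is a design choice here; the naive datetimes in the answer's code work as well, as long as "now" is also expressed in UTC.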

Also, if you want to test invalid certificates, https://badssl.com/ lists test domains that each trigger a specific verification error.
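To get a status column like the one in the question, one option is to attempt a normally verified handshake and record why it failed. This is only a sketch using the standard library: the function name verification_status, the "no cert (...)" label, and the 5 second timeout are assumptions, not part of the code above.

```python
import socket
import ssl

def verification_status(host, port=443, timeout=5.0):
    """Return 'active' if the TLS handshake verifies, else a short reason."""
    ctx = ssl.create_default_context()  # verification stays enabled here
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "active"
    except ssl.SSLCertVerificationError as exc:
        # OpenSSL's own description of why verification failed
        return exc.verify_message
    except OSError as exc:
        # DNS failure, refused connection, timeout, non-TLS port, ...
        return f"no cert ({type(exc).__name__})"

# possible use with the list from the example above (needs network access):
# rows = [{"name": h, "status": verification_status(h)} for h in hostname]
```

For a host such as expired.badssl.com this should return a message along the lines of "certificate has expired", while unreachable hosts fall into the "no cert" branch.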

Bertrand Martel