I have listed two pieces of code below: the first does not work and the second does. How can I make the first one work using my SSL certificate, since it does not work without verification? I have an SSL certificate that I use to access the web page. I exported it as a .pfx file (that is the only export option). How would I use the certificate credentials to access the main page (first code)? This would help my progress immensely. Thanks.

If I request the main page (where the XML links are embedded) with requests, I get an empty soup (soup.title is empty, and so are the results of the other soup functions). This is the code:

from bs4 import BeautifulSoup
import requests

url = 'https://www.oasis.oati.com/cgi-bin/webplus.dll?script=/woa/woa-planned-outages-report.html&Provider=MISO'
response = requests.get(url, verify=False)
soup = BeautifulSoup(response.content, "html.parser")

However, if I request a specific XML link directly (again with no verification), I am able to retrieve it using this code:

import requests
import xml.etree.ElementTree as ET
import pandas as pd

url = 'https://www.oasis.oati.com/woa/docs/MISO/MISODocs/CurrentData/2308_Planned_Outages_2017-09-19-18-50-00.xml'
response = requests.get(url, verify=False)
root = ET.fromstring(response.text)

all_records = []  # Record list that we will convert into a DataFrame
for child in root:  # Loop over each record element under the root
    record = {}  # Placeholder for the current record
    for subchild in child:  # Iterate over the record's fields, e.g. ID, String, Description
        record[subchild.tag] = subchild.text  # Store each field as a tag -> text pair
    if record:  # Skip elements without fields, then append once per record
        all_records.append(record)

df = pd.DataFrame(all_records).drop_duplicates().reset_index(drop=True)
Shyama Sonti

1 Answer

You should use the cert option in the requests.get call:

requests.get(url, cert=('/path/client.cert', '/path/client.key'))

Before that, you need to extract the certificate and the private key from the .pfx archive into separate files (read this for example).
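One way to do the extraction is with the openssl command-line tool. The following is only a sketch: it assumes openssl is installed and on your PATH, and the function names and file paths are made up for illustration. It builds the two openssl pkcs12 commands (one for the client certificate, one for the unencrypted private key) and runs them. openssl will prompt for the .pfx import password.

```python
import subprocess


def build_pfx_extract_commands(pfx_path, cert_path, key_path):
    """Return the two openssl command lines that split a .pfx into PEM files.

    The function name and the paths are illustrative, not part of any library.
    """
    # Write only the client certificate (no private key) to cert_path.
    cert_cmd = ["openssl", "pkcs12", "-in", pfx_path,
                "-clcerts", "-nokeys", "-out", cert_path]
    # Write only the private key, unencrypted (-nodes), to key_path.
    key_cmd = ["openssl", "pkcs12", "-in", pfx_path,
               "-nocerts", "-nodes", "-out", key_path]
    return cert_cmd, key_cmd


def extract_pfx(pfx_path, cert_path, key_path):
    """Run both commands; openssl prompts for the import password each time."""
    for cmd in build_pfx_extract_commands(pfx_path, cert_path, key_path):
        subprocess.run(cmd, check=True)  # Raises CalledProcessError on failure
    return cert_path, key_path


# Example usage (paths are placeholders):
# cert_pair = extract_pfx("client.pfx", "/path/client.cert", "/path/client.key")
# response = requests.get(url, cert=cert_pair)
```

The key file is written without a passphrase because requests expects an unencrypted private key for the cert option, so keep that file readable only by you.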

Roman Mindlin