Get content of HTML Header with Beautifulsoup

Question

I'm creating a bot that should retrieve the status of an order for me.

I started with this:

import requests
from bs4 import BeautifulSoup

nextline = '\n'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
url = "https://footlocker.narvar.com/footlocker/tracking/de-mail?order_number=31900491219XXXXXXX"

def getstatus(url):
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    for EachPart in soup.select('div[class="tracking-status-container status_reposition"]'):
        print(EachPart)


getstatus(url)

But after several tries, "EachPart" is still empty.

Then I noticed that the information I need is not in the HTML body; it is in the head. If I just print the soup, I get:

<head>
 var translation = {"comm_sms_auto_response_msg":"........... "widgets_tracking_status_justshipped_status":"Ready to go"  }
 var xxxxxx
 var xxxxxx
 var xxxxxx
</head>
<body>
..................
</body>

In "var translation" there is "widgets_tracking_status_justshipped_status":"Ready to go".

That is what I need to extract: the key "widgets_tracking_status_justshipped_status" and the text of the field, "Ready to go".

Hello Nex! This should be helpful [link](https://stackoverflow.com/questions/53823388/python-how-can-i-scrape-with-bs4-a-javascript-code) — Ice Bear, Dec 21 '20 at 16:58

score 0 · Answer 1 · answered Dec 22 '20 at 02:46

The value sits inside JavaScript, which BeautifulSoup does not parse, so use a regular expression on the raw response text:

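import re

def getstatus(url):
    response = requests.get(url, headers=headers)  # requests and headers as defined in the question
    match = re.search(r'_justshipped_status":"([^"]+)', response.text)
    status = match.group(1) if match else None  # None when the key is missing
    print(status)
    # Ready to go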
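
If more than one field from "var translation" is needed, one option is to parse the whole object instead of matching a single key. The following is a minimal sketch, assuming the value assigned to "var translation" is valid JSON (the question's output suggests this but does not confirm it). The function name extract_translation and the sample_html string are hypothetical; in the real script you would pass response.text.

```python
import json
import re


def extract_translation(html):
    """Return the object assigned to `var translation` as a dict, or None."""
    # Find where the assignment starts; the match ends right before the value.
    match = re.search(r'var\s+translation\s*=\s*', html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value starting at the given index and
        # ignores whatever follows it (the rest of the script).
        data, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        # Assumption violated: the assigned value is not valid JSON.
        return None
    return data


# Hypothetical sample standing in for response.text
sample_html = (
    '<head><script>\n'
    ' var translation = {"comm_sms_auto_response_msg": "Hi", '
    '"widgets_tracking_status_justshipped_status": "Ready to go"};\n'
    ' var other = 1;\n'
    '</script></head><body></body>'
)

translation = extract_translation(sample_html)
print(translation["widgets_tracking_status_justshipped_status"])
# Ready to go
```

Compared with a single-key regex, this keeps every key of the object available as a normal dictionary, at the cost of failing (returning None here) if the page ever embeds something that is not strict JSON.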


uingtea
