Python web scraping: None value issue

Question

I am trying to get the salary from this web page, but each time I get the same value, "None",

even though I tried several different tags.

import requests
from bs4 import BeautifulSoup

link_content = requests.get("https://wuzzuf.net/jobs/p/KxrcG1SmaBZB-Facility-Administrator-Majorel-Egypt-Alexandria-Egypt?o=1&l=sp&t=sj&a=search-v3")
soup = BeautifulSoup(link_content.text, 'html.parser')
salary = soup.find("span", {"class":"css-47jx3m"})
print(salary)

output:

None

Barry the Platipus · Answer 1 · 2022-11-24T18:42:16.607

The page is generated dynamically with JavaScript, so Requests only receives the initial HTML, not the page as your browser renders it. Try disabling JavaScript in your browser and hard-reloading the page: you will see that a lot of information is missing. However, the data does exist in the page, inside a script tag. One way of getting it is to slice the text of that script tag down to the JSON you need [EDITED to account for different encoded keys; it should now work for any job]:

import requests
from bs4 import BeautifulSoup as bs
import json
import pandas as pd


pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}

url = 'https://wuzzuf.net/jobs/p/KxrcG1SmaBZB-Facility-Administrator-Majorel-Egypt-Alexandria-Egypt?o=1&l=sp&t=sj&a=search-v3'

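soup = bs(requests.get(url, headers=headers).text, 'html.parser')
# select_one('script') returns the first <script> tag, which holds the page state.
# Slice out the JSON between the two assignments and drop the trailing ';'.
script_text = soup.select_one('script').text
state_json = script_text.split('Wuzzuf.initialStoreState = ')[1].split('Wuzzuf.serverRenderedURL = ')[0].rsplit(';', 1)[0]
data = json.loads(state_json)['entities']['job']['collection']
# The collection is keyed by an encoded job id that differs per job; take the first key.
enc_key = next(iter(data))
df = pd.json_normalize(data[enc_key]['attributes']['salary'])
print(df)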

Result in terminal:

    min   max currency period additionalDetails  isPaid
0  None  None     None   None              None    True
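
If the chained split() calls feel fragile, the same idea can be sketched with only the standard library. This is an illustrative variant, not a tested replacement for the code above: it searches the raw HTML text for the 'Wuzzuf.initialStoreState = ' marker and lets json.JSONDecoder.raw_decode stop at the end of the JSON object, so it does not depend on what follows the assignment. The helper names (extract_initial_state, salary_of_first_job) and SAMPLE_HTML are made up for the example; the sample only mimics the structure shown above and is not real page content.

```python
import json

MARKER = "Wuzzuf.initialStoreState = "


def extract_initial_state(html, marker=MARKER):
    """Return the object assigned after `marker`, or None if the marker is absent."""
    pos = html.find(marker)
    if pos == -1:
        return None
    start = pos + len(marker)
    # raw_decode does not skip leading whitespace, so do it by hand.
    while start < len(html) and html[start].isspace():
        start += 1
    # raw_decode parses one JSON value and ignores whatever comes after it.
    state, _end = json.JSONDecoder().raw_decode(html, start)
    return state


def salary_of_first_job(state):
    """Pick the salary of the first job; the key is an encoded id that varies per job."""
    jobs = state["entities"]["job"]["collection"]
    first_key = next(iter(jobs))
    return jobs[first_key]["attributes"]["salary"]


# Hypothetical stand-in for requests.get(url, headers=headers).text
SAMPLE_HTML = """
<html><head>
<script>
Wuzzuf.initialStoreState = {"entities": {"job": {"collection": {"abc123": {"attributes": {"salary": {"min": null, "max": null, "currency": null, "period": null, "additionalDetails": null, "isPaid": true}}}}}}};
Wuzzuf.serverRenderedURL = "/jobs/p/example";
</script>
</head><body></body></html>
"""

state = extract_initial_state(SAMPLE_HTML)
salary = salary_of_first_job(state)
print(salary)
```

With the real page you would pass the response text instead of SAMPLE_HTML. Note that for this particular job every salary field except isPaid is None in the embedded data, so the salary is simply not published, and no selector will find it.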
