Not able to scrape the details of the card using selenium, beautifulsoup and python

Question

Actual problem:

Link to scrape: https://www.axisbank.com/retail/cards/credit-card/axis-bank-ace-credit-card/features-benefits#menuTab

The items I want from that page are shown in the following images:

I wrote the following code:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import json, requests, re
from selenium import webdriver

s = 'https://www.axisbank.com/retail/cards/credit-card/axis-bank-ace-credit-card/features-benefits#menuTab'
driver = webdriver.Chrome(executable_path="C:\\Users\\Hari\\Downloads\\chromedriver.exe")
driver.get(s)
soup = BeautifulSoup(driver.page_source, 'lxml')
# print(x.find('h3').get_text())
det = []
a = soup.find('div', class_ = 'owl-stage')
for x in a.find_all('div', class_ = 'owl-item'):
    print(x.find('li').get_text())
driver.close()

I tried the above code, but got stuck after it produced this output:

Output

Traceback (most recent call last):
  File "C:\Users\Hari\PycharmProjects\Card_Prj\buffer.py", line 22, in <module>
    print(x.find('li').get_text())
AttributeError: 'NoneType' object has no attribute 'get_text'

I don't know how to proceed further and scrape the information that I want. Any help is highly appreciated.

HedgeHog · Answer 1 · 2021-02-18T18:20:02.020


EDIT

As mentioned in the comments, the expected output is different from what the question describes and should be added to the question. Anyway, to reach your goal, extract the heading and description like this:

for x in soup.select('div.owl-stage div.owl-item'):
    heading = x.h3.get_text(strip=True)
    description = x.select_one('h3 + div').get_text(strip=True)
    det.append(heading+':'+description)
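
A minimal, self-contained sketch of the same idea follows, run against a made-up HTML fragment rather than the live page. The markup in SAMPLE_HTML and the helper name extract_cards are assumptions for illustration only; the real page structure may differ. The sketch also skips any item that lacks an h3 or a div right after it, which avoids the AttributeError on None shown in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only imitates the carousel structure; not the real page.
SAMPLE_HTML = """
<div class="owl-stage">
  <div class="owl-item"><div class="contentBox"><h3>Lounge Access</h3><div>Enjoy 4 visits</div></div></div>
  <div class="owl-item"><div class="contentBox"><p>no heading here</p></div></div>
  <div class="owl-item"><div class="contentBox"><h3>Launch offer</h3><div>5% cashback</div></div></div>
</div>
"""


def extract_cards(html):
    """Return 'heading : description' strings, skipping incomplete items."""
    soup = BeautifulSoup(html, "html.parser")
    cards = []
    for item in soup.select("div.owl-stage div.owl-item"):
        heading = item.find("h3")
        description = item.select_one("h3 + div")
        # find()/select_one() return None when nothing matches, so check first.
        if heading is None or description is None:
            continue
        cards.append(
            f"{heading.get_text(strip=True)} : {description.get_text(strip=True)}"
        )
    return cards


det = extract_cards(SAMPLE_HTML)
print(det)
```

With Selenium, the same function could be called as extract_cards(driver.page_source) once the page has finished rendering.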

edited Feb 18 '21 at 18:20

answered Feb 18 '21 at 16:24


I appreciate the code you wrote, but there is one issue: the output is a list of only the first points, whereas I want it in a different form, as I already specified in the question (Overview of the problem), e.g. ["heading : description", "heading : description", ...] – Feb 18 '21 at 16:34
I edited my answer, but you should improve your questions; they are not clear enough. – HedgeHog Feb 18 '21 at 17:14
@HedgeHog I have seen the edited answer, but I still have not got the expected output. Please see the Overview of the problem section for more clarity about this question. – Feb 18 '21 at 17:24
@HedgeHog I'd be really glad if you could edit your existing answer, because I am not able to understand the code in the comments section. – Feb 18 '21 at 18:06

score 0 · Accepted Answer · answered Feb 18 '21 at 22:22

To extract the visible text within the Features and Benefits section using Selenium and Python, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

Using CSS_SELECTOR:

driver.get("https://www.axisbank.com/retail/cards/credit-card/axis-bank-ace-credit-card/features-benefits#menuTab")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.owl-item.active div.contentBox")))])

Using XPATH:

driver.get("https://www.axisbank.com/retail/cards/credit-card/axis-bank-ace-credit-card/features-benefits#menuTab")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='owl-item active']//div[@class='contentBox']")))])

Console Output:

['Launch offer\n5% cashback on Big Basket and Grofers\nValid till 28th February 2021\nFor detailed terms and conditions, click here', 'Unlimited Cashback on every spend\n5% cashback on bill payments (electricity, internet, gas and more) DTH and mobile recharges on Google Pay\n4% on Swiggy, Zomato & Ola\n2% on all other spends\nNo upper limit on cashback\n\nRead More', 'Lounge Access\nEnjoy 4 complimentary lounge visits per calendar year at select domestic airports with your ACE Credit Card. For list of airports and detailed terms and conditions, click here']
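
If the "heading : description" form requested in the comments is wanted, each string in that list could be post-processed. The sketch below assumes the first line of an element's text is the heading and the remaining lines are the description; to_heading_description is a hypothetical helper name, and the sample strings are shortened from the console output above.

```python
def to_heading_description(card_text):
    """Turn 'Heading\\nline 1\\nline 2' into 'Heading : line 1 line 2'."""
    # Assumption: the first line is the heading, everything after it is the body.
    heading, _, body = card_text.partition("\n")
    # Drop blank lines and join the remaining description lines with spaces.
    body_lines = [line.strip() for line in body.splitlines() if line.strip()]
    return f"{heading.strip()} : {' '.join(body_lines)}"


# Shortened samples in the same shape as the element texts printed above.
sample = [
    "Launch offer\n5% cashback on Big Basket and Grofers\nValid till 28th February 2021",
    "Lounge Access\nEnjoy 4 complimentary lounge visits per calendar year",
]

det = [to_heading_description(text) for text in sample]
print(det)
```

The same list comprehension could be applied directly to the my_elem.text values returned by the wait.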

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
