
I am unable to read all the data in the tbody tag on the Binance futures page using Python Selenium. I am trying to scrape this link: https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8

I used the command below:

tr = driver.find_elements(By.TAG_NAME,'tbody')

but there is no text output.

I'm trying to get all the data from the tr tags under the tbody tag into an array or a list object. I also need to know how many tr tags there are on the page.


2 Answers


To get all the data from the <tr> tags within the <tbody> tag into a list object, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("textContent"):

    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    # wait up to 20 seconds for every row inside the tbody to become visible
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tbody.bn-table-tbody tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
    
  • Using XPATH and get_attribute("textContent"):

    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    # wait up to 20 seconds for every row inside the tbody to become visible
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody[@class='bn-table-tbody']//tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
    
  • Console Output:

    SOLUSDT Perpetual Short20x703621.952020.65609,118.79 (125.4862%)2023-03-03 19:31:49Trade
    ETHUSDT Perpetual Short30x385.3831,562.541,568.19-2,176.72 (-10.8052%)2023-03-05 04:03:30Trade
    EOSUSDT Perpetual Short20x138526.51.2721.2078,996.67 (107.6456%)2023-03-04 05:12:13Trade
    COCOSUSDT Perpetual Short10x33878.52.2631201.58500022,973.69 (427.8359%)2023-03-03 06:50:52Trade
    SSVUSDT Perpetual Short10x1010.344.25224938.0900006,225.72 (161.7813%)2023-03-03 20:05:15Trade
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
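
Since the question also asks how many <tr> tags the table contains, note that WebDriverWait returns a plain Python list, so len() gives the row count and the list itself holds all the row data. A minimal sketch, with the elements list simulated by the row texts from the console output above so it runs without a browser (in the real script each entry is a WebElement):

```python
# Stand-in for the list returned by visibility_of_all_elements_located();
# in the real script these entries would be WebElement objects, not strings.
elements = [
    "SOLUSDT Perpetual Short20x ...",
    "ETHUSDT Perpetual Short30x ...",
    "EOSUSDT Perpetual Short20x ...",
    "COCOSUSDT Perpetual Short10x ...",
    "SSVUSDT Perpetual Short10x ...",
]
row_texts = list(elements)   # all the row data kept in a list object
row_count = len(row_texts)   # number of <tr> tags under the <tbody>
print(row_count)
```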
    
undetected Selenium

For your task, you can use selenium together with BeautifulSoup. Open the page in selenium, wait for it to load, and then parse the received page source as a 'soup' object. First we find the 'tbody', then we search for all 'tr' elements, and for each 'tr' we find all 'td' cells. We extract the data and write it to a list. The first element of each list is the 'Symbol', the second is the total number of 'td' elements in the row, followed by all the data from the table. Code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time
from selenium.webdriver.chrome.options import Options

url = 'https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8'

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/93.0.4577.82 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,"
              "application/signed-exchange;v=b3;q=0.9",
}


def get_result(url, headers):
    # note: Selenium does not send the headers dict; it is kept only so the
    # call in main() stays unchanged
    options = Options()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    # pass the configured options object to the driver
    driver = webdriver.Chrome(options=options, executable_path=".../chromedriver_linux64/chromedriver") # the path to your chromedriver
    driver.get(url)
    time.sleep(10)  # give the JavaScript-rendered table time to load
    html = driver.page_source
    driver.quit()
    soup = BeautifulSoup(html, "lxml")
    tbody = soup.find('tbody', class_='bn-table-tbody')
    trs = tbody.find_all('tr')
    data = list()
    for tr in trs:
        tr_key = tr.get('data-row-key')
        if tr_key is None:  # skip rows without a data-row-key (e.g. placeholders)
            continue
        mid_data = [f'Symbol - {tr_key}']
        tds = tr.find_all('td')
        mid_data.append(f'td_count - {len(tds)}')
        for count, td in enumerate(tds, start=1):
            mid_data.append(f'td_{count} - {td.text}')
        print(mid_data)
        data.append(mid_data)
    return data


def main():
    get_result(url=url, headers=headers)


if __name__ == "__main__":
    main()

Will return:

['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x', 'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050', 'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']
['Symbol - ETHUSDT', 'td_count - 7', 'td_1 - ETHUSDT Perpetual Short30x', 'td_2 - 385.383', 'td_3 - 1,562.54', 'td_4 - 1,564.50', 'td_5 - -754.66\xa0(-3.7549%)', 'td_6 - 2023-03-05 01:33:30', 'td_7 - Trade']
['Symbol - EOSUSDT', 'td_count - 7', 'td_1 - EOSUSDT Perpetual Short20x', 'td_2 - 138526.5', 'td_3 - 1.272', 'td_4 - 1.175', 'td_5 - 13,383.85\xa0(164.4547%)', 'td_6 - 2023-03-04 02:42:13', 'td_7 - Trade']
['Symbol - COCOSUSDT', 'td_count - 7', 'td_1 - COCOSUSDT Perpetual Short10x', 'td_2 - 33878.5', 'td_3 - 2.263120', 'td_4 - 1.534000', 'td_5 - 24,701.49\xa0(475.3063%)', 'td_6 - 2023-03-03 04:20:52', 'td_7 - Trade']
['Symbol - SSVUSDT', 'td_count - 7', 'td_1 - SSVUSDT Perpetual Short10x', 'td_2 - 1010.3', 'td_3 - 44.252249', 'td_4 - 38.808423', 'td_5 - 5,499.90\xa0(140.2743%)', 'td_6 - 2023-03-03 17:35:15', 'td_7 - Trade']

You can process the final data as you like.
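
For example, each printed row can be mapped onto named columns. The column names below are assumptions based on the leaderboard's visible layout (position, size, entry price, mark price, PnL/ROE, time, action), not values read from the page, so treat this as a sketch:

```python
# Hypothetical column names for the Binance leaderboard table; adjust them
# if the page layout changes.
COLUMNS = ["position", "size", "entry_price", "mark_price", "pnl_roe", "last_updated", "action"]

def row_to_dict(row):
    """Turn one scraped row (['Symbol - ...', 'td_count - ...', 'td_1 - ...', ...])
    into a dict keyed by column name."""
    # skip the 'Symbol - ...' and 'td_count - ...' entries, keep only td values;
    # maxsplit=1 keeps values that themselves contain ' - ' intact
    values = [item.split(" - ", 1)[1] for item in row[2:]]
    return dict(zip(COLUMNS, values))

sample = ['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x',
          'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050',
          'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']
print(row_to_dict(sample))
```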

user510170