
I am unable to read all the data in the tbody tag on the Binance futures page using Python Selenium. I am trying to scrape this link: https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8

I used the command below:

tr = driver.find_elements(By.TAG_NAME,'tbody')

but there is no text output.

I'm trying to get all the data from the tr tags under the tbody tag into an array or a list object. I also need to know how many tr tags there are on the page.


2 Answers


To get all the data from the <tr> tags within the <tbody> tag into a list object, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and get_attribute("textContent"):

    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    # wait up to 20 seconds for every row inside the tbody to become visible
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tbody.bn-table-tbody tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
    
  • Using XPATH and get_attribute("textContent"):

    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    # wait up to 20 seconds for every row inside the tbody to become visible
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody[@class='bn-table-tbody']//tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
    
  • Console Output:

    SOLUSDT Perpetual Short20x703621.952020.65609,118.79 (125.4862%)2023-03-03 19:31:49Trade
    ETHUSDT Perpetual Short30x385.3831,562.541,568.19-2,176.72 (-10.8052%)2023-03-05 04:03:30Trade
    EOSUSDT Perpetual Short20x138526.51.2721.2078,996.67 (107.6456%)2023-03-04 05:12:13Trade
    COCOSUSDT Perpetual Short10x33878.52.2631201.58500022,973.69 (427.8359%)2023-03-03 06:50:52Trade
    SSVUSDT Perpetual Short10x1010.344.25224938.0900006,225.72 (161.7813%)2023-03-03 20:05:15Trade
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
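
Since the question also asks how many <tr> tags the table contains, note that WebDriverWait returns a plain Python list, so len() gives the row count and the list itself holds all the row data. A minimal sketch, with the elements list simulated by the row texts from the console output above so it runs without a browser (in the real script each entry is a WebElement):

```python
# Stand-in for the list returned by visibility_of_all_elements_located();
# in the real script these entries would be WebElement objects, not strings.
elements = [
    "SOLUSDT Perpetual Short20x ...",
    "ETHUSDT Perpetual Short30x ...",
    "EOSUSDT Perpetual Short20x ...",
    "COCOSUSDT Perpetual Short10x ...",
    "SSVUSDT Perpetual Short10x ...",
]
row_texts = list(elements)   # all the row data kept in a list object
row_count = len(row_texts)   # number of <tr> tags under the <tbody>
print(row_count)
```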
    
undetected Selenium

For your task, you can use selenium together with BeautifulSoup. Open the page in selenium, wait for it to load, and then parse the received page source as a 'soup' object. First we find the 'tbody', then we search for all 'tr' elements, and for each 'tr' we find all 'td' cells. We extract the data and write it to a list. The first element of each list is the 'Symbol', the second is the total number of 'td' elements in the row, followed by all the data from the table. Code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time
from selenium.webdriver.chrome.options import Options

url = 'https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8'

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/93.0.4577.82 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,"
              "application/signed-exchange;v=b3;q=0.9",
}


def get_result(url, headers):
    # note: Selenium does not send the headers dict; it is kept only so the
    # call in main() stays unchanged
    options = Options()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    # pass the configured options object to the driver
    driver = webdriver.Chrome(options=options, executable_path=".../chromedriver_linux64/chromedriver") # the path to your chromedriver
    driver.get(url)
    time.sleep(10)  # give the JavaScript-rendered table time to load
    html = driver.page_source
    driver.quit()
    soup = BeautifulSoup(html, "lxml")
    tbody = soup.find('tbody', class_='bn-table-tbody')
    trs = tbody.find_all('tr')
    data = list()
    for tr in trs:
        tr_key = tr.get('data-row-key')
        if tr_key is None:  # skip rows without a data-row-key (e.g. placeholders)
            continue
        mid_data = [f'Symbol - {tr_key}']
        tds = tr.find_all('td')
        mid_data.append(f'td_count - {len(tds)}')
        for count, td in enumerate(tds, start=1):
            mid_data.append(f'td_{count} - {td.text}')
        print(mid_data)
        data.append(mid_data)
    return data


def main():
    get_result(url=url, headers=headers)


if __name__ == "__main__":
    main()

Will return:

['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x', 'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050', 'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']
['Symbol - ETHUSDT', 'td_count - 7', 'td_1 - ETHUSDT Perpetual Short30x', 'td_2 - 385.383', 'td_3 - 1,562.54', 'td_4 - 1,564.50', 'td_5 - -754.66\xa0(-3.7549%)', 'td_6 - 2023-03-05 01:33:30', 'td_7 - Trade']
['Symbol - EOSUSDT', 'td_count - 7', 'td_1 - EOSUSDT Perpetual Short20x', 'td_2 - 138526.5', 'td_3 - 1.272', 'td_4 - 1.175', 'td_5 - 13,383.85\xa0(164.4547%)', 'td_6 - 2023-03-04 02:42:13', 'td_7 - Trade']
['Symbol - COCOSUSDT', 'td_count - 7', 'td_1 - COCOSUSDT Perpetual Short10x', 'td_2 - 33878.5', 'td_3 - 2.263120', 'td_4 - 1.534000', 'td_5 - 24,701.49\xa0(475.3063%)', 'td_6 - 2023-03-03 04:20:52', 'td_7 - Trade']
['Symbol - SSVUSDT', 'td_count - 7', 'td_1 - SSVUSDT Perpetual Short10x', 'td_2 - 1010.3', 'td_3 - 44.252249', 'td_4 - 38.808423', 'td_5 - 5,499.90\xa0(140.2743%)', 'td_6 - 2023-03-03 17:35:15', 'td_7 - Trade']

You can process the final data as you like.
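
For example, each printed row can be mapped onto named columns. The column names below are assumptions based on the leaderboard's visible layout (position, size, entry price, mark price, PnL/ROE, time, action), not values read from the page, so treat this as a sketch:

```python
# Hypothetical column names for the Binance leaderboard table; adjust them
# if the page layout changes.
COLUMNS = ["position", "size", "entry_price", "mark_price", "pnl_roe", "last_updated", "action"]

def row_to_dict(row):
    """Turn one scraped row (['Symbol - ...', 'td_count - ...', 'td_1 - ...', ...])
    into a dict keyed by column name."""
    # skip the 'Symbol - ...' and 'td_count - ...' entries, keep only td values;
    # maxsplit=1 keeps values that themselves contain ' - ' intact
    values = [item.split(" - ", 1)[1] for item in row[2:]]
    return dict(zip(COLUMNS, values))

sample = ['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x',
          'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050',
          'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']
print(row_to_dict(sample))
```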

user510170