This is a follow-up to my previous question. I'm trying to scrape every CRD# from the search results on this site: https://brokercheck.finra.org/search/genericsearch/list

(You'll need to redo the search after following the link; just type anything into the Individual search.)

I'm trying to scrape the Disclosure box to get Yes or No, but the box uses an ng-if to decide which of the two to display, and for some rows it isn't rendered at all.

I'm using a CSS_SELECTOR to read the text of that div. However, the ng-if expression differs between the Yes and No cases, so I currently need one selector per case:

# No
print([disclosure.get_attribute("innerHTML") for disclosure in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-scope[ng-if='!vm.item.hasDisclosures() && vm.item.hasDisclosureFlag()']")))])

# Yes
print([disclosure.get_attribute("innerHTML") for disclosure in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-scope[ng-if='vm.item.hasDisclosures()']")))])

How do I programmatically handle this case?

Thanks.

PTN

2 Answers

In cases like this, what you want is an XPath that is anchored on the label text.

First locate the Disclosures element, then take the div next to it:

//div[@class='flipper']//md-card-content/div/div//*[contains(., "Disclosures")]

Then, to get the next div:

//div[@class='flipper']//md-card-content/div/div//*[contains(., "Disclosures")]/following-sibling::div[1]/div

You can then read that div. In practice, when you loop over the flipper cards, run a relative find-element call on each card with .//md-card-content/div/div//*[contains(., "Disclosures")]/following-sibling::div[1]/div to get that card's disclosure value. If the element is not there, no disclosure was specified for that card.
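
The per-card loop described above could look roughly like the sketch below. It is only an illustration under some assumptions: the helper names read_disclosure and read_all_disclosures are made up here, the string "xpath" is passed directly because it is the value of Selenium's By.XPATH, and textContent is read instead of innerHTML so nested markup does not end up in the result. The XPath expressions are the ones from this answer and have not been re-verified against the live page.

```python
# Relative XPath from the answer, evaluated against one card at a time.
DISCLOSURE_XPATH = (
    './/md-card-content/div/div//*[contains(., "Disclosures")]'
    '/following-sibling::div[1]/div'
)
CARD_XPATH = "//div[@class='flipper']"
XPATH = "xpath"  # same string value as selenium's By.XPATH


def read_disclosure(card):
    """Return the disclosure text of one card, or None if the box is absent.

    `card` is expected to be a Selenium WebElement (hypothetical helper).
    """
    # find_elements returns an empty list when nothing matches,
    # so a missing box does not raise NoSuchElementException.
    matches = card.find_elements(XPATH, DISCLOSURE_XPATH)
    if not matches:
        return None
    text = matches[0].get_attribute("textContent") or ""
    return text.strip()


def read_all_disclosures(driver):
    """Collect one value per flipper card on the current results page."""
    cards = driver.find_elements(XPATH, CARD_XPATH)
    return [read_disclosure(card) for card in cards]
```

Calling read_all_disclosures(driver) once the results have loaded should give one entry per card, with None marking the rows where the box is not rendered.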

Tarun Lalwani

To print all the Disclosure values (Yes or No) from the search results at https://brokercheck.finra.org/search/genericsearch/grid using Selenium, wait with WebDriverWait for visibility_of_all_elements_located() and use one of the following locator strategies:

  • Using CSS_SELECTOR and hasDisclosures() (a substring match, so it covers both the Yes and the No ng-if expressions shown in the question):

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-scope[ng-if*='hasDisclosures']")))])
    
  • Using CSS_SELECTOR and hasDisclosureFlag() (matches only the No expression shown in the question):

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-scope[ng-if*='hasDisclosureFlag']")))])
    
  • Using CSS_SELECTOR, hasDisclosures() and hasDisclosureFlag() (requires both substrings, so it also matches only the No expression):

    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.ng-scope[ng-if*='hasDisclosures'][ng-if*='hasDisclosureFlag']")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
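
The lists printed above hold raw innerHTML strings, which may carry surrounding whitespace or nested tags. As a rough sketch, and assuming the visible text is literally Yes or No as the question describes, a small helper (normalize_disclosure is a hypothetical name) could map each string to a clean value:

```python
import re


def normalize_disclosure(inner_html):
    """Map a raw innerHTML string to 'Yes', 'No', or None if neither is found."""
    # Replace any nested tags with spaces, then collapse all whitespace.
    without_tags = re.sub(r"<[^>]+>", " ", inner_html)
    text = " ".join(without_tags.split()).lower()
    return {"yes": "Yes", "no": "No"}.get(text)
```

For example, [normalize_disclosure(html) for html in raw_values] would turn the printed list into plain Yes/No values, where raw_values stands for whichever list one of the selectors above produced.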
    
undetected Selenium