
Below is some HTML from which I can extract text with the Selenium driver.

<td colspan="2"><strong>Owner</strong>
                <div ng-class="{'owner-overflow' : property.ownersInfo.length > 4}">
                    <!-- ngRepeat: owner in property.ownersInfo --><div ng-repeat="owner in property.ownersInfo" class="ng-scope">
                        <div class="ng-binding">ERROL P BROWN LLC 
                            <!-- &nbsp;&nbsp; <span ng-if="owner.shortDescription != null && owner.shortDescription.length > 0">({{owner.shortDescription}})</span> -->
                        </div>
                    </div><!-- end ngRepeat: owner in property.ownersInfo -->
                </div>
            </td>    

<td colspan="2" class="pi_mailing_address"><strong>Mailing Address</strong>
                <div>
                    <span class="ng-binding">1784 NE 163 ST </span>
                    <span ng-show="property.mailingAddress.address2" class="ng-binding ng-hide"></span>
                    <span ng-show="property.mailingAddress.address3" class="ng-binding ng-hide"></span>
                    <span ng-show="property.mailingAddress.city" ng-class="{'inline':property.mailingAddress.city}" class="ng-binding inline">NORTH MIAMI,</span>
                    <span class="inline ng-binding">FL</span>
                    <span class="inline ng-binding">33162</span>
                    <span ng-hide="isCountryUSA(property.mailingAddress.country)" class="ng-binding ng-hide">USA</span>
                </div>
            </td>

When I run the code manually, all the fields are picked up without issue. However, if I run the script in a loop to extract this data, these elements are blank. I am collecting other fields as well, and they are not coming up blank. There is no error in processing; it's just that when I save the data to a database, these values come up empty. Is there a workaround to prevent this?

These are the lines of code:

Owner = driver.find_element(By.XPATH, "//strong[text()='Owner']//following::div[1]").text
SubDivision = driver.find_element(By.XPATH, "//strong[text()='Sub-Division:']//following::div[1]").text
Address1 = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[1]").text
Address2 = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[2]").text
Address3 = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[3]").text
city = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[4]").text.replace(",", "")
state = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[5]").text
zipcode = driver.find_element(By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[6]").text
undetected Selenium
Leo Torres
  • You must have made a mistake, somewhere, in the code you didn't show us... – John Gordon Jul 13 '23 at 00:57
  • John Like I said the code works manually and data is inserted into database only when done manually. IF ran by the script other fields are stored to DB just fine but these are not. There is something else going on here that I am not aware of. – Leo Torres Jul 13 '23 at 01:01
  • You have to show us the ACTUAL CODE that isn't working. Telling us "I ran this code in a loop" isn't enough. – John Gordon Jul 13 '23 at 01:03
  • Also, your code is looking for `//strong`, but I don't see any `<strong>` tags in that html at all... – John Gordon Jul 13 '23 at 01:05
  • Again, I am not having an issue pulling the data; there is a security mechanism in place that does not allow me to collect the data. If I run it manually it works. I added the strong tags, but I didn't think it was relevant for my example. – Leo Torres Jul 13 '23 at 01:34
  • Here is the site https://www.miamidade.gov/Apps/PA/propertysearch/#/ Here is a sample folio number 28-2210-013-1320 – Leo Torres Jul 13 '23 at 01:35
  • I said I can't help unless you show the actual code. You still haven't done that. So, goodbye and good luck. – John Gordon Jul 13 '23 at 01:49
  • John, I don't have a code issue. The question is not about code; the code works. My question is about the Selenium bot not being able to extract fields due to masking or security. Like I said several times in the comments and in the question, the code "works" when I run it manually. I pasted this code as per your request, but all that was needed is the html and the site. – Leo Torres Jul 13 '23 at 11:00

1 Answer


If you are able to extract the required texts in a standalone execution but not in a loop, that may be due to a race condition between the browser and your script: the loop can read an element before Angular has finished rendering its text.
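
To make the race concrete, here is a minimal sketch of polling until a field actually has text. The helper name `wait_for_text` and its `read_text` callable are hypothetical, not part of Selenium; the callable stands in for whatever reads the element.

```python
import time


def wait_for_text(read_text, timeout=20.0, poll=0.5):
    """Call read_text() repeatedly until it returns non-blank text.

    read_text is a hypothetical zero-argument callable supplied by the
    caller. Raises TimeoutError if the text is still blank after
    `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = (read_text() or "").strip()
        if text:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no text after {timeout} s")
        time.sleep(poll)
```

With Selenium, the callable could be something like `lambda: driver.find_element(By.XPATH, "//strong[text()='Owner']//following::div[1]").text`. Fields that are legitimately empty for some records would need a shorter timeout or a different check.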


Solution

As the elements are Angular elements, to extract the texts you ideally need to induce WebDriverWait for visibility_of_element_located(), and you can use the following locator strategies:

# Wait up to 20 seconds for each element before reading its text.
Owner = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Owner']//following::div[1]"))).text
SubDivision = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Sub-Division:']//following::div[1]"))).text
Address1 = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[1]"))).text
# The Address2/Address3 spans carry ng-hide when empty, so wait for presence rather than visibility.
Address2 = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[2]"))).text
Address3 = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[3]"))).text
city = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[4]"))).text.replace(",", "")
state = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[5]"))).text
zipcode = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//strong[text()='Mailing Address']//following::div[1]//following::span[6]"))).text
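
The eight lookups above differ only in their XPath, so they could also be driven from a table. This is a sketch under assumptions: `FIELD_XPATHS` and `extract_fields` are hypothetical names, the XPaths are copied from the question, and `read_text` is a caller-supplied callable (for example, one that wraps the wait) rather than a Selenium API.

```python
# XPath template for the n-th <span> in the mailing address block.
MAILING = "//strong[text()='Mailing Address']//following::div[1]//following::span[{}]"

# Field name -> XPath, taken from the question's locators.
FIELD_XPATHS = {
    "Owner": "//strong[text()='Owner']//following::div[1]",
    "SubDivision": "//strong[text()='Sub-Division:']//following::div[1]",
    "Address1": MAILING.format(1),
    "Address2": MAILING.format(2),
    "Address3": MAILING.format(3),
    "city": MAILING.format(4),
    "state": MAILING.format(5),
    "zipcode": MAILING.format(6),
}


def extract_fields(read_text, xpaths=FIELD_XPATHS):
    """Build one record by calling read_text(xpath) for every field.

    read_text is a hypothetical callable: XPath string -> element text.
    """
    record = {name: (read_text(xp) or "").strip() for name, xp in xpaths.items()}
    # The city span renders as "NORTH MIAMI," so drop the comma.
    record["city"] = record["city"].replace(",", "")
    return record
```

Keeping the locators in one place makes it easier to log which field came back blank on a given iteration of the loop.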

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

You can find a relevant discussion in "How to retrieve the text of a WebElement using Selenium - Python".
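
One detail in the sample HTML is worth keeping in mind: the address2, address3 and country spans carry the `ng-hide` class, and Selenium's `.text` returns an empty string for elements that are not displayed. The sketch below maps the spans to fields by position and tolerates those blanks. `build_mailing_address` and the `country` key are hypothetical names, and the function accepts any objects with a `.text` attribute.

```python
def build_mailing_address(spans):
    """Map the mailing-address <span> elements, in document order, to fields.

    `spans` is any sequence of objects exposing `.text` (for example
    Selenium WebElements). Hidden spans are expected to yield "".
    """
    names = ["Address1", "Address2", "Address3",
             "city", "state", "zipcode", "country"]
    texts = [(s.text or "").strip() for s in spans]
    # Pad with blanks if fewer spans were found than expected.
    texts += [""] * (len(names) - len(texts))
    record = dict(zip(names, texts))
    record["city"] = record["city"].replace(",", "")
    return record
```

The spans could be collected in one call, for instance with `driver.find_elements(By.XPATH, "//td[contains(@class, 'pi_mailing_address')]/div/span")`, after waiting for the first span to become visible.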

undetected Selenium