3

I'm writing a script to do some web scraping on my Firebase for a few select users. After accessing the events page for a user, I first want to check whether that user has logged no events at all.

For this, I am using Selenium and Python. Using XPath seems to work fine for locating links and navigation in all other parts of the script, except for accessing elements in a table. At first, I thought I might have been using the wrong XPath expression, so I copied the path directly from Chrome's inspection window, but still no luck.

As an alternative, I have tried to copy the page source and pass it into Beautiful Soup, and then parse it there to check for the element. No luck there either.

Here's some of the code, and some of the HTML I'm trying to parse. Where am I going wrong?

# Using WebDriver - always triggers an exception
def check_if_user_has_any_data():
    try:
        time.sleep(10)
        element = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@id="event-table"]/div/div/div[2]/mobile-table/md-whiteframe/div[1]/ga-no-data-table/div')))
        print(type(element))
        if element == True:
            print("Found empty state by copying XPath expression directly. It is a bit risky, but it seems to have worked")
        else:
            print("didn't find empty state")
    except:
        print("could not find the empty state element", EC)


# Using Beautiful Soup
def check_if_user_has_any_data#2():
    time.sleep(10)
    html = driver.execute_script("return document.documentElement.outerHTML")
    soup = BeautifulSoup(html, 'html.parser')
    print(soup.text[:500])
    print(len(soup.findAll('div', {"class": "table-row-no-data ng-scope"})))

HTML

<div class="table-row-no-data ng-scope" ng-if="::config" ng-class="{overlay: config.isBuilderOpen()}">
  <div class="no-data-content layout-align-center-center layout-row" layout="row" layout-align="center center">
    <!-- ... -->
  </div>
</div>

The first version always triggers the exception. I expected 'element' to evaluate as True, but in practice the element is not found.

The second version prints the first 500 characters (correctly, as far as I can tell), but the count it prints is '0'. Based on inspecting the page source, I expected '1'.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ayman Bari
  • 31
  • 1
  • 2
  • What is the intent of `check_if_user_has_any_data#2():`? Does it work by accident? Or not. Python 2.7.17 and Python 3.6.9 complain: `SyntaxError: invalid syntax`. It doesn't appear to be the actual code. – Peter Mortensen Nov 14 '22 at 22:47
  • OK, we will never know. The OP has left the building: *"Last seen more than 3 years ago"* – Peter Mortensen Nov 14 '22 at 22:52

6 Answers

1

Use the following code:

# find_elements_by_* was removed in Selenium 4.3.0; there, use driver.find_elements(By.XPATH, ...)
elements = driver.find_elements_by_xpath("//*[@id='event-table']/div/div/div[2]/mobile-table/md-whiteframe/div[1]/ga-no-data-table/div")
if len(elements) > 0:
    pass  # Element is present. Do your action
else:
    pass  # Element is not present. Do alternative action

Note: find_elements does not throw an exception when nothing matches. It returns an empty list instead.
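The same check can be wrapped in a small helper. This is only a sketch: the name `has_element` is hypothetical, and it assumes the Selenium 4 style `find_elements(by, value)` call. The locator strategy is passed as the plain string "xpath", which is the value of `By.XPATH`, so the snippet needs no imports.

```python
def has_element(driver, xpath):
    """Return True if at least one element matches the XPath expression.

    `driver` is assumed to be a Selenium WebDriver, or any object that
    exposes find_elements(by, value). "xpath" is the value of By.XPATH.
    """
    # find_elements returns a list; an empty list means "not present".
    return len(driver.find_elements("xpath", xpath)) > 0
```

With a live driver this would be called as `has_element(driver, "//*[@id='event-table']")`. It only sees the current frame and the DOM as it is at that moment.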

Pritam Maske
  • 2,670
  • 2
  • 22
  • 29
1

Here is the method I generally use.

Imports

from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

Method

def is_element_present(self, how, what):
    try:
        self.driver.find_element(by=how, value=what)
    except NoSuchElementException as e:
        return False
    return True
supputuri
  • 13,644
  • 2
  • 21
  • 39
  • 1
    @Pritom's answer is probably better because you want to avoid raising exceptions when you can. – pguardiario Mar 31 '19 at 23:28
  • @pguardiario Not to defend, but it's a choice – supputuri Apr 01 '19 at 02:30
  • It may be the *only* choice. From [a comment](https://stackoverflow.com/questions/30002313/selenium-finding-elements-by-class-name-in-python#comment128785684_30025430): *"`find_element_by_*` and `find_elements_by_*` are removed in Selenium 4.3.0. Use `find_element` instead."*. – Peter Mortensen Nov 14 '22 at 22:53
1

Some elements load dynamically, so they may not exist yet when you first look for them. It is better to use an explicit wait with a timeout and handle the timeout exception.
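Selenium ships `WebDriverWait` together with `expected_conditions` for this, and it raises `TimeoutException` when the time runs out. To show the idea without depending on it, here is a hand-rolled stand-in, not Selenium's own implementation. The name `wait_for_elements` and its parameters are hypothetical, and "xpath" is the string value of `By.XPATH`.

```python
import time


def wait_for_elements(driver, xpath, timeout=10.0, poll=0.5):
    """Poll until at least one element matches, or give up after `timeout` seconds.

    Returns the list of matching elements, or an empty list on timeout.
    `driver` is assumed to expose find_elements(by, value).
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = driver.find_elements("xpath", xpath)
        if elements:
            return elements
        if time.monotonic() >= deadline:
            return []  # Timed out: the element never appeared
        time.sleep(poll)
```

Unlike a fixed `time.sleep(10)`, this returns as soon as the element shows up and only waits the full timeout when it never does.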

Edo Edo
  • 164
  • 2
  • 9
1

If you're using Python and Selenium, you can use this:

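from selenium.common.exceptions import NoSuchElementException

try:
    driver.find_element_by_xpath("<Full XPath expression>")  # Raises if the element does not exist
    # <Other code>
except NoSuchElementException:
    pass  # <Run these if element doesn't exist>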
eunick
  • 11
  • 1
0

I've solved it. The page had a bunch of different iframe elements, and I didn't know that one had to switch between frames in Selenium to access those elements.

There was nothing wrong with the initial code or with the suggested solutions, which also worked fine when I tested them.

Here's the code I used to test it:

# Time for the page to load
time.sleep(20)

# Find all iframes
iframes = driver.find_elements_by_tag_name("iframe")

# From inspecting page source, it looks like the index for the relevant iframe is [0]
x = len(iframes)
print("Found ", x, " iFrames") # Should return 5
driver.switch_to.frame(iframes[0])
print("switched to frame [0]")
if WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, '//*[@class="no-data-title ng-binding"]'))):
    print("Found it in this frame!")
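When the index of the right iframe is not known in advance, one option is to try each top-level iframe in turn. The following is a rough sketch under that assumption, not the code I actually ran: `find_in_any_frame` is a hypothetical name, nested iframes are not handled, and the strings "tag name" and "xpath" are the values of `By.TAG_NAME` and `By.XPATH`.

```python
def find_in_any_frame(driver, xpath):
    """Return (frame_index, elements) for the first iframe containing a match.

    Returns (None, []) if no top-level iframe contains a match. On success
    the driver is left switched into the matching frame.
    """
    driver.switch_to.default_content()
    frames = driver.find_elements("tag name", "iframe")
    for index, frame in enumerate(frames):
        driver.switch_to.frame(frame)
        elements = driver.find_elements("xpath", xpath)
        if elements:
            return index, elements
        # Go back to the top-level document before trying the next frame
        driver.switch_to.default_content()
    return None, []
```

Because the driver stays inside the matching frame on success, later lookups in the same table work without switching again. Call `driver.switch_to.default_content()` to get back to the top-level page.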
Ayman Bari
  • 31
  • 1
  • 2
-1

Check the length of what you retrieved with an if statement.

Example:

element = 'https://www.example.com'
if len(element) > 1:
    pass  # Do something.