How to access the html nested within multiple shadowRoot using Selenium and Python

Question

I am trying to build a bot to solve Wordle puzzles on the website (https://www.powerlanguage.co.uk/wordle/).

I am using Selenium to enter a guess and then inspecting the page to see which parts of the guess are correct and incorrect.

I can see this information when I inspect the element in Chrome, but the HTML that Selenium returns is much shorter and only points to a JavaScript app. Is there a way to get the HTML shown in the inspector through Selenium? Here is my code:

import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import ElementClickInterceptedException
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait

# chrome_options was not defined in the original snippet; default options are assumed here
chrome_options = webdriver.ChromeOptions()
driver = webdriver.Chrome(executable_path=r"/Users/1/Downloads/chromedriver", options=chrome_options)
driver.get("https://www.powerlanguage.co.uk/wordle/")
time.sleep(1)
sends=driver.find_element_by_xpath("/html/body")
sends.click()
sends.send_keys("adieu")
sends.send_keys(Keys.ENTER)
sends.get_attribute('innerHTML')

This is what inner html returns

And this is what I can see on inspection on the website

score 1 · Answer 1 · answered Jan 20 '22 at 07:22

If you're looking for a complete Python Selenium solution for solving the Wordle game programmatically, here's one that uses the SeleniumBase framework. The solution comes with a YouTube video ("Solving Wordle using SeleniumBase"), the Python code of the solution, and a GIF of what to expect.

The code uses special SeleniumBase ::shadow selectors in order to pierce through multiple layers of Shadow-DOM. Here's the code below, which can be run after calling pip install seleniumbase to get all the Python dependencies:

import ast
import random
import requests
from seleniumbase import __version__
from seleniumbase import BaseCase

class WordleTests(BaseCase):
    word_list = []

    def initalize_word_list(self):
        js_file = "https://www.powerlanguage.co.uk/wordle/main.e65ce0a5.js"
        req_text = requests.get(js_file).text
        start = req_text.find("var La=") + len("var La=")
        end = req_text.find("],", start) + 1
        word_string = req_text[start:end]
        self.word_list = ast.literal_eval(word_string)

    def modify_word_list(self, word, letter_status):
        new_word_list = []
        correct_letters = []
        present_letters = []
        for i in range(len(word)):
            if letter_status[i] == "correct":
                correct_letters.append(word[i])
                for w in self.word_list:
                    if w[i] == word[i]:
                        new_word_list.append(w)
                self.word_list = new_word_list
                new_word_list = []
        for i in range(len(word)):
            if letter_status[i] == "present":
                present_letters.append(word[i])
                for w in self.word_list:
                    if word[i] in w and word[i] != w[i]:
                        new_word_list.append(w)
                self.word_list = new_word_list
                new_word_list = []
        for i in range(len(word)):
            if (
                letter_status[i] == "absent"
                and word[i] not in correct_letters
                and word[i] not in present_letters
            ):
                for w in self.word_list:
                    if word[i] not in w:
                        new_word_list.append(w)
                self.word_list = new_word_list
                new_word_list = []

    def test_wordle(self):
        self.open("https://www.powerlanguage.co.uk/wordle/")
        self.click("game-app::shadow game-modal::shadow game-icon")
        self.initalize_word_list()
        keyboard_base = "game-app::shadow game-keyboard::shadow "
        word = random.choice(self.word_list)
        total_attempts = 0
        success = False
        for attempt in range(6):
            total_attempts += 1
            word = random.choice(self.word_list)
            letters = []
            for letter in word:
                letters.append(letter)
                button = 'button[data-key="%s"]' % letter
                self.click(keyboard_base + button)
            button = 'button[data-key="↵"]'
            self.click(keyboard_base + button)
            self.sleep(1)  # Time for the animation
            row = 'game-app::shadow game-row[letters="%s"]::shadow ' % word
            tile = row + "game-tile:nth-of-type(%s)"
            letter_status = []
            for i in range(1, 6):
                letter_eval = self.get_attribute(tile % str(i), "evaluation")
                letter_status.append(letter_eval)
            if letter_status.count("correct") == 5:
                success = True
                break
            self.word_list.remove(word)
            self.modify_word_list(word, letter_status)

        self.save_screenshot_to_logs()
        print('\nWord: "%s"\nAttempts: %s' % (word.upper(), total_attempts))
        if not success:
            self.fail("Unable to solve for the correct word in 6 attempts!")
        self.sleep(3)

This solution requires SeleniumBase 2.4.0 or newer because it relies on updated Shadow-DOM methods. (See the release notes of that version.)

Note that SeleniumBase tests are run using pytest. Also, the Wordle website renders slightly differently when opened in headless Chrome, so don't use Chrome's headless mode when running this example.

score 0 · Accepted Answer · answered Jan 19 '22 at 16:17

The information you want is nested inside multiple #shadow-root (open) nodes, which is why it does not show up in the innerHTML of the body.


Solution

To extract the information you need to use shadowRoot.querySelectorAll() and you can use the following Locator Strategy:

Code Block:

driver.get("https://www.powerlanguage.co.uk/wordle/")
time.sleep(1)
sends=driver.find_element(By.XPATH, "/html/body")
sends.click()
sends.send_keys("adieu")
sends.send_keys(Keys.ENTER)
# Walk through both shadow roots in JavaScript and return the tiles of the first row
tiles = driver.execute_script(
    """return document.querySelector('game-app').shadowRoot.querySelector('game-row').shadowRoot.querySelectorAll('game-tile[letter]')"""
)
inner_texts = [my_elem.get_attribute("outerHTML") for my_elem in tiles]
for inner_text in inner_texts:
    print(inner_text)

Console Output:

<game-tile letter="a" evaluation="absent" reveal=""></game-tile>
<game-tile letter="d" evaluation="absent"></game-tile>
<game-tile letter="i" evaluation="correct"></game-tile>
<game-tile letter="e" evaluation="absent"></game-tile>
<game-tile letter="u" evaluation="absent"></game-tile>
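
If you would rather work with the letters and evaluations than with raw HTML strings, the output above can be parsed with the standard library. This is only a sketch: TileParser and parse_tiles are hypothetical helper names, and the input is assumed to look like the console output shown here.

```python
from html.parser import HTMLParser


class TileParser(HTMLParser):
    """Collects (letter, evaluation) pairs from <game-tile> start tags."""

    def __init__(self):
        super().__init__()
        self.tiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "game-tile":
            attributes = dict(attrs)
            self.tiles.append(
                (attributes.get("letter"), attributes.get("evaluation"))
            )


def parse_tiles(outer_html_strings):
    """Turn a list of outerHTML strings into (letter, evaluation) tuples."""
    parser = TileParser()
    for fragment in outer_html_strings:
        parser.feed(fragment)
    parser.close()
    return parser.tiles


# Two of the lines from the console output above
sample = [
    '<game-tile letter="a" evaluation="absent" reveal=""></game-tile>',
    '<game-tile letter="i" evaluation="correct"></game-tile>',
]
print(parse_tiles(sample))  # [('a', 'absent'), ('i', 'correct')]
```

In the answer's code, inner_texts could be passed to parse_tiles directly in place of sample.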


Fantastic that's pretty much exactly what I am looking for, what needs to be altered in the path to get the inner HTML for rows 2-6 of the grid when I send a guess? I tried sending a second guess before calling "inner_texts" and it just returned the same console output above. `sends.send_keys("adieu") sends.send_keys(Keys.ENTER) time.sleep(2) sends.send_keys("groan") sends.send_keys(Keys.ENTER) time.sleep(2)` — amc-man, Jan 19 '22 at 17:54
@amc-man That's subject to investigation :) Feel free to raise a new question as per your new requirement. — undetected Selenium, Jan 19 '22 at 17:57
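
On the follow-up about rows 2-6: querySelector('game-row') only ever returns the first matching element, which would explain why a second guess produced the same output. One possible approach, sketched below, is to select all rows with querySelectorAll and index into them. It assumes the page structure shown in the accepted answer and that the rows appear in board order; tiles_script and read_row are hypothetical helper names, and the script has not been checked against the live site.

```python
def tiles_script(row_index):
    """Build JavaScript returning the outerHTML of each filled tile in one row (0-based)."""
    if not 0 <= row_index < 6:
        raise ValueError("row_index must be between 0 and 5")
    return (
        "return Array.from("
        "document.querySelector('game-app').shadowRoot"
        ".querySelectorAll('game-row')[%d].shadowRoot"
        ".querySelectorAll('game-tile[letter]')"
        ").map(tile => tile.outerHTML)" % row_index
    )


def read_row(driver, row_index):
    """Run the script with a Selenium WebDriver and return a list of HTML strings."""
    return driver.execute_script(tiles_script(row_index))


# Script for the second guess (row index 1)
print(tiles_script(1))
```

With the driver from the answer, read_row(driver, 1) would then be called after the second guess has been submitted and its reveal animation has finished.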
