I simplified your code and ran the test with a minimal set of lines, as follows:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
options = webdriver.ChromeOptions()
# Drop the enable-automation switch and the automation extension that ChromeDriver adds by default
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is the Selenium 3 keyword; Selenium 4 expects a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.wsj.com/market-data/quotes/MET/financials/annual/income-statement')
# Dump the raw page source for inspection
print(driver.page_source)
# Wait up to 20 seconds for the table cells, located first by CSS selector, then by XPath
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table.cr_dataTable tbody tr>td[class]")))])
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='cr_dataTable']//tbody//tr/td[@class]")))])
Like you, I hit the same roadblock: my tests didn't yield any results.
While inspecting the page source, I noticed an event listener within a <script>
that checks certain page metrics, including:
window.utag_data
window.utag_data.page_performance
window.PerformanceTiming
window.PerformanceObserver
newrelic
first-contentful-paint
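To see which of the objects listed above actually exist in a given browsing context, something along these lines could be run against the loaded page. This is only a sketch: `probe_page_metrics` and `PROBE_SCRIPT` are hypothetical names introduced here, and `driver` is assumed to be any object with Selenium's `execute_script` method.

```python
# JavaScript that reports which of the objects listed above exist in the page.
PROBE_SCRIPT = """
return {
    utag_data: typeof window.utag_data !== 'undefined',
    page_performance: !!(window.utag_data && window.utag_data.page_performance),
    PerformanceTiming: typeof window.PerformanceTiming !== 'undefined',
    PerformanceObserver: typeof window.PerformanceObserver !== 'undefined',
    newrelic: typeof newrelic !== 'undefined'
};
"""


def probe_page_metrics(driver):
    """Return a dict mapping each object name to True/False (hypothetical helper)."""
    result = driver.execute_script(PROBE_SCRIPT)
    # Normalise to plain booleans in case the driver returns other truthy values.
    return {name: bool(present) for name, present in result.items()}
```

Calling `probe_page_metrics(driver)` right after `driver.get(...)` and comparing the result with the same check in a manually opened browser may show whether the two contexts differ.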
Page Source:
<script>
"use strict";
if (window.PerformanceTiming) {
window.addEventListener('DOMContentLoaded', function () {
if (window.utag_data && window.utag_data.page_performance) {
var dcl = 'DCL ' + parseInt(performance.timing.domContentLoadedEventStart - performance.timing.domLoading);
var pp = window.utag_data.page_performance.split('|');
pp[1] = dcl;
utag_data.page_performance = pp.join('|');
} else {
console.warn('No utag_data.page_performance available');
}
});
}
if (window.PerformanceTiming && window.PerformanceObserver) {
var observer = new PerformanceObserver(function (list) {
var entries = list.getEntries();
var _loop = function _loop(i) {
var entry = entries[i];
var metricName = entry.name;
var time = Math.round(entry.startTime + entry.duration);
if (typeof newrelic !== 'undefined') {
newrelic.setCustomAttribute(metricName, time);
}
if (entry.name === 'first-contentful-paint' && window.utag_data && window.utag_data.page_performance) {
var fcp = 'FCP ' + parseInt(entry.startTime);
var pp = utag_data.page_performance.split('|');
pp[0] = fcp;
utag_data.page_performance = pp.join('|');
} else {
window.addEventListener('DOMContentLoaded', function () {
if (window.utag_data && window.utag_data.page_performance) {
var _fcp = 'FCP ' + parseInt(entry.startTime);
var _pp = utag_data.page_performance.split('|');
_pp[0] = _fcp;
utag_data.page_performance = _pp.join('|');
} else {
console.warn('No utag_data.page_performance available');
}
});
}
};
for (var i = 0; i < entries.length; i++) {
_loop(i);
}
});
if (window.PerformancePaintTiming) {
observer.observe({
entryTypes: ['paint', 'mark', 'measure']
});
} else {
observer.observe({
entryTypes: ['mark', 'measure']
});
}
}
</script> <script>
if (window && typeof newrelic !== 'undefined') {
newrelic.setCustomAttribute('browserWidth', window.innerWidth);
}
</script> <title>MET | MetLife Inc. Annual Income Statement - WSJ</title> <link rel="canonical" href="https://www.wsj.com/market-data/quotes/MET/financials/annual/income-statement">
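The string handling in the script above can be mirrored in Python to make it easier to follow: `utag_data.page_performance` is treated as a `|`-separated string, slot 0 is overwritten with the FCP value and slot 1 with the DCL value. The sketch below is an illustration only; `set_slot` is a hypothetical name, and the starting value `"FCP 0|DCL 0"` in the usage note is an assumption, since the page source does not show the initial string.

```python
def set_slot(page_performance, index, label, value_ms):
    """Mirror of the script's split / assign / join pattern (hypothetical helper)."""
    parts = page_performance.split("|")
    # int() truncates the fractional part, as parseInt does for these timings.
    # Unlike a JavaScript array, a Python list raises IndexError if the slot is missing.
    parts[index] = f"{label} {int(value_ms)}"
    return "|".join(parts)
```

For example, `set_slot("FCP 0|DCL 0", 0, "FCP", 812.6)` returns `"FCP 812|DCL 0"`.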
Conclusion
This indicates that the website is protected by rigorous bot-management techniques: navigation in a browsing context initiated by a Selenium-driven WebDriver is detected and subsequently blocked.
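One widely known signal that a page can read is `navigator.webdriver`, which the WebDriver specification defines as true while the browser is under automation. Whether this particular site relies on it is an assumption, not something the page source above confirms. A minimal check, with `reports_webdriver_flag` as a hypothetical helper and `driver` as any object exposing Selenium's `execute_script`:

```python
def reports_webdriver_flag(driver):
    """Return True if the page sees navigator.webdriver as truthy (hypothetical helper)."""
    # execute_script may return True, False, or None depending on the browser state.
    return bool(driver.execute_script("return navigator.webdriver;"))
```

If this returns `True` even with the `excludeSwitches` and `useAutomationExtension` options set, the browsing context is still identifiable as automated through this flag alone.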
Reference
You can find relevant discussions in: