Below is some of the HTML I'm trying to scrape using Python and Selenium.
<h2 class ="page-title">
Strange Video Titles
<span class="duration">28 min</span>
<span class="video-hd-mark">720p</span>
</h2>
Below is my code:
title = driver.find_element_by_class_name('page-title').text
print(title)
However, when I run this, it prints everything within the h2 tag, including the text in the span elements. I've tried adding [0] or [1] at the end to select only the first line of text, but that doesn't work. How can I print only the video title, which appears above the spans?
Edit - I think this is the solution
So I've decided to do the following:
title = driver.find_element_by_class_name('page-title').text
duration = driver.find_element_by_xpath('/html/body/div/div[4]/h2/span[1]').text
vid_quality = driver.find_element_by_xpath('/html/body/div/div[4]/h2/span[2]').text
# Remove the text of each span from the heading text, if present
if duration in title:
    title = title.replace(duration, "")
if vid_quality in title:
    title = title.replace(vid_quality, "")
# Trim the whitespace left behind by the removed span text
title = title.strip()
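The same idea can probably be written without hard-coding the absolute XPath of each span. Below is a minimal sketch, not a definitive implementation: own_text is a hypothetical helper name, and FakeElement is a stand-in I made up so the snippet runs without a browser. It assumes the element exposes .text and find_elements(by, value) the way a Selenium WebElement does, and that the string "xpath" is accepted as the locator strategy (it is the value of By.XPATH in Selenium).

```python
def own_text(element):
    """Return element.text with the text of its direct children removed."""
    text = element.text
    # "./*" selects the direct child elements (here, the two spans).
    for child in element.find_elements("xpath", "./*"):
        # Remove only the first occurrence of the child's text.
        text = text.replace(child.text, "", 1)
    return text.strip()


class FakeElement:
    """Hypothetical stand-in for a WebElement, for demonstration only."""

    def __init__(self, text, children=()):
        self.text = text
        self._children = list(children)

    def find_elements(self, by, value):
        # Ignores the locator and simply returns the stored children.
        return self._children


heading = FakeElement(
    "Strange Video Titles 28 min 720p",
    [FakeElement("28 min"), FakeElement("720p")],
)
print(own_text(heading))  # prints "Strange Video Titles"
```

With a real driver, the call would presumably be own_text(driver.find_element_by_class_name('page-title')). One caveat: if the title itself contains the same text as a span (for example a title that includes "720p"), the first occurrence is removed, which may be the wrong one.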
Thank you.