I have the following html
<html>
<body>
<p style="text-align:center;margin-bottom:0pt;margin-top:0pt;text-indent:0%;font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">
<a name="_marker_1"></a>
<a name="bananabread"></a>
<font style="font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">
<a name="bananabread"></a>Ban</font> <font style="font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">ana Bread</font>
</p>
<p style="text-align:center;margin-top:10pt;margin-bottom:0pt;text-indent:0%;font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">The Best You Ever Tasted</p>
<p style="margin-top:24pt;margin-bottom:0pt;text-indent:7.69%;font-style:italic;font-family:Times New Roman;font-size:10pt;font-weight:normal;text-transform:none;font-variant: normal;">If you don't agree that this is the best banana bread you have ever eaten well I would suggest you see your doctor</p>
<p style="margin-top:10pt;margin-bottom:0pt;text-indent:7.69%;font-family:Times New Roman;font-size:10pt;font-weight:normal;font-style:normal;text-transform:none;font-variant: normal;">Lots of text here describing what I am trying to capture</p>
<p style="text-align:center;margin-bottom:0pt;margin-top:0pt;text-indent:0%;font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">
<a name="_marker_2"></a>
<a name="bananapudding"></a>
<font style="font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">
<a name="bananapudding"></a>Banana</font>
<font style="font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">Pudding</font>
</p>
<p style="text-align:center;margin-top:10pt;margin-bottom:0pt;text-indent:0%;font-weight:bold;font-family:Times New Roman;font-size:10pt;font-style:normal;text-transform:none;font-variant: normal;">Creamy and Satisfying</p>
<p style="margin-top:24pt;margin-bottom:0pt;text-indent:7.69%;font-style:italic;font-family:Times New Roman;font-size:10pt;font-weight:normal;text-transform:none;font-variant: normal;">This is the same recipe your mother used when you were ten!</p>
<p style="margin-top:10pt;margin-bottom:0pt;text-indent:7.69%;font-family:Times New Roman;font-size:10pt;font-weight:normal;font-style:normal;text-transform:none;font-variant: normal;">Lots of text here describing what I am trying to capture</p>
</body>
</html>
I am trying to write an xpath expression to identify Banana Bread - my initial efforts were successful -
b_tree.xpath('.//*[starts-with(text(),"Banana Bread")]')
but I notice the error cases and upon investigation they are like the html above - another element is added inside the content I am searching for. Sometimes it is like above, a possibly unneeded font element, sometimes it is an anchor.
I worked with this answer (Related) but have not been successful
I can check for elements that have text_content() - clean up the text_content and then string match to my ultimate goal but I am hoping to learn to better apply xpath to these types of problems.
To be absolutely clear I need the text_content of the p element. But sometimes I just need the text of a font element. My existing XPATH expression works fine on the cases where there is not an intervening element. I do not know when I open the page the structure that was imposed on the document.