I have a text file. I want to extract only the lines that contain a file name, and only when that file name has a .doc or .pdf extension.
For example, given this input:
<TR><TD ALIGN="RIGHT">4.</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="ABC.pdf"> On Complex Analytic Manifolds</a></TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=72>L. Sam</TD>
</TR>
<TR><TD ALIGN="RIGHT">5.</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="DEF.doc"> On the Geometric theory of Fields</a>*</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=72>G.K. Ram</TD>
</TR>
Using Python's re.findall(), I want to get the following lines:
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="ABC.pdf"> On Complex Analytic Manifolds</a></TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="DEF.doc"> On the Geometric theory of Fields</a>*</TD>
Can anybody please suggest a scalable way to define the pattern for re.findall()?
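For reference, here is a minimal sketch of the kind of pattern in question. It assumes each link sits on a single line and that the file name appears inside a double-quoted href attribute, as in the sample. The names EXTENSIONS, LINE_PATTERN and find_document_lines are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical list of extensions to accept; add entries here to support more types.
EXTENSIONS = ("pdf", "doc")

# Build a single alternation from the list so the pattern grows with it.
ext_alternation = "|".join(re.escape(ext) for ext in EXTENSIONS)

# With re.MULTILINE, ^ and $ anchor at line boundaries, and "." never crosses
# a newline, so each match is exactly one whole line.
# The only group is non-capturing, so findall() returns the full matched line.
LINE_PATTERN = re.compile(
    r'^.*href="[^"]*\.(?:' + ext_alternation + r')".*$',
    re.IGNORECASE | re.MULTILINE,
)


def find_document_lines(text):
    """Return every line of text whose href points to a listed file type."""
    return LINE_PATTERN.findall(text)


sample = '''<TR><TD ALIGN="RIGHT">4.</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="ABC.pdf"> On Complex Analytic Manifolds</a></TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=72>L. Sam</TD>
</TR>
<TR><TD ALIGN="RIGHT">5.</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=50%><a href="DEF.doc"> On the Geometric theory of Fields</a>*</TD>
<TD ALIGN="LEFT" VALIGN="TOP" WIDTH=72>G.K. Ram</TD>
</TR>'''

for line in find_document_lines(sample):
    print(line)
```

Keeping the extensions in one tuple means supporting another file type is a one-word change rather than a rewrite of the pattern. If the input is arbitrary HTML rather than this regular one-link-per-line layout, an HTML parser is likely to be more robust than a regular expression.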