Suppose I only need the lines of a text file that contain two names, such as:
<td >Jacob</td> <td>273,844</td> <td >Emily</td> <td>223,690</td></tr>
The text file contains the following:
<tr >
<th style="text-align:right; background-color:white; color:black" scope="col">Rank</th>
<th style="text-align:right; background-color:white; color:black" scope="col" abbr="male name">Name</th>
<th style="text-align:right; background-color:white; color:black" scope="col" abbr="male number">Number</th>
<th style="text-align:right; background-color:white; color:black" scope="col" abbr="female name">Name</th>
<th style="text-align:right; background-color:white; color:black" abbr="female number">Number</th>
</tr>
</thead>
<tbody>
<tr ><td>1</td>
<td >Jacob</td> <td>273,844</td> <td >Emily</td> <td>223,690</td></tr>
<tr ><td>2</td>
<td >Michael</td> <td>250,554</td> <td >Madison</td> <td>193,152</td></tr>
<tr ><td>3</td>
<td >Joshua</td> <td>231,926</td> <td >Emma</td> <td>181,257</td></tr>
<tr ><td>4</td>
<td >Matthew</td> <td>221,513</td> <td >Olivia</td> <td>156,000</td></tr>
<tr ><td>5</td>
Using the regex "^<td\s*>([a-zA-Z]+)<\/td\s*>.*<td\s*>([a-zA-Z]+)<\/td\s*>.*",
how do I use re.findall to extract only the names and collect them into a list?
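
For reference, here is a minimal sketch of one way this regex could be applied with re.findall. It is an illustration under assumptions, not a confirmed solution: the sample text is pasted inline rather than read from the file, the variable names (text, pattern, pairs, names) are made up for the example, and re.MULTILINE is added so that ^ can match at the start of every line instead of only at the start of the whole string.

```python
import re

# Sample lines copied from the file contents shown above. In practice the
# string would come from reading the txt file, e.g. open(path).read(),
# where "path" is a placeholder for the real file name.
text = """<tr ><td>1</td>
<td >Jacob</td> <td>273,844</td> <td >Emily</td> <td>223,690</td></tr>
<tr ><td>2</td>
<td >Michael</td> <td>250,554</td> <td >Madison</td> <td>193,152</td></tr>
"""

# Same pattern as in the question. The "/" does not need escaping in Python,
# and re.MULTILINE makes "^" anchor at the beginning of each line.
pattern = re.compile(
    r"^<td\s*>([a-zA-Z]+)</td\s*>.*<td\s*>([a-zA-Z]+)</td\s*>.*",
    re.MULTILINE,
)

# With two capture groups, findall returns one (male, female) tuple per
# matching line; lines such as "<tr ><td>1</td>" do not match at all.
pairs = pattern.findall(text)

# Flatten the tuples into a single list of names.
names = [name for pair in pairs for name in pair]

print(pairs)
print(names)
```

Because the pattern has two capture groups, re.findall returns tuples rather than whole lines, so the numbers and tags are dropped automatically and only the flattening step is needed to get one plain list.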
Thank you in advance.