I'm using a regex to parse HTML. The pattern works as desired on regex101 but returns no matches in Python, where I run the search with re.findall(regex, text).
regex = (?:Airplane|Rotorcraft) Life Cycle(.*?)>(Accident Threats|Accident Threat Categories)<.*>(Operations|Groupings|Industries)<.*?<td(.*?)<\/td>.*?<td(.*?)<\/td>.*(Accident Common Themes).*?<td(.*?)<\/td>
text =
\n<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">\n\n<!-- V 3.0.2 -->\n<html>\n<head>\n<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>\n<title>Lessons Learned</title>\n<meta content="Federal Aviation Administration" name="author"/>\n<meta content="This Lessons Learned From Transport Airplane Accidents Library is intended to provide information in order to aid in the continual improvement of the safety of commercial air travel." name="description"/>\n<meta content="Boeing 737,Boeing 737-300,rudder,rudder reversal,USAir,USAir 427,United Airlines,United Airlines Flight 585,Pittsburgh,Aliquippa,wake turbulance,NTSB,National Transportation Safety Board,FAA,Federal Aviation Administration,737 Critical Design Review,737 Engineering Test and Evaluation Board," name="keywords"/>\n<link href="styles/master.css" media="screen,projection" rel="stylesheet" type="text/css"/>\n<!--[if lte IE 6]>\r\n\t<link href="styles/ie6.css" rel="stylesheet" type="text/css" media="screen,projection">\r\n\t<![endif]-->\n<!--[if IE 7]>\r\n\t<link href="styles/ie7.css" rel="stylesheet" type="text/css" media="screen,projection">\r\n\t<![endif]-->\n<!-- User Styles -->\n<!-- Print Instructions -->\n<link href="styles/print.css" media="print" rel="stylesheet" type="text/css"/>\n<!-- FAVICON -->\n<link href="images/layout/favicon.ico" rel="shortcut icon" type="image/ico"/>\n<link href="styles/ll_styles.css" rel="stylesheet" type="text/css"/>\n<script type="text/javascript">\r\n\tfunction MM_jumpMenu(targ,selObj,restore){ //v3.0\r\n \teval(targ+".location=\'"+selObj.options[selObj.selectedIndex].value+"\'");\r\n\tif (restore) selObj.selectedIndex=0;\r\n\t}\r\n</script>\n</head>\n<body>\n<ul class="hidden" id="accessibility">\n<li><a href="#mainContent">Skip to page content</a></li>\n</ul>\n<div id="container">\n<div id="head">\n<a class="logo" href="http://www.faa.gov/" title="FAA Homepage"><img alt="Federal Aviation Administration" 
height="75" src="images/layout/logoPrint.png" width="209"/></a>\n<div id="headlink">\n<a href="index.cfm">Lessons Learned From Civil Aviation Accidents Home</a>\n</div>\n</div>\n<div class="content" id="mainContent">\n<!-- USER CODE BEGINS -->\n<!-- End Header --><!-- USER CODE BEGINS -->\n<!-- USER CODE BEGINS -->\n<div id="hNav">\n<ul class="l1">\n<li><a href="index.cfm">Home</a></li>\n<li><a href="general.cfm">Small Airplane</a></li>\n<li class="here"><a href="transport.cfm">Transport Airplane</a></li>\n<li><a href="rotorcraft.cfm">Rotorcraft</a></li>\n</ul>\n<ul class="l2">\n<li><a href="transport.cfm">Library Home</a></li>\n<li class="here"><a href="ll_main.cfm?TabID=1">View All Accidents</a></li>\n<li><a href="ll_main.cfm?TabID=2">Airplane Life Cycle</a></li>\n<li><a href="ll_main.cfm?TabID=3">Threat Categories / Groupings</a></li>\n<li><a href="ll_main.cfm?TabID=4">Common Themes</a></li>\n<li><a href="ll_main.cfm?TabID=5">Searching/Sorting</a></li>\n<li><a href="ll_site_map.cfm">Site Map</a></li>\n</ul></div>\n<div id="vNav">\n<ul>\n<li>\n<ul>\n<li class="isLB"><a class="here" href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=0">Boeing 747-400 BCF<br/>National Airlines Flight 102, N949CA</a>\n<ul>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=2">Accident Overview</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=3">Accident Board Findings</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=4">Accident Board Recommendations</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=5">Relevant Regulations / Policy / Background</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=6">Prevailing Cultural / Organizational Factors</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=7">Key Safety Issue(s)</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=8">Safety Assumptions</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=9">Precursors</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=10">Resulting 
Safety Initiatives</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=11">Airworthiness Directives (ADs) Issued</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=13">Common Themes</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=14">Related Accidents / Incidents</a></li>\n<li><a href="ll_main.cfm?TabID=1&LLID=86&LLTypeID=12"><b>Lessons Learned</b></a></li>\n</ul>\n</li>\n</ul>\n</li>\n</ul>\n</div>\n<div id="vNavContent">\n<div class="imgRight"><img alt="Photo of accident airplane" height="203" src="../../../National102/National102_overview.jpg" title="Photo of accident airplane" width="300"><br>Photo of accident airplane<br/>Photo copyright Luc Van Belleghem - used with permission\r\n\t\t\t\t\t\t\t\t\t<br/>\n<div class="imgNorm_Per">\n<table border="0" cellpadding="0" cellspacing="0" summary="Search results based on life cycle, categories or themes that were selected. There are two levels of row headings" width="100%">\n<caption>Accident Perspectives:</caption>\n<thead>\n<tr>\n<th colspan="2" valign="top">Airplane Life Cycle</th>\n</tr>\n</thead>\n<tr>\n<td colspan="2" valign="top"><ul>\n<li>Operational<br/></li></ul>\n</td>\n</tr>\n<thead> <tr>\n<th valign="top">Accident Threat Categories</th><th valign="top">Groupings</th>\n</tr>\n</thead> <tr>\n<td valign="top">\n<ul>\n<li>Cabin Safety / Hazardous Cargo</li>\n<li>Structural Failure</li>\n<li>In-flight Upsets</li></ul></td>\n<td valign="top">\n<ul><li>Loss of Control</li></ul>\n</td>\n</tr>\n<thead>\n<tr>\n<th colspan="2" valign="top">Accident Common Themes</th>\n</tr>\n</thead> <tr>\n<td colspan="2" valign="top">\n<ul>\n<li>Organizational Lapses</li><li>Human Error</li><li>Flawed Assumptions</li><li>Pre-existing Failures</li>\n</ul>\n</td>\n</tr>\n</table>\n</div>\n</br></img></div>\n<h1>Boeing 747-400 BCF<br/>National Airlines Flight 102, N949CA</h1>\n<h2>Bagram, Afghanistan</h2>\n<h2>April 29, 2013</h2>\n<p>National Airlines Flight 102, a Boeing 747-400 BCF (Boeing Converted 
Freighter), was a scheduled cargo flight from Bagram Air Base, Bagram, Afghanistan to Dubai, United Arab Emirates on April 29, 2013. During takeoff, the airplane immediately climbed steeply then descended in a manner consistent with an aerodynamic stall and crashed. All seven crewmembers - the captain, first officer, loadmaster, augmented captain and first officer, and two mechanics were killed. The airplane was destroyed by impact and a post-crash fire ensued.</p><p>The National Transportation Safety Board (NTSB) determined that the probable cause of this accident was National Airlinesâ\x80\x99 inadequate procedures for restraining special cargo loads. This resulted in the loadmasterâ\x80\x99s improper restraint of the cargo, which moved aft and damaged hydraulic systems Nos. 1 and 2 and horizontal stabilizer drive mechanism components, rendering the airplane uncontrollable. The inadequate procedures: 1) failed to include required, safety-critical restraint information from the airplane manufacturer (Boeing) and the manufacturer of the main deck cargo handling system (Telair) and 2) contained incorrect and unsafe methods for restraining special cargo. 
Investigators also concluded that a factor contributing to the accident was the failure of the Federal Aviation Administration (FAA) to adequately oversee National Airlines\' handling of special cargo loads.</p>\n</div>\n<!-- USER CODE ENDS -->\n</div>\n<div id="footer">\n<div class="address">\n<p><a href="http://www.dot.gov/" title="Department of Transportation">U.S.\xa0Department\xa0of\xa0Transportation</a><br/>\r\n\t\t\t\tFederal\xa0Aviation\xa0Administration<br/>\r\n\t\t\t\t800\xa0Independence\xa0Avenue,\xa0SW<br/>\r\n\t\t\t\tWashington,\xa0DC\xa020591<br/>\r\n\t\t\t\t1-866-TELL-FAA\xa0(1-866-835-5322)</p>\n</div>\n<div class="midSection divide">\n<p class="title">Readers & Viewers</p>\n<ul class="readersViewers">\n<li class="pdf"><a href="http://www.faa.gov/viewer_redirect.cfm?server_name=www.faa.gov&viewer=pdf" title="PDF">PDF</a></li>\n<li class="ppt"><a href="http://www.faa.gov/viewer_redirect.cfm?server_name=www.faa.gov&viewer=ppt" title="Powerpoint">Powerpoint</a></li>\n<li class="zip"><a href="http://www.faa.gov/viewer_redirect.cfm?server_name=www.faa.gov&viewer=zip" title="Zip">Zip</a></li>\n<li class="doc"><a href="http://www.faa.gov/viewer_redirect.cfm?server_name=www.faa.gov&viewer=doc" title="Word">Word</a></li>\n<li class="xls"><a href="http://www.faa.gov/viewer_redirect.cfm?server_name=www.faa.gov&viewer=xls" title="Excel">Excel</a></li>\n</ul>\n<p class="title">Web Policies</p>\n<ul>\n<li><a href="http://www.faa.gov/web_policies/">Web Policies & Notices</a></li>\n<li><a href="http://www.faa.gov/privacy/">Privacy Policy</a></li>\n<li><a href="https://www.faa.gov/web_policies/vulnerability_disclosure_policy/">Vulnerability Disclosure Policy</a></li>\n</ul>\n</div>\n<div class="midSection divide">\n<p class="title">Government Sites</p>\n<ul>\n<li><a href="http://www.dot.gov/">DOT.gov</a></li>\n<li><a href="http://www.usa.gov/">USA.gov</a></li>\n<li><a href="http://www.plainlanguage.gov/">Plainlanguage.gov</a></li>\n<li><a 
href="http://www.regulations.gov/">Regulations.gov</a></li>\n<li><a href="http://www.data.gov/">Data.gov</a></li>\n</ul>\n</div>\n<div class="left divide">\n<p class="title">Contact Us</p>\n<ul>\n<a href="ll_contact.cfm">Contact Information</a>\n</ul>\n</div>\n</div>\n</div>\n<!-- Load JS Libraries -->\n<script src="javascript/aria-tabs.js" type="text/javascript"></script>\n<!-- USER JS -->\n</body>\n</html>
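
One possible explanation, sketched here under assumptions: in Python, `.` does not match a newline unless `re.DOTALL` is passed, and the pattern above has to cross several line breaks between the table headers and cells. If the text pasted into regex101 was the printed form shown above (where `\n` is a literal backslash followed by `n`), or the single-line flag was enabled there, the same pattern would match on regex101 and fail in Python. The `sample` string below is a hypothetical, shortened excerpt of the perspectives table, not the full page.

```python
import re

# The pattern from the question, unchanged, split across raw strings.
PATTERN = (
    r'(?:Airplane|Rotorcraft) Life Cycle(.*?)>'
    r'(Accident Threats|Accident Threat Categories)<.*>'
    r'(Operations|Groupings|Industries)<.*?<td(.*?)<\/td>.*?<td(.*?)<\/td>'
    r'.*(Accident Common Themes).*?<td(.*?)<\/td>'
)

# Hypothetical shortened excerpt of the table, with real newlines
# as they would exist in the downloaded page.
sample = (
    '<th colspan="2">Airplane Life Cycle</th>\n'
    '<td colspan="2"><ul>\n<li>Operational</li></ul>\n</td>\n'
    '<th>Accident Threat Categories</th><th>Groupings</th>\n'
    '<td>\n<ul>\n<li>Structural Failure</li></ul></td>\n'
    '<td>\n<ul><li>Loss of Control</li></ul>\n</td>\n'
    '<th colspan="2">Accident Common Themes</th>\n'
    '<td colspan="2">\n<ul>\n<li>Human Error</li>\n</ul>\n</td>\n'
)

# Default behaviour: "." stops at every newline, so nothing matches.
without_flag = re.findall(PATTERN, sample)

# With re.DOTALL, "." also matches newlines and the pattern can span lines.
with_dotall = re.findall(PATTERN, sample, flags=re.DOTALL)

print(without_flag)          # []
print(len(with_dotall))      # 1
print(with_dotall[0][1:3])   # ('Accident Threat Categories', 'Groupings')
```

On the full page, note that "Airplane Life Cycle" also appears in the navigation list, so with `re.DOTALL` the first capture group may start there and contain far more than the table header.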