
I have an HTML file with this text:

...
...
...weeks*<br><br><i>If Yes  
, please complete the MYD88 L265P Blood form.<br><br>Optional if the Follow-Up v  
isit date is on or after 9/13/2017 (Amendment #8)<br></i>*Offer1  
...
...

I want to remove everything that is between &lt;br&gt; and &lt;/i&gt;.

I am trying this, but it's not working as the search needs to be performed in multiple lines:

powershell -Command "(gc myFile.XLS) -replace '&lt;br&gt.*&lt;/i&gt;', '' | Out-File myFile1.XLS"
Ansgar Wiechers
Kal
    "*it's not working as the search needs to be performed in multiple lines*" - and because XLS files aren't plain text, and because the Regex engine doesn't understand HTML entity encoding. – TessellatingHeckler Nov 16 '17 at 00:03

1 Answer


How about this? It matches content that spans multiple lines between the <i> and </i> tags.

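# -Raw (PowerShell 3.0+) reads the file as one string so a match can span lines; output goes to a separate file, as in the question.
[Regex]::Replace((Get-Content .\myFile.XLS -Raw), '<br>.*<br>|<i>[\s\S]*?</i>', '') > myFile1.XLS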
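
To see the multi-line matching in isolation, here is a minimal sketch that runs on an in-memory string instead of the file. It assumes the goal is to delete the whole span from the first <br> through the closing </i>, delimiters included, which is what the pattern in the question attempts. The variable names $html and $cleaned are only for illustration.

```powershell
# Sample text standing in for the file contents (copied from the question),
# with `r`n marking the line breaks that defeat a line-by-line -replace.
$html = "...weeks*<br><br><i>If Yes`r`n, please complete the MYD88 L265P Blood form.<br><br>Optional if the Follow-Up v`r`nisit date is on or after 9/13/2017 (Amendment #8)<br></i>*Offer1"

# (?s) makes '.' match newlines as well; '.*?' is lazy, so the match
# stops at the first </i> rather than the last one in the text.
$cleaned = $html -replace '(?s)<br>.*?</i>', ''

$cleaned   # ...weeks**Offer1
```

For the real file, the same pattern should work if the whole file is read as a single string, for example with (Get-Content .\myFile.XLS -Raw), which needs PowerShell 3.0 or later. Without -Raw, Get-Content returns one string per line and -replace is applied to each line separately, so no pattern can match across a line break.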