I have an auto-generated HTML file (in a local directory) with the whole body on one line:

<html><head><META http-equiv="Content-Type" content="text/html; charset=UTF-8"><title>server - path</title></head><body><H1>server - path</H1><hr>

<pre><A HREF="/logs/folder/">[To Parent Directory]</A><br><br>         jeudi 5 janvier 2017    19:38       116483 <A HREF="/folder/file1.csv">file1.csv</A><br>         jeudi 5 janvier 2017    19:39       138397 <A HREF="/folder/file2.csv">file2.csv</A></A><br></pre><hr></body></html>

I need to extract the name and date of each file. I managed to read the right line, but I'm stuck on splitting that line on <br>.

I tried something like this:

$string = "first line<br>second line <br> third line<br> end<br>"
write-host $string
$separator = "<br>"
$option = [System.StringSplitOptions]::RemoveEmptyEntries
$string.Split($separator, $option)

But this is the result I get:

first line<br>second line <br> third line<br> end<br>
fi
st line
second line
thi
d line
end

I looked at the HTML Agility Pack, but in my case I don't have any tags in my page.

Do you have any advice? Thanks!

Bernard Vander Beken
Seishiro
  • This post seems to answer your question: http://stackoverflow.com/questions/16435240/how-to-split-string-by-string-in-powershell – c3st7n Jan 20 '17 at 09:17

1 Answer


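The String.Split() method takes your string <br> and treats it as a [char] array, splitting on every single occurrence of any of the characters <, b, r and >. That is why "first" comes back as "fi" and "st line": the r is consumed as a separator.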
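To make that behaviour visible, here is a minimal sketch that passes the four characters to Split() explicitly as a [char[]]. The variable names $chars and $pieces are only illustrative; the input string is the one from the question.

```powershell
$string = "first line<br>second line <br> third line<br> end<br>"
$option = [System.StringSplitOptions]::RemoveEmptyEntries

# Casting the string to [char[]] yields the four separate characters <, b, r, >
$chars = [char[]]"<br>"

# Split(char[], StringSplitOptions) breaks the string at each of those characters,
# so the "r" in "first" and "third" also acts as a separator
$pieces = $string.Split($chars, $option)
$pieces
```

This should reproduce the fragmented output shown in the question, which suggests the separator was being handled character by character rather than as one string.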

Use the regex-based -split operator instead:

PS C:\> $String -split $separator |Where-Object {$_}
first line
second line 
 third line
 end

The Where-Object {$_} pipeline element filters out the empty strings that -split leaves behind.
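Applied to the line from the question, the same -split approach could be combined with -match to pull out the file name and date. This is only a sketch: the $pattern regex, its group names (Date, Time, Size, Name) and the $files variable are assumptions based on the two sample entries shown, not something taken from the original post.

```powershell
# Sample line copied from the question
$line = '<pre><A HREF="/logs/folder/">[To Parent Directory]</A><br><br>         jeudi 5 janvier 2017    19:38       116483 <A HREF="/folder/file1.csv">file1.csv</A><br>         jeudi 5 janvier 2017    19:39       138397 <A HREF="/folder/file2.csv">file2.csv</A></A><br></pre>'

# Hypothetical pattern: date text, time (hh:mm), size in bytes, then the link text
$pattern = '^\s*(?<Date>.+?)\s+(?<Time>\d{1,2}:\d{2})\s+(?<Size>\d+)\s+<A HREF="[^"]*">(?<Name>[^<]+)</A>'

# Split on the literal <br> and keep only the chunks that look like a file entry
$files = foreach ($entry in ($line -split '<br>')) {
    if ($entry -match $pattern) {
        [pscustomobject]@{
            Name = $Matches['Name']
            Date = $Matches['Date']
            Time = $Matches['Time']
        }
    }
}
$files
```

Chunks that do not match the pattern (the parent-directory link, the empty string between the two consecutive <br> tags, and the closing </pre>) are skipped by the if test, so no separate Where-Object filter is needed here.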

Mathias R. Jessen