5

I have the following PowerShell script:

$oldCode =  @"
            <div id="time_estimate">
                <!-- some large table -->
            </div>
"@

$newCode = @"
            <div id="time_estimate">
                                <!-- nested divs and spans -->
                                <div id="contact-form">

                                        <?php include "contact-form.php"; ?>
                                </div>
                        </div>
"@

ls *.html | foreach { 
        $fileContent = [System.Io.File]::ReadAllText($_.FullName)
        $newFileContent = $fileContent.Replace($oldCode, $newCode)
        [System.Io.File]::WriteAllText($_.FullName, $newFileContent)
        Write-Host  "`r`n"
        Write-Host  "Processed - $($_.Name)...`r`n" }

This doesn't seem to be replacing the text. Is it an issue with the multiline strings, or the limits of the Replace() method? I would prefer to do the replace without bringing in regex.

KalenGi

3 Answers

5

What version of PowerShell are you using? If you're using v3 or higher, try this:

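ls *.html | foreach { 
    # -Raw (v3+) reads the whole file as one string instead of an array of lines
    $fileContent = Get-Content $_.FullName -Raw
    # -replace treats its pattern as a regex, so escape $oldCode to match it literally
    $newFileContent = $fileContent -replace [regex]::Escape($oldCode), $newCode
    Set-Content -Path $_.FullName -Value $newFileContent
    Write-Host  "`r`n"
    Write-Host  "Processed - $($_.Name)...`r`n" 
}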
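
If you would rather avoid regex altogether, as the question asks, a literal Replace() can still work once both sides use the same line endings. One possible cause of a silent non-match (an assumption; the question does not confirm it) is that the here-strings in the script use one line-ending style and the HTML files another. The sketch below normalizes both before replacing. `Update-LiteralBlock` is a hypothetical helper name, and writing CRLF back out assumes Windows-style files.

```powershell
# Hypothetical helper: literal (non-regex) multiline replace that tolerates
# CRLF vs LF differences between the file content and the here-strings.
function Update-LiteralBlock {
    param(
        [string]$Text,
        [string]$Old,
        [string]$New
    )
    # Normalize every CRLF to LF so the comparison is not defeated by line endings.
    $normText = $Text.Replace("`r`n", "`n")
    $normOld  = $Old.Replace("`r`n", "`n")
    $normNew  = $New.Replace("`r`n", "`n")

    # String.Replace is a literal, case-sensitive substitution; no regex involved.
    $result = $normText.Replace($normOld, $normNew)

    # Assumption: the files should end up with CRLF line endings.
    return $result.Replace("`n", "`r`n")
}
```

Inside the loop it would be used as `$newFileContent = Update-LiteralBlock -Text $fileContent -Old $oldCode -New $newCode`. Indentation still has to match exactly, since only line endings are normalized.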
KevinD
  • This is pretty much how I would approach the issue too. The `-Raw` switch is a nice addition to PowerShell v3. In v2, I often relied on `Get-Content | Out-String` to achieve similar results. There's a little more detail in [this answer](http://stackoverflow.com/a/11016718/316621) to a related question. – ajk Dec 17 '13 at 19:33
  • I upgraded to version 3 and applied this method – KalenGi Dec 17 '13 at 23:36
  • 1
    If anyone wonders about a one-liner for ONE file it could go like this: `(Get-Content -raw file.txt) -replace $oldCode, $newCode | Set-Content file.txt` – Henrik Sep 18 '19 at 12:05
1

For Pete's sake, don't even think about using regex for HTML.

The problem you ran into is that reading a file with Get-Content gives you an array of strings, one per line. Replace() doesn't know about arrays, so a multiline search string never matches and you have to handle that by hand. You could create one big string with -join, like so:

$fileContent = Get-Content $_.FullName
$theOneString = $fileContent -join ' '
$theOneString.Replace($foo, $bar)

... But this will mess up your line breaks. Then again, you could reformat the string with HTML Tidy.

The manual way is to iterate over the source array row by row. Until you find the <div>, copy the contents into a new destination array. After finding the replaceable part, insert the new content into the destination array. Keep reading and discarding lines from the source array until you find the </div>, then copy all the rest into the destination array. Finally, save the destination array's contents and you are done.
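
The row-by-row approach can be sketched roughly as follows. `Update-BlockByLine` and its parameter names are hypothetical, and the sketch assumes that the start and end markers sit on separate lines and that the old block contains no nested </div> (the first closing marker found ends the discarded region).

```powershell
# Hypothetical helper: replace everything from the line containing $StartMarker
# through the next line containing $EndMarker with $NewLines.
function Update-BlockByLine {
    param(
        [string[]]$Lines,
        [string]$StartMarker,
        [string]$EndMarker,
        [string[]]$NewLines
    )
    $result = New-Object System.Collections.Generic.List[string]
    $skipping = $false
    foreach ($line in $Lines) {
        if (-not $skipping) {
            if ($line.Contains($StartMarker)) {
                # Start of the old block: emit the replacement and begin discarding.
                $result.AddRange($NewLines)
                $skipping = $true
            } else {
                # Outside the old block: copy the line through unchanged.
                $result.Add($line)
            }
        } elseif ($line.Contains($EndMarker)) {
            # The end-marker line is discarded too; copying resumes on the next line.
            $skipping = $false
        }
    }
    # The leading comma keeps the result an array even when it has one element.
    return ,$result.ToArray()
}
```

With the lines from `Get-Content`, a call might look like `Update-BlockByLine -Lines $lines -StartMarker '<div id="time_estimate">' -EndMarker '</div>' -NewLines ($newCode -split "`r?`n")`, followed by `Set-Content` to save the result.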

Community
vonPryz
  • 1
    Thanks very much for the link to that hilarious answer! The laughter has helped quite a bit... – KalenGi Dec 17 '13 at 21:47
1

I wouldn't use string replacements for modifying HTML code. Too many things could develop in unexpected directions. Try something like this:

$newCode = @"
<!-- nested divs and spans -->
<div id="contact-form">
  <?php include "contact-form.php"; ?>
</div>
"@

Get-ChildItem '*.html' | % {
  $html = New-Object -COM HTMLFile
  $html.write([IO.File]::ReadAllText($_.FullName))
  $html.getElementById('time_estimate').innerHTML = $newCode
  [IO.File]::WriteAllText($_.FullName, $html.documentElement.outerHTML)
}

If needed, you can prettify the HTML by using Tidy:

$newCode = @"
<!-- nested divs and spans -->
<div id="contact-form">
  <?php include "contact-form.php"; ?>
</div>
"@

[Reflection.Assembly]::LoadFile('C:\path\to\Tidy.dll') | Out-Null
$tidy = New-Object Tidy.DocumentClass

Get-ChildItem '*.html' | % {
  $html = New-Object -COM HTMLFile
  $html.write([IO.File]::ReadAllText($_.FullName))
  $html.getElementById('time_estimate').innerHTML = $newCode
  $tidy.ParseString($html.documentElement.outerHTML)
  $tidy.SaveFile($_.FullName) | Out-Null
}
Ansgar Wiechers
  • This actually works, but messes with the formatting of the document by stripping out leading whitespace. – KalenGi Dec 17 '13 at 23:34