String.Replace doesn't seem to work properly when I replace part of an HTML file's content. For example, String.Replace replaces </body></html> with blah blah blah </body></html> html>. Notice the stray html> fragment after the closing tag: it is not a complete tag, so it shows up as text when the page is rendered in the browser.

Anyone know why it's not working as intended?

StreamReader sr = fi.OpenText();
String fileContents = sr.ReadToEnd();
sr.Close();
fileContents = fileContents.Replace("<body>", "<body onload='jsFx();' />");
fileContents = fileContents.Replace("</body>","blah blah blah </body>");

StreamWriter sw = new StreamWriter(fi.OpenWrite());
sw.WriteLine(fileContents);
sw.Close();
SharpC
Joey
  • Can you provide an example of your source file? The code you've submitted *should* work as you describe. I do not see any reason you'd get an extra ` html>` bit... – Nate Dec 02 '10 at 21:01
  • Is there any chance that that extraneous tag is already in the input file? Also I notice in the code example that you have an auto closed body tag, is that right? – MrEyes Dec 02 '10 at 21:05
  • Nate - thanks for the quick reply and cleanup. Not actual code, but close enough to get my point across. – Joey Dec 02 '10 at 21:45

2 Answers

I might rewrite your bit of code like this:

var fileContents = System.IO.File.ReadAllText(@"C:\File.html");

fileContents = fileContents.Replace("<body>", "<body onload='jsFx();' />"); 
fileContents = fileContents.Replace("</body>","blah blah blah </body>"); 

System.IO.File.WriteAllText(@"C:\File.html", fileContents);

I should note that this solution is fine for files of reasonable size: depending on hardware, anything under a few tens of MB. It loads the entire contents into memory. If you have a really large file, you may need to stream it through a few hundred KB at a time to prevent an OutOfMemoryException. That makes things a bit more complicated, since you'd also need to check the boundary between chunks in case it splits your search string.
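
As a rough illustration of that chunked approach (a sketch only, not part of the original answer): the `ChunkedReplace` class, its `Replace` method, and the `chunkSize` parameter are hypothetical names. The idea is to hold back the last `search.Length - 1` characters of each chunk so that a match straddling a chunk boundary is still found on the next pass. Matching is ordinal, like `String.Replace`.

```csharp
using System;
using System.IO;
using System.Text;

static class ChunkedReplace
{
    // Hypothetical helper: copies reader to writer, replacing every
    // occurrence of search (ordinal match) with replacement.
    public static void Replace(TextReader reader, TextWriter writer,
                               string search, string replacement,
                               int chunkSize = 256 * 1024)
    {
        if (string.IsNullOrEmpty(search))
            throw new ArgumentException("Search string must not be empty.", nameof(search));
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var buffer = new char[chunkSize];
        var pending = new StringBuilder(); // held-back tail + current chunk
        int read;

        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            pending.Append(buffer, 0, read);
            string text = pending.ToString();

            // Emit everything up to and including each complete match.
            int pos = 0;
            int idx;
            while ((idx = text.IndexOf(search, pos, StringComparison.Ordinal)) >= 0)
            {
                writer.Write(text.Substring(pos, idx - pos));
                writer.Write(replacement);
                pos = idx + search.Length;
            }

            // Hold back a tail that could be the start of a match
            // completed by the next chunk; flush the rest unchanged.
            int keep = Math.Min(search.Length - 1, text.Length - pos);
            int flushEnd = text.Length - keep;
            writer.Write(text.Substring(pos, flushEnd - pos));

            pending.Clear();
            pending.Append(text, flushEnd, keep);
        }

        // Whatever is left is shorter than the search string, so it cannot match.
        writer.Write(pending.ToString());
    }
}
```

Note that a streaming version like this has to write to a different file than the one it reads (for example a temporary file that is moved over the original afterwards), since the input is still being read while output is produced.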

Nate
  • One great feature of this is it's the only answer I have seen anywhere that actually preserves the new line characters exactly as they were in the original file. I am reading a XAML file and this approach means that new line characters within elements that are split over several lines are kept intact! – Ewan Dec 18 '18 at 15:50

There's nothing wrong with string.Replace here.

What is wrong is that you're overwriting the file but not truncating it... so if you changed your writing code to just

sw.WriteLine("Start");

you'd see "Start" and then the rest of the file.
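
A small sketch that reproduces this behaviour in isolation, assuming a throwaway temp file (the `OpenWriteDemo` class and `Run` method are hypothetical names used only for illustration):

```csharp
using System;
using System.IO;

static class OpenWriteDemo
{
    // Writes a short string over a longer existing file via OpenWrite
    // and returns what the file contains afterwards.
    public static string Run()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "0123456789");

            var fi = new FileInfo(path);
            // OpenWrite positions the stream at the start of the existing
            // file but does not truncate it.
            using (var sw = new StreamWriter(fi.OpenWrite()))
            {
                sw.Write("Start");
            }

            // The first five characters are overwritten; the old tail remains.
            return File.ReadAllText(path); // "Start56789"
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```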

I would recommend that you use File.ReadAllText and File.WriteAllText instead (take the path from the FileInfo). That way:

  • It will completely replace the file, instead of just overwriting
  • You don't need to worry about closing the reader/writer/stream properly (which you're not doing now - if an exception occurs, you're leaving the reader or writer open)

If you really want to use the FileInfo methods, use FileInfo.Open(FileMode.Create) which will truncate the file.
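
For the FileInfo route, a minimal sketch might look like the following. The `HtmlPatcher` class and `Patch` method are hypothetical names; the `using` statements make sure the reader and writer are disposed even if an exception is thrown, and `FileMode.Create` truncates the existing file before writing.

```csharp
using System;
using System.IO;

static class HtmlPatcher
{
    // Reads the whole file, applies one replacement, and rewrites the file
    // from scratch so no stale bytes from the old content are left behind.
    public static void Patch(FileInfo fi, string search, string replacement)
    {
        string fileContents;
        using (StreamReader sr = fi.OpenText())
        {
            fileContents = sr.ReadToEnd();
        }

        fileContents = fileContents.Replace(search, replacement);

        // FileMode.Create truncates the file if it already exists.
        using (FileStream fs = fi.Open(FileMode.Create))
        using (var sw = new StreamWriter(fs))
        {
            sw.Write(fileContents);
        }
    }
}
```

With the question's example, a call such as `HtmlPatcher.Patch(fi, "</body>", "blah blah blah </body>")` would apply the second replacement.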

Jon Skeet
  • Jon - Thanks for the quick answer and explanation. Please explain why I wouldn't need to close the reader/writer/stream in the above example. - I realize the code I provided is dirty. It's not copied from development, but rather just trying to get my question out. – Joey Dec 02 '10 at 21:52
  • @Joey: You only close them if there's no exception. You should use `using` statements to dispose of them whatever happens - it's the equivalent to try/finally. – Jon Skeet Dec 02 '10 at 21:55
  • @Joey a little late now, but note that your "new" start tag has an extra / in it, terminating the xml block. It should *not* have the closing slash: – sirthomas Nov 30 '13 at 11:20