1

I want to add some text at a specific position in a specific line. This is what I got so far:

public void AddSomeTextToFile(string FilePath, int LineNumber, int IndexInLine, string TextToAdd)
{
    string[] WholeFile = File.ReadAllLines(FilePath);
    WholeFile[LineNumber - 1] = WholeFile[LineNumber - 1].Insert(IndexInLine, TextToAdd);
    File.WriteAllLines(FilePath, WholeFile);
}

This code, however, has some issues with encoding.

For example, text that was [image: Text] before running the method becomes [image: Text] after the method runs. I've tried using Encoding.UTF8 and Encoding.Unicode, both without success.

Is there any way to insert some text into a file and preserve any special characters?


Solution

Based on floele's code, this is the code that solved my problems:

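public void AddSomeTextToFile(string FilePath, int LineNumber, int IndexInLine, string TextToAdd)
{
    // Work on raw bytes so nothing is decoded or re-encoded along the way.
    byte[] bytes = File.ReadAllBytes(FilePath);
    // Each list keeps its trailing '\n' (see SplitOn below), so joining restores the file.
    List<List<byte>> lines = bytes.SplitOn((byte)'\n').ToList();
    byte[] bytesToInsert = Encoding.Default.GetBytes(TextToAdd);
    // IndexInLine is a byte offset into the line here, not a character index.
    lines[LineNumber - 1].InsertRange(IndexInLine, bytesToInsert);
    File.WriteAllBytes(FilePath, lines.SelectMany(x => x).ToArray());
}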

static class EnumerableExtensions
{
    public static IEnumerable<List<T>> SplitOn<T>(
        this IEnumerable<T> source,
        T delimiter)
    {
        var list = new List<T>();
        foreach (var item in source)
        {
            if (delimiter.Equals(item))
            {
                list.Add(item);
                yield return list;
                list = new List<T>();
            }
            else
            {
                list.Add(item);
            }
        }
        yield return list;
    }
}
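
One detail of the byte-based solution above: IndexInLine is now a byte offset rather than a character index, and the two differ as soon as the line contains characters that take more than one byte in the file's encoding. A minimal sketch of a conversion, assuming the line text and its encoding are known (the class and method names are hypothetical, not part of the original code):

```csharp
using System.Text;

static class ByteOffsets
{
    // Hypothetical helper: returns how many bytes the first charIndex characters
    // of line occupy when encoded with the given encoding.
    public static int CharIndexToByteOffset(string line, int charIndex, Encoding encoding)
    {
        return encoding.GetByteCount(line.Substring(0, charIndex));
    }
}
```

For a single-byte encoding such as windows-1252 the result equals charIndex, so the conversion only matters for encodings like UTF-8.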
jacobz

2 Answers

1

Encoding.UTF8 is actually the default encoding used by WriteAllLines and ReadAllLines. So if reading and writing using this encoding "corrupts" your data, you need to use a different one.

You need to determine what the original encoding of the file located at FilePath is and then specify it like this

File.ReadAllLines(FilePath, *encoding*);  
File.WriteAllLines(FilePath, WholeFile, *encoding*);

A likely encoding would be Encoding.Default (on .NET Framework this is the system's ANSI code page, e.g. windows-1252 on Western European Windows), so try that first. If that doesn't work, you have to check how the file is actually written before you append to it.
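
As a rough illustration of passing the same encoding to both calls, here is a minimal sketch. The class and method names are hypothetical, and the choice of encoding is left to the caller because the real encoding of the file is not known from the question.

```csharp
using System.IO;
using System.Text;

static class EncodingRoundTrip
{
    // Hypothetical helper: inserts text into one line, reading and writing
    // the file with the same explicitly chosen encoding.
    public static void InsertWithEncoding(
        string path, int lineNumber, int indexInLine, string textToAdd, Encoding encoding)
    {
        string[] lines = File.ReadAllLines(path, encoding);
        lines[lineNumber - 1] = lines[lineNumber - 1].Insert(indexInLine, textToAdd);
        // WriteAllLines ends every line with Environment.NewLine,
        // so the original line endings are not necessarily preserved.
        File.WriteAllLines(path, lines, encoding);
    }
}
```

A call could look like `EncodingRoundTrip.InsertWithEncoding(path, 1, 0, "abc", Encoding.GetEncoding("iso-8859-1"));`. ISO-8859-1 is used here only as an example of a single-byte encoding that maps every byte value to a character, so existing bytes within a line survive the round trip.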

However, if it contains a lot of non-character data as your screenshots indicate, maybe you have to consider the file to be a "binary" type. In this case you should use ReadAllBytes / WriteAllBytes, split the file manually into lines (searching the byte array for \r\n) and then insert new data at the desired locations. You need to convert strings to a byte array for this purpose using Encoding.GetBytes("...") (using the right encoding).

Taking some code from another linked answer, the full code for this would look like:

static class MyEnumerableExtensions
{
    //For a source containing N delimiters, returns exactly N+1 lists
    public static IEnumerable<List<T>> SplitOn<T>(
        this IEnumerable<T> source,
        T delimiter)
    {
        var list = new List<T>();
        foreach (var item in source)
        {
            if (delimiter.Equals(item))
            {
                yield return list;
                list = new List<T>();
            }
            else
            {
                list.Add(item);
            }
        }
        yield return list;
    }
}

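public void InsertLine()
{
    byte[] bytes = File.ReadAllBytes(...);
    // Split the raw bytes on '\n'. SplitOn as written above drops the delimiter bytes.
    List<List<byte>> lines = bytes.SplitOn((byte)'\n').ToList();
    string lineToInsert = "Insert this";
    // Encode the new text with the same encoding the file uses.
    byte[] bytesToInsert = Encoding.Default.GetBytes(lineToInsert);
    // Insert the new bytes as the third line (index 2).
    lines.Insert(2, new List<byte>(bytesToInsert));
    // Flatten the lines back into one byte array and write it out.
    File.WriteAllBytes(..., lines.SelectMany(x => x).ToArray());
}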
floele
  • Thanks for the answer! I've tried Encoding.Default. It nearly solves all the issues with special characters. Only one issue remains: when I compared the input and output files, I noticed that all Hex values that were `0A` were replaced with `0D 0A`. Is there any workaround for this? – jacobz Mar 22 '14 at 17:09
  • This is because WriteAllLines will write \r\n instead of only \n as in your input file. To work around this, you should check this question/answer: http://stackoverflow.com/questions/10988411/how-to-writealllines-in-c-sharp-without-crlf – floele Mar 22 '14 at 17:14
  • This solves the 0D 0A issue. However, in the test file I have there is one occurrence where `0D` was being replaced by just `0A`. Is there any way to solve this? – jacobz Mar 22 '14 at 18:04
  • I guess you can only solve this by doing it the binary way I mentioned earlier. You can't avoid certain modifications to the file as long as you are using the "text" methods. – floele Mar 22 '14 at 18:26
  • I don't really get what you mean by "split the file manually into lines". Can you please demonstrate the meaning by providing some code? Much appreciated! – jacobz Mar 22 '14 at 19:07
  • This might go a bit beyond this question...does this answer help? http://stackoverflow.com/questions/17754777/reading-a-binary-file-and-using-new-line-as-a-delimiter-to-create-binary-chunks – floele Mar 22 '14 at 19:28
  • Not really... That `SplitOn` method doesn't really seem to work. – jacobz Mar 22 '14 at 20:05
  • I updated my answer with the full code. Note that the way I do it is rather simplistic (whole file is loaded into memory), don't do this for very large files. – floele Mar 22 '14 at 20:28
  • Is it possible to split it on multiple bytes, i.e. not only on '\n', but also on '\r\n' and '\r'? – jacobz Mar 22 '14 at 20:57
  • Sure, you could do that, but why? If you do so, however, you need to implement a mechanism to remember what split chars were used per line. – floele Mar 22 '14 at 22:19
  • The problem is that in my file, there are different line breaks used, `0A`, `0D` and `0D 0A`. Using just one delimiter would result in the other types of line breaks being overwritten. How do I implement a mechanism to remember what split chars were used? – jacobz Mar 23 '14 at 08:36
  • You could add a `list.Add(item);` inside `if (delimiter.Equals(item))`, this would leave all line break chars intact when joining them back together. – floele Mar 23 '14 at 09:57
  • This finally solved all my issues. Thank you very much! – jacobz Mar 23 '14 at 10:12
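
For the case raised in the comments above (a file mixing `0A`, `0D` and `0D 0A` line breaks), one possible way to remember which break ended each line is to leave the terminator bytes attached to the line they end. The following is only a sketch under that assumption; the class and method names are hypothetical. It also assumes an encoding in which the bytes 0x0A and 0x0D only ever mean line feed and carriage return (true for single-byte encodings and UTF-8, not for UTF-16).

```csharp
using System.Collections.Generic;

static class LineSplitter
{
    // Splits raw bytes into lines, keeping each line's terminator (LF, CR, or CRLF)
    // at the end of its list, so concatenating all lists reproduces the input exactly.
    public static List<List<byte>> SplitKeepingTerminators(byte[] bytes)
    {
        var lines = new List<List<byte>>();
        var current = new List<byte>();
        for (int i = 0; i < bytes.Length; i++)
        {
            byte b = bytes[i];
            current.Add(b);
            bool isLf = b == (byte)'\n';
            // A CR ends the line only when it is not the first half of a CRLF pair.
            bool isLoneCr = b == (byte)'\r'
                && (i + 1 >= bytes.Length || bytes[i + 1] != (byte)'\n');
            if (isLf || isLoneCr)
            {
                lines.Add(current);
                current = new List<byte>();
            }
        }
        // Bytes after the last terminator (possibly none) form the final line.
        lines.Add(current);
        return lines;
    }
}
```

Because every terminator stays in place, inserting bytes into `lines[LineNumber - 1]` and flattening the lists with `SelectMany` leaves all other bytes of the file untouched.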
0

Why not use the StreamReader.ReadLine method to read the lines of the file into an array, insert the text you want, and then use StreamWriter.WriteLine to write the lines back?

http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline(v=vs.110).aspx
http://msdn.microsoft.com/en-us/library/system.io.streamwriter.writeline(v=vs.110).aspx
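
A minimal sketch of this approach might look as follows. The class name, method name, and the `encoding` and `newLine` parameters are assumptions added for illustration. Note that ReadLine strips whichever line break it finds (`\n`, `\r` or `\r\n`) and WriteLine writes the single configured one, so a file with mixed line breaks will not survive unchanged.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class StreamLineEditor
{
    // Hypothetical helper: reads all lines, inserts text into one of them,
    // and writes everything back using one line terminator for every line.
    public static void InsertText(
        string path, int lineNumber, int indexInLine, string textToAdd,
        Encoding encoding, string newLine)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(path, encoding))
        {
            var line = reader.ReadLine();
            while (line != null)
            {
                lines.Add(line);
                line = reader.ReadLine();
            }
        }

        lines[lineNumber - 1] = lines[lineNumber - 1].Insert(indexInLine, textToAdd);

        using (var writer = new StreamWriter(path, false, encoding))
        {
            // Use the caller's terminator instead of Environment.NewLine.
            writer.NewLine = newLine;
            foreach (string text in lines)
            {
                writer.WriteLine(text);
            }
        }
    }
}
```

Passing `"\n"` as `newLine` keeps `0A` line breaks from turning into `0D 0A`, but the last line always ends with a terminator, even if the original file did not.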

vadrianc