This is the answer to the question. The given solution replaces a string by the given translation table.
It does not use an expensive regex function.
It also does not use multiple replacement functions that each individually did loop over the data with several checks etc.
So the search is done directly in 1 for loop. For the number of times that the capacity of the result array has to be increased, a loop is also used within the Array.Copy function. That are all the loops.
In some cases, a larger page size might be more efficient.
public static string NormalizeNewLine(this string val)
{
if (string.IsNullOrEmpty(val))
return val;
const int page = 6;
int a = page;
int j = 0;
int len = val.Length;
char[] res = new char[len];
for (int i = 0; i < len; i++)
{
char ch = val[i];
if (ch == '\r')
{
int ni = i + 1;
if (ni < len && val[ni] == '\n')
{
res[j++] = '\r';
res[j++] = '\n';
i++;
}
else
{
if (a == page) //ensure capacity
{
char[] nres = new char[res.Length + page];
Array.Copy(res, 0, nres, 0, res.Length);
res = nres;
a = 0;
}
res[j++] = '\r';
res[j++] = '\n';
a++;
}
}
else if (ch == '\n')
{
int ni = i + 1;
if (ni < len && val[ni] == '\r')
{
res[j++] = '\r';
res[j++] = '\n';
i++;
}
else
{
if (a == page) //ensure capacity
{
char[] nres = new char[res.Length + page];
Array.Copy(res, 0, nres, 0, res.Length);
res = nres;
a = 0;
}
res[j++] = '\r';
res[j++] = '\n';
a++;
}
}
else
{
res[j++] = ch;
}
}
return new string(res, 0, j);
}
The translation table really appeals to me even if '\n\r' is not actually used on basic platforms. Who would use two types of linebreaks for indicate 2 linebreaks?
If you want to know that, than you need to take a look before to know if the \n and \r both are used seperatly in the same document.