This is a quick way to do that, I mean.
It does not use an expensive regex function.
It also does not use multiple replacement functions that each individually did loop over the data with several checks, allocations, etc.
So the search is done directly in one for
loop. For the number of times that the capacity of the result array has to be increased, a loop is also used within the Array.Copy
function. That are all the loops.
In some cases, a larger page size might be more efficient.
public static string NormalizeNewLine(this string val)
{
if (string.IsNullOrEmpty(val))
return val;
const int page = 6;
int a = page;
int j = 0;
int len = val.Length;
char[] res = new char[len];
for (int i = 0; i < len; i++)
{
char ch = val[i];
if (ch == '\r')
{
int ni = i + 1;
if (ni < len && val[ni] == '\n')
{
res[j++] = '\r';
res[j++] = '\n';
i++;
}
else
{
if (a == page) // Ensure capacity
{
char[] nres = new char[res.Length + page];
Array.Copy(res, 0, nres, 0, res.Length);
res = nres;
a = 0;
}
res[j++] = '\r';
res[j++] = '\n';
a++;
}
}
else if (ch == '\n')
{
int ni = i + 1;
if (ni < len && val[ni] == '\r')
{
res[j++] = '\r';
res[j++] = '\n';
i++;
}
else
{
if (a == page) // Ensure capacity
{
char[] nres = new char[res.Length + page];
Array.Copy(res, 0, nres, 0, res.Length);
res = nres;
a = 0;
}
res[j++] = '\r';
res[j++] = '\n';
a++;
}
}
else
{
res[j++] = ch;
}
}
return new string(res, 0, j);
}
I now that '\n\r' is not actually used on basic platforms. But who would use two types of linebreaks in succession to indicate two linebreaks?
If you want to know that, then you need to take a look before to know if the \n and \r both are used separately in the same document.