Because of the ~2.15 billion element limitation in the .NET Framework (even on 64-bit Windows with .NET 4.5+ and gcAllowVeryLargeObjects enabled), I needed to create my own BigStringBuilder to manipulate extremely large strings.
Unfortunately, I now need to use Regex on this class. Code apparently exists that runs a simpler Regex flavour on StringBuilders, though it's reportedly not well tested and only supports * (replace many chars) and ? (replace single char).
In any case, as mentioned, I'm not using a StringBuilder but my own BigStringBuilder class, whose underlying structure is a List of char arrays (i.e. List<char[]> c = new List<char[]>();). To retrieve any char within the giant string, a 'clever' indexer maps a position onto this rectangular structure:
// Indexer for class BigStringBuilder:
public char this[long n]
{
    get { return c[(int)(n / pagesize)][n % pagesize]; }
    set { c[(int)(n / pagesize)][n % pagesize] = value; }
}
It's not that 'clever' to be honest, but it does mean all the string data is potentially scattered across numerous char arrays within the List.
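To make the layout concrete, here is a minimal sketch of how such a class might look. Only the List<char[]> field, the pagesize name and the indexer come from the description above; the constructor, the Length property and the Append methods are hypothetical additions for illustration, and the default page size is an assumed value.

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch of the paged storage described above (not the real class).
public class BigStringBuilder
{
    private readonly int pagesize;                         // chars per page
    private readonly List<char[]> c = new List<char[]>();  // the pages

    // Hypothetical constructor; the default page size is an assumption.
    public BigStringBuilder(int pagesize = 1 << 20)
    {
        this.pagesize = pagesize;
    }

    // Hypothetical property: total number of chars stored so far.
    public long Length { get; private set; }

    // Indexer from the post: page number first, then offset within the page.
    public char this[long n]
    {
        get { return c[(int)(n / pagesize)][n % pagesize]; }
        set { c[(int)(n / pagesize)][n % pagesize] = value; }
    }

    // Hypothetical helper: appends one char, adding a page when needed.
    public void Append(char ch)
    {
        if (Length / pagesize >= c.Count)
        {
            c.Add(new char[pagesize]);
        }
        this[Length] = ch;
        Length++;
    }

    // Hypothetical helper: appends a string char by char.
    public void Append(string s)
    {
        foreach (char ch in s)
        {
            Append(ch);
        }
    }
}
```

With a page size of 4, appending "hello world" fills three pages ("hell", "o wo", "rld"), so a match such as "lo w" straddles a page boundary. That is why running Regex on each char array separately would not be enough.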
I am looking for the most effective way, or any insights, to make Regex (including Regex.Replace()) work with this BigStringBuilder class, bearing in mind that the strings could be much bigger than 2GB.