Although this is a late answer, I found the question interesting and up-voted it: IMO it's surprising that there's no built-in way to keep white space under the described conditions.
Assuming the same input as the question, plus a third line to show that an escaped double quote (a double quote immediately followed by another) is also kept:
1 , 2
" 1 ", " 2 "
" a ""quoted"" word ", " hello world "
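Set HasFieldsEnclosedInQuotes to false, and strip the quotes from any quoted field yourself with a simple Regex. With quote handling switched off, the parser treats the quotes as ordinary characters, so TrimWhiteSpace only trims the white space outside them: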
using System;
using System.IO;
using System.Text.RegularExpressions;
using Microsoft.VisualBasic.FileIO;

var separator = new string('=', 40);
Console.WriteLine(separator);

// demo only - keep the raw input lines so they can be echoed next to
// the parsed fields (inputPath is the path to the sample file above)
var text = File.ReadAllText(inputPath);
var lines = text.Split(
    new string[] { Environment.NewLine },
    StringSplitOptions.None
);

using (var textReader = new StringReader(text))
{
    using (var parser = new TextFieldParser(textReader))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        // trims only outside the quotes, because the quotes are now
        // ordinary characters at the edges of the field
        parser.TrimWhiteSpace = true;
        parser.HasFieldsEnclosedInQuotes = false;

        // remove double quotes ourselves, since HasFieldsEnclosedInQuotes is false
        var regex = new Regex(@"
            # match double quote
            \""
            # if not immediately followed by a double quote
            (?!\"")
            ",
            RegexOptions.IgnorePatternWhitespace
        );

        var rowStart = 0;
        while (parser.PeekChars(1) != null)
        {
            Console.WriteLine(
                "row {0}: {1}", parser.LineNumber, lines[rowStart]
            );
            var fields = parser.ReadFields();
            for (int i = 0; i < fields.Length; ++i)
            {
                Console.WriteLine(
                    "parsed field[{0}] = [{1}]", i,
                    regex.Replace(fields[i], "")
                );
            }
            ++rowStart;
            Console.WriteLine(separator);
        }
    }
}
OUTPUT:
========================================
row 1: 1 , 2
parsed field[0] = [1]
parsed field[1] = [2]
========================================
row 2: " 1 ", " 2 "
parsed field[0] = [ 1 ]
parsed field[1] = [ 2 ]
========================================
row 3: " a ""quoted"" word ", " hello world "
parsed field[0] = [ a "quoted" word ]
parsed field[1] = [ hello world ]
========================================
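One caveat with the regex: it decides quote by quote, so an empty quoted field ("") or an escaped quote sitting right before the closing quote comes out with a stray double quote. If that matters for your data, a minimal sketch of an alternative is to strip one pair of enclosing quotes and then collapse each doubled quote. Unquote and QuotedFieldDemo are hypothetical names used for illustration, not part of TextFieldParser, and the sketch assumes the field has already been trimmed by the parser as above.

```csharp
using System;

public static class QuotedFieldDemo
{
    // Hypothetical helper (not part of TextFieldParser): removes one pair of
    // enclosing double quotes, then collapses each escaped "" into a single ".
    public static string Unquote(string field)
    {
        bool enclosed = field.Length >= 2
            && field[0] == '"'
            && field[field.Length - 1] == '"';
        if (!enclosed)
        {
            return field; // unquoted field: leave it as parsed
        }
        return field.Substring(1, field.Length - 2).Replace("\"\"", "\"");
    }

    public static void Main()
    {
        string[] samples =
        {
            "1",                            // unquoted
            "\" 1 \"",                      // inner white space is kept
            "\" a \"\"quoted\"\" word \"",  // escaped quotes inside
            "\"\"",                         // empty quoted field
            "\"say \"\"hi\"\"\"",           // escaped quote before the closing quote
        };
        foreach (var s in samples)
        {
            Console.WriteLine("[{0}] -> [{1}]", s, Unquote(s));
        }
    }
}
```

In the loop above you would call QuotedFieldDemo.Unquote(fields[i]) in place of regex.Replace(fields[i], ""). Either way, a delimiter inside a quoted field still splits the field, because HasFieldsEnclosedInQuotes is false.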