1

I am trying to strip quotation marks from a CSV file. Strangely, this line deletes all the quotation marks in the string:

before : "NewClient"Name"

foo = foo.Replace(foo.Substring(0, 1), "");

after : NewClientName

Why is this happening? Shouldn't the Replace() method just delete the first occurrence?

Dmitry Bychenko
  • 1
    Why don't you simply use `line=line.Replace("\"","")`? If you just want to remove the leading quotation mark: `line=line.TrimStart('"');` – Tim Schmelter Jan 11 '17 at 08:59
  • 1
    *"Shouldn't the replace method just delete the first occurrence?"* No, [it replaces every occurrence](https://msdn.microsoft.com/en-us/library/fk49wtc1(v=vs.110).aspx). – Manfred Radlwimmer Jan 11 '17 at 09:02
  • 1
    Do you really have `"NewClient"Name"` as an input? Usually we *double* quotation marks in CSV: `"NewClient""Name"` -> `NewClient"Name`, whereas `"NewClient"Name"` is a *syntax error* – Dmitry Bychenko Jan 11 '17 at 09:02

4 Answers

2

Usually, when working with CSV, we wrap a field in quotation marks and double any quotation marks inside it:

a                 -> "a"
a"b               -> "a""b"
NewClient"Name    -> "NewClient""Name"
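
The doubling rule in the table above can be sketched in a few lines. This is only an illustration: `AddQuotation` and `CsvQuoteDemo` are hypothetical names that do not appear in the original answer, and the sketch assumes every field is quoted unconditionally.

```csharp
using System;

public static class CsvQuoteDemo
{
    // Hypothetical helper: wraps a value in quotation marks and doubles
    // every quotation mark that appears inside it.
    public static string AddQuotation(string value)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));

        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        Console.WriteLine(AddQuotation("a"));               // "a"
        Console.WriteLine(AddQuotation("a\"b"));            // "a""b"
        Console.WriteLine(AddQuotation("NewClient\"Name")); // "NewClient""Name"
    }
}
```

Here Replace() is exactly the right tool, because every inner quotation mark has to be doubled, not just the first one.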

To cut such a quotation, i.e.

"NewClient""Name" -> NewClient"Name

(with "NewClient"Name" being a syntax error), you can try

private static string CutQuotation(string value) {
  if (string.IsNullOrEmpty(value))
    return value;
  else if (!value.Contains('"'))
    return value;

  if (value.Length == 1)
    throw new FormatException("Incorrect quotation format: string can't be of length 1.");
  else if (value[0] != '"')
    throw new FormatException("Incorrect quotation format: string must start with \".");
  else if (value[value.Length - 1] != '"')
    throw new FormatException("Incorrect quotation format: string must end with \".");

  StringBuilder builder = new StringBuilder(value.Length);

  // Skip the enclosing quotes; every inner quote must come as a pair ("").
  for (int i = 1; i < value.Length - 1; ++i) {
    if (value[i] != '"') {
      builder.Append(value[i]);
      continue;
    }

    // A quote right before the closing quote, or one not followed by a second quote, has no partner.
    if (i == value.Length - 2 || value[i + 1] != '"')
      throw new FormatException("Incorrect quotation format. Dangling \".");

    builder.Append('"');
    ++i; // skip the second quote of the pair
  }

  return builder.ToString();
}

As you can see, it takes more than a single Replace() call.

Tests:

 // abc - no quotation
 Console.WriteLine(CutQuotation("abc")); 
 // abc - simple quotation cut
 Console.WriteLine(CutQuotation("\"abc\"")); 
 // "abc" - doubled quotation marks inside the enclosing quotes
 Console.WriteLine(CutQuotation("\"\"\"abc\"\"\"")); 
 // a"bc - quotation in the middle
 Console.WriteLine(CutQuotation("\"a\"\"bc\"")); 
Dmitry Bychenko
1

Shouldn't the replace method just delete the first occurrence?

You would think so, because you're using Substring(0, 1):

string foo = "\"NewClient\"Name\"";
foo = foo.Replace(foo.Substring(0, 1), "");

But the substring is obtained before the call to Replace(). In other words, your Substring() and Replace() have no relation to each other whatsoever, so the code is equivalent to:

string foo = "\"NewClient\"Name\"";
foo  = foo.Replace("\"", "");

And String.Replace() replaces all occurrences.

If you only want to replace the first instance, see How do I replace the *first instance* of a string in .NET?. If instead you only and always want to remove the first character, see Fastest way to remove first char in a string. If you want to remove specific characters from the start of the string, see C# TrimStart with string parameter.
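
The three alternatives mentioned above can be sketched as follows. This is a minimal illustration, not code taken from the linked questions: `ReplaceFirst` and `QuoteDemo` are hypothetical names, and the helper assumes ordinal (culture-insensitive) matching is acceptable.

```csharp
using System;

public static class QuoteDemo
{
    // Hypothetical helper: replaces only the first occurrence of `search`.
    public static string ReplaceFirst(string text, string search, string replacement)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(search))
            return text;

        int index = text.IndexOf(search, StringComparison.Ordinal);
        if (index < 0)
            return text; // nothing to replace

        return text.Remove(index, search.Length).Insert(index, replacement);
    }

    public static void Main()
    {
        string foo = "\"NewClient\"Name\"";

        // Replace only the first quotation mark.
        Console.WriteLine(ReplaceFirst(foo, "\"", ""));  // NewClient"Name"

        // Always drop the first character, whatever it is.
        Console.WriteLine(foo.Substring(1));             // NewClient"Name"

        // Drop every leading quotation mark (there is just one here).
        Console.WriteLine(foo.TrimStart('"'));           // NewClient"Name"

        // String.Replace, by contrast, removes all of them.
        Console.WriteLine(foo.Replace("\"", ""));        // NewClientName
    }
}
```

The first three calls give the same result for this input, but they differ in general: Substring(1) removes the first character even if it is not a quotation mark, and TrimStart('"') removes a whole run of leading quotation marks.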

CodeCaster
0

No, the Replace method replaces all occurrences of a char in the string (see https://msdn.microsoft.com/en-us/library/fk49wtc1(v=vs.110).aspx).

If you need to remove just the first/last char (if any), you can use the Trim function, which strips every leading and trailing occurrence of the given char: https://msdn.microsoft.com/en-us/library/d4tt83f9(v=vs.110).aspx

in your case:

 result[i, 0] = result[i, 0].Trim('"');
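
As a quick illustration of what Trim does with the sample value from the question (a sketch only; `TrimDemo` is a hypothetical class name):

```csharp
using System;

public static class TrimDemo
{
    public static void Main()
    {
        string field = "\"NewClient\"Name\"";

        // Trim('"') strips quotation marks from both ends and leaves inner ones alone.
        Console.WriteLine(field.Trim('"'));          // NewClient"Name

        // It removes all leading and trailing quotation marks, not one per side.
        Console.WriteLine("\"\"abc\"\"".Trim('"'));  // abc
    }
}
```

Note that this means Trim cannot tell a doubled quotation mark at the edge of a field from the enclosing ones.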
Sierrodc
0

This meets your requirement:

string strTest = "\"NewClient\"Name\"";
strTest = strTest.TrimStart('"');

which in your case is

result[i, 0] = result[i, 0].TrimStart('"');

Note: Replace replaces every matching char/string in the given instance, not just the first one.

CodeCaster