I want to replace the comma delimiter with tabs in a CSV file.

Input

(image not available)

Output

(image not available)

Note that commas inside quoted fields should not be replaced. The output should also omit the double quotes.

I tried the following, but this code also replaces the commas inside quoted fields:

public void Replace_comma_with_tabs(string path)
{
   var file = File
              .ReadLines(path)
              .SkipWhile(line => string.IsNullOrWhiteSpace(line)) // To be on the safe side
              .Select((line, index) => line.Replace(',', '\t')) // replace ',' with '\t'
              .ToList();                  // Materialization, since we write into the same file
    
   File.WriteAllLines(path, file);
}

How can I skip the commas inside quoted fields?

Gk_999
  • @CodeCaster a single quote inside a quoted field isn't valid. – MikeJ Sep 23 '20 at 20:16
  • There are built-in methods for this kind of stuff... check out the [`TextFieldParser` that can be found in the `Microsoft.VisualBasic.FileIO` namespace](https://stackoverflow.com/a/48809517/395685). – Nyerguds Jul 01 '21 at 12:24
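
A minimal sketch of the `TextFieldParser` approach mentioned in the comment above, assuming the `Microsoft.VisualBasic` assembly can be referenced from your project. The class name `TextFieldParserSketch` and the method `ConvertText` are hypothetical, chosen only for illustration. `ReadFields` returns each field with its enclosing quotes already removed, so joining the fields with tabs should give the output the question asks for. Fields that themselves contain tabs are not handled here.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TextFieldParserSketch
{
    // Hypothetical helper: converts CSV text into tab-separated lines.
    public static List<string> ConvertText(string csv)
    {
        var result = new List<string>();
        using (var parser = new TextFieldParser(new StringReader(csv)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes stay in the field

            while (!parser.EndOfData)
            {
                // Each field comes back without its surrounding quotes.
                string[] fields = parser.ReadFields();
                result.Add(string.Join("\t", fields));
            }
        }
        return result;
    }
}
```

To rewrite a file in place, something like `File.WriteAllLines(path, TextFieldParserSketch.ConvertText(File.ReadAllText(path)));` should work, since the whole file is read before it is written.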

3 Answers

There are many ways to do this; here is one. The code below only transforms a single string of comma-delimited text that may contain quoted fields. Use "ToTabs" instead of "Replace" inside your Select call. You will need to harden it with some error checking.

It handles escaped quotes inside quoted fields and converts existing tabs to spaces, but it is not a full-blown CSV parser.

static class CsvHelper
{
    public static string ToTabs(this string source)
    {
        Func<char,char> getState = NotInQuotes;
        char last = ' ';

        char InQuotes(char ch)
        {
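            // Any quote toggles the state: an escaped pair ("") leaves and immediately
            // re-enters the quoted state, and an empty field ("") closes correctly.
            if ('"' == ch)
                getState = NotInQuotes;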
            else if ('\t' == ch)
                ch = ' ';

            last = ch;

            return ch;
        }

        char NotInQuotes(char ch)
        {
            last = ch;

            if ('"' == ch)
                getState = InQuotes;
            else if (',' == ch)
                return '\t';
            else if ('\t' == ch)
                ch = ' ';

            return ch;
        }
        return string.Create(source.Length, getState, (buffer,_) =>
        {
            for (int i = 0; i < source.Length; ++i)
            {
                buffer[i] = getState(source[i]);
            }
        });
    }
}

    static void Main(string[] _)
    {
        const string Source = "a,string,with,commas,\"field,with,\"\"commas\", and, another";

        var withTabs = Source.ToTabs();

        Console.WriteLine(Source);
        Console.WriteLine(withTabs);
    }
MikeJ

Here is one way of doing it. It uses a flag, quotesStarted, to decide whether a comma is a delimiter or part of a field's text. I also used StringBuilder, since it performs well for repeated string concatenation. The code reads the lines and, for each non-blank line, iterates through its characters and checks for those with special meaning (comma, double quote, tab, and comma between double quotes):

    static void Main(string[] args)
    {
        var path = "data.txt";
        var file = File.ReadLines(path).ToArray();
        StringBuilder sbFile = new StringBuilder();
        foreach (string line in file)
        {
            if (String.IsNullOrWhiteSpace(line) == false)
            {
                bool quotesStarted = false;
                StringBuilder sbLine = new StringBuilder();
                foreach (char currentChar in line)
                {
                    if (currentChar == '"')
                    {
                        quotesStarted = !quotesStarted;
                        sbLine.Append(currentChar);
                    }
                    else if (currentChar == ',')
                    {
                        if (quotesStarted)
                            sbLine.Append(currentChar);
                        else
                            sbLine.Append("\t");
                    }
                    else if (currentChar == '\t')
                        throw new Exception("Tab found");
                    else
                        sbLine.Append(currentChar);
                }

                sbFile.AppendLine(sbLine.ToString());
            }
        }

        File.WriteAllText("Result-" + path, sbFile.ToString());
    }
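
The question also asks for the double quotes to be omitted from the output. A minimal sketch of the same flag-based idea that drops the enclosing quotes, assuming each record fits on one line. The names `QuoteAwareConverter` and `ToTabSeparated` are hypothetical, and treating a doubled quote inside a quoted field as one literal quote is the usual CSV convention rather than something the question states:

```csharp
using System.Text;

public static class QuoteAwareConverter
{
    // Hypothetical helper: converts one CSV line to a tab-separated line,
    // omitting the quotes that enclose a field.
    public static string ToTabSeparated(string line)
    {
        var sb = new StringBuilder(line.Length);
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                // Inside a quoted field, a doubled quote is an escaped literal quote.
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    sb.Append('"');
                    i++; // skip the second quote of the pair
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote: not emitted
                }
            }
            else if (c == ',' && !inQuotes)
            {
                sb.Append('\t'); // delimiter outside quotes becomes a tab
            }
            else
            {
                sb.Append(c); // includes commas inside quoted fields
            }
        }
        return sb.ToString();
    }
}
```

It could be used per line in the question's pipeline, for example `.Select(line => QuoteAwareConverter.ToTabSeparated(line))` in place of the `Replace` call.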
Ivan Golović

To change commas in a string to tabs, use the Replace method, passing the tab escape sequence "\t" as the replacement.

Example:

string str = "Lucy, John, Mark, Grace";
string str2 = str.Replace(",", "\t");
Vega