0

I have a CSV file with a lot of trailing spaces in some columns, so I'm trying to write a simple Windows Forms Application that loops through all lines in my CSV and removes the trailing spaces. I think I'm pretty far, but I have the feeling that C# sees this as one line:

"Lorem ipsum dolor sit amet","Default ",9999,1,"base "

While it's supposed to see this as one line:

"Lorem ipsum dolor sit amet"

This is the code I have right now:

String path = @"C:\Users\me\Documents\test.csv";

string[] allLines = File.ReadAllLines(path);
using (StreamWriter sw = new StreamWriter(path))
{
    foreach (string line in allLines)
    {
        if (!string.IsNullOrEmpty(line) && line.Length > 1)
        {                
            line.TrimEnd(' ');

            sw.WriteLine(line);
            //sw.WriteLine(line.TrimEnd(' '));
        }
    }
}
Console.WriteLine(allLines);
Console.WriteLine("Done");

How do I make sure that in the case my CSV file is like this:

"Lorem ipsum dolor sit amet ","Default ",9999,1,"base "
"simple","Default ",9999,1," base"
"test ","Default ",9999,1,"base"

It comes out after running the C# code like this (so without the spaces at the end and the beginning):

"Lorem ipsum dolor sit amet","Default ",9999,1,"base"
"simple","Default",9999,1,"base"
"test","Default",9999,1,"base"

Edit: I also tried doing line = line.TrimEnd(' '); but this gives me the error Cannot assign 'line' because it is a 'foreach iteration variable'.

Nkosi
  • 235,767
  • 35
  • 427
  • 472
user3478148
  • 433
  • 1
  • 8
  • 26

5 Answers5

5

Change:

line.TrimEnd(' ');
sw.WriteLine(line);

to:

sw.WriteLine(line.TrimEnd(' '));

TrimEnd does not edit the string directly - it returns a new string with the edit applied.

mjwills
  • 23,389
  • 6
  • 40
  • 63
3

There are two things happening here. First, you're trying to trim the empty space off the end of a field, not a line. You'll want to use a CSV parser here - see this question. (Don't be tempted to simply split the line on commas - you'll run into all kinds of issues with that).

Second, as others have mentioned, String.TrimEnd() does not mutate the string, it returns a new one. Use this function on your string field values and re-write the CSV content.

Jon B
  • 51,025
  • 31
  • 133
  • 161
1

You can use the TextFieldParser to read the data, trimming each field. You have to do a little extra work to write the fields back with quotes if they contain the delimiter"

Add a reference to Microsoft.VisualBasic.dll

using Microsoft.VisualBasic.FileIO;

void Main()
{
    var inputFilename = @"G:\Test\TestCsv.csv";
    var outputFilename = @"G:\Test\TestCsvOut.csv";

    using (var tfp = new TextFieldParser(inputFilename))
    using (var strm = new StreamWriter(outputFilename))
    {
        tfp.Delimiters = new string[] { "," };
        tfp.HasFieldsEnclosedInQuotes = true;
        tfp.TextFieldType = FieldType.Delimited;
        tfp.TrimWhiteSpace = true;

        while (!tfp.EndOfData)
        {
            string[] fields = tfp.ReadFields();
            //Add quotes to fields that contain commas
            for (int i = 0; i < fields.Length; i++)
            {
                if (fields[i].Contains(","))
                {
                    fields[i] = $"\"{fields[i]}\"";
                }
            }

            //string.Join to create a delimited string
            strm.WriteLine(string.Join(",", fields));
        }
    }
}
Chris Dunaway
  • 10,974
  • 4
  • 36
  • 48
0

As another user mentioned, splitting on the comma could be a problem, if your text itself contains commas. However, if it does not, this simple approach is adequate and easier:

Just use String.Split() on , to break up your line into fields, then trim each one and do your operations on the trimmed individual fields as shown:

   char[] delims = {','};  // the split delimiter
   string trimmedField = ""; // holds a single trimmed field

   string[] allLines = File.ReadAllLines(path);
    using (StreamWriter sw = new StreamWriter(path))
    {
        foreach (string line in allLines)
        {
            if (!string.IsNullOrEmpty(line) && line.Length > 1)
            {
                string[] fields = line.Trim().Split(delims); // split it
                foreach(string f in fields)
                {
                    trimmedField = f.Trim(); // trim this field
                    //... do your thing here...
                }
TomServo
  • 7,248
  • 5
  • 30
  • 47
  • That doesn't work. "Cannot assign 'line' because it is a 'foreach iteration variable'" – user3478148 Jul 10 '17 at 13:37
  • @user3478148 You are correct. See updated example. Keep a variable defined before to hold the trimmed line. – TomServo Jul 10 '17 at 13:40
  • Thanks. However this still trims the end of a line instead of the field in the CSV. – user3478148 Jul 10 '17 at 13:46
  • @user3478148 Yes, I see. Please look at updated example that splits your line into component fields using the comma as a delimiter, then trims each field of whitespace. Cheers! – TomServo Jul 10 '17 at 13:56
  • I think we're almost there! However this adds an enter after every field. I've tried this as well: `trimmedLine = f.TrimEnd(' ');` – user3478148 Jul 10 '17 at 14:14
  • @user3478148 Adds an enter? How so? I pulled this example from production code that doesn't return anything but the string needed. Can you show your updated code so I could take a closer look? – TomServo Jul 10 '17 at 14:17
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148788/discussion-between-user3478148-and-jlh). – user3478148 Jul 10 '17 at 14:18
0

If you actually dont need to split the string, just remove all occurrences of trailing spaces, you could use an approach similar to this

string[] allLines = {
        "\"Lorem ipsum dolor sit amet\",\"Default \",9999,1,\"base \"",
        "\"Lorem ipsum dolor sit amer \",\"Defaulr \",9999,1,\"baser \"" };

        for(int i=0;i<allLines.Length;i++)
        {
             allLines[i]= allLines[i].Replace(" \"","\"");
        }

    console.WriteLine(allLines);

If there may be more than one trailing space, wrap the Replace in a while loop

while(allLines[i].Contains(" \""))
             allLines[i]= allLines[i].Replace(" \"","\"");
MikNiller
  • 1,242
  • 11
  • 17