1

I have a C# script which takes in two CSV files as input, combines the two files, performs numerous calculations on them, and writes the result in a new CSV file. These two input CSV file names are declared as variables and are used in the C# script by accessing those variable names.

The data in the input CSV files looks like this: enter image description here

Since the data has values in thousands and millions, line splits in the C# code are truncating the data incorrectly. For instance a value of 11,861 appears only as 11 and 681 goes in the next columns.

Is there any way in C#, by which I can specify a text qualifier (" in this case) for the two files ?

Here is the C# code snippet:

 string[,] filesToProcess = new string[2, 2] { {(String)Dts.Variables["csvFileNameUSD"].Value,"USD" }, {(String)Dts.Variables["csvFileNameCAD"].Value,"CAD" } };
string headline = "CustType,CategoryType,CategoryValue,DataType,Stock QTY,Stock Value,Floor QTY,Floor Value,Order Count,Currency";
string outPutFile = Dts.Variables["outputFile"].Value.ToString();
//Declare Output files to write to
FileStream sw = new System.IO.FileStream(outPutFile, System.IO.FileMode.Create);
StreamWriter w = new StreamWriter(sw);
w.WriteLine(headline);

//Loop Through the files one by one and write to output Files
for (int x = 0; x < filesToProcess.GetLength(1); x++)
{                
    if (System.IO.File.Exists(filesToProcess[x, 0]))
    {
        string categoryType = "";
        string custType = "";
        string dataType = "";
        string categoryValue = "";

        //Read the input file in memory and close after done
        StreamReader sr = new StreamReader(filesToProcess[x, 0]);
        string fileText = sr.ReadToEnd();
        string[] lines = fileText.Split(Convert.ToString(System.Environment.NewLine).ToCharArray());
        sr.Close();                   

where csvFileNameUSD and csvFileNameCAD are variables with values pointing to their locations.

Community
  • 1
  • 1
user2673722
  • 295
  • 2
  • 6
  • 15
  • It sounds like you are looking for a CSV parser built in C#. Try this thread: http://stackoverflow.com/questions/2081418/parsing-csv-files-in-c-sharp. Though doesn't SSIS have a CSV file source? You could just let SSIS parse the files and then pass the values to your C# control flow item (http://blog.sqlauthority.com/2011/05/12/sql-server-import-csv-file-into-database-table-using-ssis/). – Dean May 12 '15 at 17:29
  • SSIS knows how to process CSV files. Why not let it do that? – John Saunders May 12 '15 at 17:44
  • I have edited your title. Please see, "[Should questions include “tags” in their titles?](http://meta.stackexchange.com/questions/19190/)", where the consensus is "no, they should not". – John Saunders May 12 '15 at 17:44
  • @Dean - Hi, thanks for the response, but I was more inclined to avoid using an external library. SSIS sure has a flat file manager but in my C # script, I'm taking in data line by line and doing operations and I'm a SSIS newbie, so idk how to parse the file first and then pass the result to the C# script. – user2673722 May 12 '15 at 17:46
  • @John - Thanks for the edit, apologies for the mistake. – user2673722 May 12 '15 at 17:51
  • OK... so you get getting your array of lines (`string[] lines`) just fine but the problem is: you need to parse that array of lines and honor the text qualifier? In your CSV files are the values always qualified with "? – Dean May 12 '15 at 18:02
  • @Dean - Exactly. Yeah, the files are always of the format indicated in the snippet attached above. – user2673722 May 12 '15 at 18:03
  • I would still go with using the standard SSIS Flat File source. Way more easier to manage and control. – rvphx May 12 '15 at 18:23
  • Slow down and take the time to learn how SSIS works. If I were reviewing your work, I'd make you go back and learn how data flows work, then have you create a script transform for any work that still needs to be done in script. – John Saunders May 12 '15 at 18:33

2 Answers2

2

Well, based on the questions you have answered, this ought to do what you want to do:

public void SomeMethodInYourCodeSnippet()
{
    string[] lines;
    using (StreamReader sr = new StreamReader(filesToProcess[x, 0]))
    {
        //Read the input file in memory and close after done
        string fileText = sr.ReadToEnd();
        lines = fileText.Split(Convert.ToString(System.Environment.NewLine).ToCharArray());
        sr.Close();  // redundant due to using, but just to be safe...
    }

    foreach (var line in lines)
    {
        string[] columnValues = GetColumnValuesFromLine(line);
        // Do whatever with your column values here...
    }
}
private string[] GetColumnValuesFromLine(string line)
{
    // Split on ","
    var values = line.Split(new string [] {"\",\""}, StringSplitOptions.None);
    if (values.Count() > 0)
    {
        // Trim leading double quote from first value
        var firstValue = values[0];
        if (firstValue.Length > 0)
            values[0] = firstValue.Substring(1);

        // Trim the trailing double quote from the last value
        var lastValue = values[values.Length - 1];
        if (lastValue.Length > 0)
            values[values.Length - 1] = lastValue.Substring(0, lastValue.Length - 1);
    }
    return values;
}

Give that a try and let me know how it works!

Dean
  • 2,084
  • 17
  • 23
  • That `StreamReader` needs to be in a `using` block. – John Saunders May 12 '15 at 19:33
  • @JohnSaunders That is actually the OP's code and I copied/pasted it in for context. I was trying to address his question, not teach him about something totally unrelated that he didn't ask about. – Dean May 12 '15 at 19:45
  • Furthermore, calling `Close()` disposes it as well (which is the reason you would use a `using` block): http://stackoverflow.com/questions/7524903/should-i-call-close-or-dispose-for-stream-objects – Dean May 12 '15 at 19:49
  • @Dean - Thanks a ton for the code snippet. I tried running it, but "Count()" initially created a compile error, so I switched it with Length. After that, on executing the package, I get the DTS Script Task RunTime Error:- DTS Script Task has encountered an exception in user code; Exception has been thrown by a target of invocation. Any ideas on what might have gone wrong? – user2673722 May 12 '15 at 20:21
  • Dean, people copy and paste the code we post, quite often without understanding it. It's necessary for us to not repost the mistakes made by the OP. And `using` ensures that `Dispose` will be called, even if an exception is thrown. – John Saunders May 12 '15 at 20:25
  • OK... so then is it the job when answering questions to look over the code posted and teach people best practices on how to program C#? Or is it to answer specific questions when asked? I have seen people get downvoted here because they were asking more "how do I program??" type questions so I have tried to be more specific when answering questions. And now you're downvoting me because I answered the question but I didn't address something *completely* unrelated to the question. And for the record: my original comment was that the OP should just use built in SSIS flat file source. – Dean May 12 '15 at 20:29
  • You supplied a bad example. You have corrected that, so I have upvoted. The goal of this site is not just to answer questions, but rather to create a repository of good answers for future readers to find. In your case, they would have found a bad practice before your edit. Now they won't. – John Saunders May 12 '15 at 21:32
  • @Dean - Hi, sorry, it took me sometime to resolve compilation errors in my script. So, things run fine except that values like "11,681.54" get converted to "11 681.54". Do you think this is related to the StringSplitOptions used in your script ? – user2673722 May 13 '15 at 15:02
  • I added another replace function for the spacing issue and everything ended up working fine. – user2673722 May 13 '15 at 15:57
  • @user2673722 I am passing in `StringSplitOptions` there simply because there is no overload for `string.Split` that takes a `string[]` and no `StringSplitOptions`. `None` is the default so it doesn't change the behavior and it will not not replace commas with spaces. I entered the CSV from your screenshot in to a file, ran the code above against it, looked at the output and verified that my values have commas intact as I expected them to. I wonder if you are running in to a text encoding issue. Have you set a break point in your code before `string.split` happens to see if you have commas? – Dean May 13 '15 at 15:58
  • @Dean - I'm unable to put breakpoints anywhere in the code, even on the public void main(). – user2673722 May 13 '15 at 18:20
0

You posted a very similar looking question few days ago. Did that solution not help you?

If so, what issues are you facing on that. We can probably help you troubleshoot that as well.

Community
  • 1
  • 1
rvphx
  • 2,324
  • 6
  • 40
  • 69
  • The solution there worked with specifying the text qualifier in the SSIS flat file manager. In this case, I want to specify the text qualifier in C# code itself, so when it's read line by line,correct values are used. – user2673722 May 12 '15 at 18:46
  • 1
    Not quite getting it why the flat file manager wont be able to do it. If the flat file has carriage return at end of the lines, it will automatically divide the columns based on the text qualifier. I have been doing that for a very long time without any issues. – rvphx May 12 '15 at 19:53