0

I'm stuck on this situation and i do not really know what to do.

I'm trying to achieve this:

Read the "latest" (latest date with the higher hour) file within a directory, replace the spaces by adding "commas" and change the file extension: from TXT to CSV

I dont want to execute a query and fetch all the information, i just want to convert a file directly stored in a directory

My TXT looks like this:

POL-45345334234       -963747       -963747 $   KAN  98   HU 554534     2179007             2021 20210127 20210127       -963747              
POL-99345552342        628308        423818 $   KAN  98   JU 999898     1402048             2021 20190104 20190208        423818   30        0

I want to get rid of the spaces and obtain something like this:

POL-45345334234,-963747,-963747,$,KAN,98,HU,554534,2179007,2021,20210127,20210127,-963747,,              
POL-99345552342,628308,423818,$,KAN,98,JU,999898,1402048,2021,20190104,20190208,423818,30,0

image

How can i achive something like that using SSIS? I'm pretty much lost in here

JustToKnow
  • 785
  • 6
  • 23
  • Looks like a dupe of https://stackoverflow.com/questions/8831060/import-most-recent-csv-file-to-sql-server-in-ssis/8838029#8838029 to me – billinkc Mar 26 '21 at 16:41
  • Hey pal, how are you?. I dont think so, kinda related but not at all :) – JustToKnow Mar 26 '21 at 16:52
  • I am not certain, but I think your source is a fixed width file. You can easily read that with SSIS and either process it like that or write it to CSV. – KeithL Mar 29 '21 at 12:45

2 Answers2

2

This code gets you the latest text (.txt) file. You are going to run into problems with replacing sequential spaces with a comma though. Specifically, for nulls like the example you have in row 1.

Use a script task.

Add the following namespaces:

using System.IO;
using System.Linq;

Add this code:

        string filename = new DirectoryInfo(@"D:\").GetFiles()
                               .Where(e => e.Extension.ToLower() == ".txt")
                               .OrderByDescending(d => d.LastWriteTime)
                               .Select(f => f.FullName)
                               .First();

I'll give you the code to replace all spaces but it doesn't handle null records:

Add namespace:

System.Text.RegularExpressions;

Add code after above:

       using (StreamWriter sw = new StreamWriter(@"D:\output.csv"))
        {
            using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
            {
                using (var sr = new StreamReader(fs, Encoding.UTF8))
                {
                    string line = String.Empty;
                    while ((line = sr.ReadLine()) != null)
                    {
                        string newline = Regex.Replace(line, @"\s+", ",");
                        sw.WriteLine(newline);
                    }
                }
            }
        }
KeithL
  • 5,348
  • 3
  • 19
  • 25
  • Hey pal, thank you very much for replying! Yeah, i did the same in order to get the latest file but the problem comes when i try to handle null records. Been trying a lot without success. – JustToKnow Mar 28 '21 at 20:49
0

Read the latest file in your directory :

Create a C# Script Task :

public void Main()
         {

      // TODO: Add your code here
             var directory= new DirectoryInfo(Dts.Variables["User::VarFolderPath"].Value.ToString());

            FileInfo[] files = directory.GetFiles();
            DateTime lastModified = DateTime.MinValue;

             foreach (FileInfo file in files)
            {
                if (file.LastWriteTime > lastModified)
                {
                    lastModified = file.LastWriteTime;
                    Dts.Variables["User::VarFileName"].Value = file.ToString();
                }
            }

             MessageBox.Show(Dts.Variables["User::VarFileName"].Value.ToString());


             Dts.TaskResult = (int)ScriptResults.Success;
         }

Where :

  • VarFolderPath is a user variable pointing to your folder

  • VarFileName is a user variable pointing to the latest file in the directory

Create a staging table to get data from the file after the Script Task : you can store the content of a row in a column called [Column 0]

Consider relacing sequential spaces by one space :

SELECT  REPLACE(REPLACE(REPLACE([Column 0],' ','<>'),'><',''),'<>',' ')
  FROM yourStagingTable

Consider replacing the spaces by commas using a derived column :

REPLACE([Column 0 ]," ",",")

enter image description here

Amira Bedhiafi
  • 8,088
  • 6
  • 24
  • 60
  • Hey pal, thanks for replying! I do not fully understand how are you treating null records, take a look at my thread. – JustToKnow Mar 28 '21 at 20:51