0

Alright so I am not sure how to go about this, I have files that will be coming in a format like this that I need to read into a SQL Server database: enter image description here

As you can see, it is "~" delimited and it contains no columns names at all. I will have multiple files like this incoming every couple of hours and I have the entire SSIS set up ready besides the part where I actually need to read the data because I am confused on how to handle this delimiter format that other department came up with. As you can see if I specify the column delimiter just to be "~" it works fine enter image description here

until it reaches that point where the row ends at which point there is this unnecessary row of "~" that starts and it confuses the connection manager into thinking these are separate columns, creating a bunch of empty columns enter image description here

I can't simply delete all empty columns because some legit columns can sometimes come in as empty. The only mediocre solution I found so far is to go to advanced options in file connection manager and manually delete all of the columns I don't need. But the reason this will not work at all is because next file I will get might contain more rows than this one and it will still think that "~" after every data row is a column delimiter when in reality it is just a row separator. The number of columns however will always remain static in each file.

Coda759
  • 107
  • 14
  • *.csv/txt files always have host of problems where column delimiters, column separators, invisible characters like null terminators \0, special characters based on encoding, and line breaks are in the middle of the actual data. The most reliable format for data feeds is XML enforced by XSD. An XSD (as a data contract between sender and receiver) will guarantee data quality, and prevent unexpected changes. – Yitzhak Khabinsky Oct 19 '21 at 19:43
  • @YitzhakKhabinsky problem is this is for my job, and there is no way they are changing this format can I perhaps convert the files I receive from txt to XML and try doing it this way somehow – Coda759 Oct 19 '21 at 19:44
  • My comment is for the real world scenarios. Otherwise, you will suffer. SSIS has **XML Source Adaptor** for such tasks. – Yitzhak Khabinsky Oct 19 '21 at 19:47
  • @YitzhakKhabinsky I regret to inform you and myself that this is actually for a real world scenario, its just the way that department does things – Coda759 Oct 19 '21 at 19:47
  • @YitzhakKhabinsky CSVs can at least be quote identified, and delimiters can be chosen well. They are, in my experience, vastly better than such things like Excel documents that make more wrong assumptions about data types that you would begin to think the product is completely broken. – Thom A Oct 19 '21 at 19:52
  • As for the problem, it sounds like the file uses a Tilda (`~`) as both a column and row delimiter; that is simply flawed. You'd have to use a programmatical solution which iterates through each delimiter instead. Look into Script Transformations. – Thom A Oct 19 '21 at 19:53
  • 1
    It's not clear to me whether there is a valid row delimiter. Are you jou getting the correct rows OK, you're just getting a bunch of empty columns? Just ignore them? Or is it true that the tilde is both column and row delimiter? That is a very dopey format. In this case I suggest using Powershell or an SSIS script component to process it into a sensible tabular shape. It would help if you could post sample data as text – Nick.Mc Oct 19 '21 at 22:28

0 Answers0