
I have a #tmp table with a single field.

I have inserted values which look something like:

2015-01-01.124587.21654#I#U#my company#ltd#brown####day

They are '#' delimited and \r\n terminated.

I know, from what I am testing, that I am unable to BULK INSERT from #tmp into my source table.

How can I quickly get at these fields?

I have a directory with 60 files that have headers and footers I don't want, a variable number of rows, etc. So I have inserted each row into a single field to avoid the recurring "unexpected end of file" error (amongst others) I have been getting.

I have prepared a cursor to take in these files iteratively, remove the head and tail, and then insert into their master table with the correct headers. So far I am only at the temp table stage.

I have removed the rows which contain my headers and footers in the files. Now I need to get the columns into their correct table/fields.
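For reference, a minimal sketch of how such header/footer rows can be stripped from the staging table (hypothetical pattern; it assumes the single staging column is named Identifier and that every data row starts with a yyyy-mm-dd date, as in the sample above):

    -- Delete any staged row that does not look like a data row,
    -- i.e. does not begin with a yyyy-mm-dd date.
    DELETE FROM #tmp
    WHERE Identifier NOT LIKE '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]%';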

Any pointers?

EDIT: Using the XML table-valued function:

ALTER FUNCTION [dbo].[SplitStrings_XML]
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN 
   (  
      -- Shred one row per <i> element, returning each item's text
      SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
      FROM 
      ( 
        -- Wrap every delimited item in <i>...</i> and convert to XML
        SELECT x = CONVERT(XML, '<i>' 
          + REPLACE(@List, @Delimiter, '</i><i>') 
          + '</i>').query('.')
      ) AS a CROSS APPLY x.nodes('i') AS y(i)
   );

How do I use this?

SELECT * FROM dbo.SplitStrings_XML(#tmp.Identifier, '#') 

??

    Msg 4104, Level 16, State 1, Line 48
    The multi-part identifier "#tmp.Identifier" could not be bound.
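
For reference, a table-valued function cannot take a column from another table directly in the FROM clause, which is what the binding error above is complaining about. A minimal sketch of the usual per-row invocation with CROSS APPLY (assuming the single column in #tmp is named Identifier, as in the query above):

    SELECT s.Item
    FROM #tmp AS t
    CROSS APPLY dbo.SplitStrings_XML(t.Identifier, '#') AS s;

Note that this returns one row per item; to land each item in its own column of the master table you would still need an ordinal position to pivot on.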
  • The best possible thing would be to not use delimited values. But if you have no choice you need to split them. http://sqlperformance.com/2012/07/t-sql-queries/split-strings – Sean Lange Oct 21 '15 at 13:31
  • For business reasons this is not going to be used, as this will be supported by someone with no ETL skills – Splunk Oct 21 '15 at 13:42
  • Can you write something not in pure SQL? If the decision to do this is based on the maintainer's skills, then no solution will fit, in my opinion, because of the complexity of what you have to do. Your maintainer needs some training :) – Jeremy Oct 21 '15 at 13:48
  • I could yes, what would you suggest I use to help? SSIS is off the cards and your point is valid about complexity. – Splunk Oct 21 '15 at 13:50
  • This could be done nicely with a small program in .NET (either C# or VB, depending on your comfort level with each) – Jeremy Oct 21 '15 at 13:52
  • http://stackoverflow.com/questions/10731179/bulk-insert-sql-server-millions-of-record; also check out http://www.codeproject.com/Articles/439843/Handling-BULK-Data-insert-from-CSV-to-SQL-Server – Jeremy Oct 21 '15 at 13:54
  • Could you just remove the first & last row from the file before importing it? Then you could use # as field separator and wouldn't have these problems? Other way would be to calculate the number of rows and use first & last row parameters – James Z Oct 21 '15 at 13:55
  • This is an automated procedure that needs to run every day overnight; new files are FTPed in the middle of the night. Manually removing these rows seems simple enough, but we want to avoid doing this for 60 or so files. I have tried xp_cmdshell to fire a batch process that then triggers VBS; the process is too slow and it is difficult to track the thread to see when it has finished. – Splunk Oct 21 '15 at 13:58
  • Why is SSIS off the cards? The end user *doesn't* have to use SSIS itself, just put the file in the proper folder and hit a button in a desktop or web application that starts the import job in SQL Server Agent. If the end user needs to pass any parameters, both SSIS packages and Agent jobs can be run with parameters. Your application only has to pass the proper parameters to the job – Panagiotis Kanavos Oct 21 '15 at 14:01
  • What exactly is the issue? Are you asking how an IT operator would monitor your ETL job? You mention it's an automated process which means SSIS is a good match. Yet you talk about non-ETL users, when none are involved. Are you sure you aren't looking for a better SSIS execution report? Or better logging in the package itself? – Panagiotis Kanavos Oct 21 '15 at 14:05
  • @PanagiotisKanavos You are correct, they do not. This is the only hurdle I need to get over, and I can do everything I need in 3 SQL scripts as opposed to writing 5 SSIS packages. Simplicity and speed of delivery are key here. Currently there is no SSISDB configured or agreed data file setup. I am not in charge of security and have no control over the service accounts. I am working with minimal resources on a minimal landscape. I am against the clock. – Splunk Oct 21 '15 at 14:06
  • I'm talking 10+ million rows in 80 different files with heads and tails I need rid of. There are business-sensitive reasons for this approach; suffice it to say the lines of support are blurred. – Splunk Oct 21 '15 at 14:08
  • @SeanLange already gave you the answer. Are you having trouble implementing it? It's very simple. – Tab Alleman Oct 21 '15 at 14:52
  • In honesty I am, but I am going to get it working – Splunk Oct 21 '15 at 14:56

1 Answer


I managed to solve the problem by getting around the encoding and row terminator errors, so I no longer need to split a single column.
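
For reference, a minimal sketch (hypothetical file path and table name) of a BULK INSERT that spells out the terminators and encoding explicitly and skips the unwanted header/footer rows:

    BULK INSERT dbo.MasterTable
    FROM 'C:\data\file01.txt'
    WITH (
        FIELDTERMINATOR = '#',      -- the files are '#' delimited
        ROWTERMINATOR   = '\r\n',   -- and \r\n terminated
        FIRSTROW        = 2,        -- skip the header row
        -- LASTROW     = <n>,       -- skip the footer if the row count is known
        CODEPAGE        = 'ACP'     -- adjust to match the actual file encoding
    );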
