1

We have records in this format:

99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678

The width is fixed and we need to import it into SQL. We tried bulk import, but it didn't work because it's not ',' or '\t' separated. It's separated by individual spaces, of various lengths, in the text file, which is where our dilemma is located.

Any suggestions on how to handle this? Thanks!

AlG
  • 14,697
  • 4
  • 41
  • 54
evglynn
  • 307
  • 2
  • 6
  • 11
  • "99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678".replace(/ /ig,',') Javascript.. replaces the spacec with comma. output is "99,0882300,25,YATES,ANTHONY,V,MAY,01,12,04,123456,12345678" is that what you want? – f2lollpll Mar 07 '12 at 20:54
  • 2
    what database platform? Oracle, Sybase, SQL Server, MySQL?! – ninesided Mar 07 '12 at 20:54
  • Do the columns themselves contain tab characters or commas? (I suspect not for tabs). If neither, then it is safe to collapse the whitespace into tabs or commas using a tool like `sed` – Michael Berkowski Mar 07 '12 at 20:55
  • 2
    @sjums I doubt that will work, in a fixed width format there are often spaces *in* the data :-/ – ninesided Mar 07 '12 at 20:56
  • 2
    I agree with what @ninesided said, but just want to follow up with "the way around this is divide them into fields based on specified column widths". Do you have the schema spec for this data source? – Ben Lee Mar 07 '12 at 21:02
  • Sounds like a perfect excuse to use a format file using character lengths or to import "as-is" and use some SUBSTRING-based SELECT statements. – wergeld Mar 07 '12 at 21:11
  • SQL*Loader can easily import fixed with files. –  Mar 07 '12 at 22:15
  • @a_horse_with_no_name that was my thoughts too, but unless we know the OP is using Oracle that's a non-starter! – ninesided Mar 08 '12 at 01:03

5 Answers5

10

question is pretty old but might still be relevant.

I had exactly the same problem as you. My solution was to use BULK INSERT, together with a FORMAT file.

This would allow you to:

  1. keep the code much leaner
  2. have the mapping for the text file to upload in a separate file that you can easy tweak
  3. skip columns if you fancy

To cut to the chase, here is my data format (that is one line)

608054000500SS001 ST00BP0000276AC024   19980530G10379    00048134501283404051N02912WAC 0024 04527N05580WAC 0024 1998062520011228E04ST 04856      -94.769323       26.954832     
-94.761114       26.953626G10379    183    1

And here is my SQL code:

BULK INSERT dbo.TARGET_TABLE
   FROM 'file_to_upload.dat' 
   WITH (
            BATCHSIZE = 2000,
            FIRSTROW = 1,
            DATAFILETYPE = 'char',
            ROWTERMINATOR = '\r\n',
            FORMATFILE = 'formatfile.Fmt'

        );

Please note the ROWTERMINATOR parameter set there, and the DATAFILETYPE.

And here is the format file

11.0
6
1   SQLCHAR 0   12  ""  1   WELL_API    SQL_Latin1_General_CP1_CI_AS
2   SQLCHAR 0   19  ""  2   SPACER1     SQL_Latin1_General_CP1_CI_AS
3   SQLCHAR 0   8   ""  3   FIELD_CODE  SQL_Latin1_General_CP1_CI_AS
4   SQLCHAR 0   95  ""  4   SPACER2     SQL_Latin1_General_CP1_CI_AS
5   SQLCHAR 0   5   ""  5   WATER_DEPTH SQL_Latin1_General_CP1_CI_AS
6   SQLCHAR 0   93  ""  6   SPACER3     SQL_Latin1_General_CP1_CI_AS

I put documentation links below, but what you must note is the following:

  1. the ""s in the 5th column, which indicates the separator (for a .csv would be obviously ","), which in our case is set to just "";
  2. column 2 is fully "SQLCHAR", as it's a text file. This must stay so even if the destination field in the data table is for example an integer (it is my case)

Bonus note: in my case I only needed three fields, so the stuff in the middle I just called "spacer", and in my format file gets ignored (you change numbers in column 6, see documentation).

Hope it answers your needs, works fine for me. Cheers

Documentation here: https://msdn.microsoft.com/en-us/library/ms178129.aspx https://msdn.microsoft.com/en-us/library/ms187908.aspx

vale_p
  • 115
  • 1
  • 9
  • 1
    Still relevant in 2018. Thanks =). Btw here is the documentation for .fmt files: https://learn.microsoft.com/en-us/sql/relational-databases/import-export/non-xml-format-files-sql-server?view=sql-server-2017#Structure – Gaspa79 May 02 '18 at 16:02
  • 2
    I heartily agree with @Gaspa79. Cheers, vale_p, you just saved my bacon. :) – Enrika Jun 10 '18 at 14:29
  • 1
    Thanks, was very useful. In my case however, the separator for the last column had to be "\r\n", or the import would shift the columns by two characters on every line. – Julen Dec 14 '20 at 09:31
9

When you feel more at home with SQL than importing tools, you could bulk import the file into a single VARCHAR(255) column in a staging table. Then process all the records with SQL and transform them to your destination table:

CREATE TABLE #DaTable(MyString VARCHAR(255)) 
INSERT INTO #DaTable(MyString) VALUES ('99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678')

INSERT INTO FInalTable(Col1, Col2, Col3, Name)
SELECT CAST(SUBSTRINg(MyString, 1, 3) AS INT) as Col1,
    CAST(SUBSTRING(MyString, 4, 7) AS INT) as Col2,
    CAST(SUBSTRING(MyString, 12, 3) AS INT) as Col3,
    SUBSTRING(MyString, 15, 6) as Name
FROM #DaTable

result: 99  882300  25  YATES 
Wim
  • 1,058
  • 8
  • 10
2

To import from TXT to SQL:

CREATE TABLE #DaTable (MyString VARCHAR(MAX));

And to import from a file

BULK INSERT #DaTable
FROM'C:\Users\usu...IDA_S.txt'
WHITH
(
CODEPAGE = 'RAW'
)

3rd party edit

The sqlite docs to import files has an example usage to insert records into a pre-existing temporary table from a file which has column names in its first row:

sqlite> .import --csv --skip 1 --schema temp C:/work/somedata.csv tab1

My advice is to import the whole file in a new table (TestImport) with 1 column like this

sqlite> .import C:/yourFolder/text_flat.txt TestImport

and save it to a db file

sqlite> .save C:/yourFolder/text_flat_out.db

And now you can do all sorts of with it.

surfmuggle
  • 5,527
  • 7
  • 48
  • 77
Marco
  • 21
  • 1
  • This worked for me from a text file which notepad++ identifies as UTF-8-BOM. BULK INSERT dbo.DataTable FROM 'd:\HRF605_Support\HRF605 Q1 2021.txt' WITH ( CODEPAGE = 'RAW' ) – Allan F Apr 06 '21 at 00:14
0

Use SSIS instead. This is much clearer and has various options for the import of (text) files

0

I did this for a client a while back and, sad as it may seem, Microsoft Access was the best tool for the job for his needs. It's got support for fixed width files baked in.

Beyond that, you're looking at writing a script that translates the file's rows into something SQL can understand in an insert/update statement.

In Ruby, you could use the String#slice method, which takes an index and length, just like fixed width files' definitions are usually expressed in. Read the file in, parse the lines, and write it back out as a SQL statement.

jxpx777
  • 3,632
  • 4
  • 27
  • 43