2

I want to load a very big file 3GB of text ( not comma separated ) just a text so each line in the text become a record

my table structure should be like this

LoadedTable ID bigint identity TLine varchar(max)

I tried using SQL tasks to import data but always get this error

  • Executing (Error) Messages Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data conversion for column "Column 0" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.". (SQL Server Import and Export Wizard)

Error 0xc020902a: Data Flow Task 1: The "Source - NOTEEVENTS_csv.Outputs[Flat File Source Output].Columns[Column 0]" failed because truncation occurred, and the truncation row disposition on "Source - NOTEEVENTS_csv.Outputs[Flat File Source Output].Columns[Column 0]" specifies failure on truncation. A truncation error occurred on the specified object of the specified component. (SQL Server Import and Export Wizard)

Error 0xc0202092: Data Flow Task 1: An error occurred while processing file "E:\MyFile.txt" on data row 1. (SQL Server Import and Export Wizard)

Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on Source - NOTEEVENTS_csv returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. (SQL Server Import and Export Wizard)

How to fix this error and load each line as a record with ID shows the line order?

asmgx
  • 7,328
  • 15
  • 82
  • 143

1 Answers1

3

The max storage size for a VARCHAR(max) column (row) is 2Gb, as per the documentation:

varchar [ ( n | max ) ] Variable-size string data. Use n to define the string size in bytes and can be a value from 1 through 8,000 or use max to indicate a column constraint size up to a maximum storage of 2^31-1 bytes (2 GB).


(For demo purposes I've used the big.txt file from Peter Norvig's site)

Instead of using the user interface in SSMS, you can run a BULK INSERT query, like the one below:

SELECT
    ROW_NUMBER() over (ORDER BY (SELECT NULL)) ROW_NR
    , *
INTO MyTable
FROM OPENROWSET
    (BULK N'C:\..\Desktop\big.txt', FORMATFILE=N'C:\..\Desktop\big_format_file.xml') tmp

However, you need to do a few things before:

  1. Create a Format File, in order to specify the format (like you would specify a row terminator, column terminator for a Flat File import). (an easy workaround to generating this file at the bottom of my answer)

    I chose to create a .XML file format because it's easier to read. The content of this file is:

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\n" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="txt" xsi:type="SQLVARYCHAR"/>
 </ROW>
</BCPFORMAT>

Although this is a sample document, make sure that the TERMINATOR option has set the correct terminator that you expect to have as a row delimiter.

  1. After that, you can run the query above to import your data into MyTable.

    From this point on, you can either use SELECT INTO to persist the information into a new table, INSERT to add new rows to an existing table or even UPDATE to update columns into an existing table.

In my case, the contents of MyTable will look like:

enter image description here


An easy workaround to creating the Format File is to create a table in the database, with the structure you're expecting the input data to have is to:

  1. Create dummy table with the structure / definition (ex: big_bulk)

create table big_bulk (txt varchar(max))

  1. Run BCP command to generate the File Format out of this table:

bcp test.dbo.big_bulk format nul -c -x -f .\Desktop\big_format_file.xml -t, -T

The -t parameter in the bcp command above specifies the row delimiter. You can replace the comma ( "," ) with a "\t" or "\r" or other Field and Row Terminators

  1. Edit the XML file and make sure you set the correct row terminator/delimiter.

  2. Use the File Format in your OPENROWSET() query.

Radu Gheorghiu
  • 20,049
  • 16
  • 72
  • 107