
I'm trying to import the Excel file below, which is stored in Azure Blob Storage, into SQL Server.

EXCEL File


Query

SELECT * 
    FROM OPENROWSET(
        BULK 'container/testfile.xlsx', 
        DATA_SOURCE = 'ExternalSrcImport',
        FORMATFILE='container/test.fmt', FORMATFILE_DATA_SOURCE = 'ExternalSrcImport',
        codepage = 1252,
        FIRSTROW = 1
        ) as data

Format file

10.0  
4  
1       SQLCHAR       0       7       "\t"     1     DepartmentID     ""  
2       SQLCHAR       0       100     "\t"     2     Name             SQL_Latin1_General_CP1_CI_AS  
3       SQLCHAR       0       100     "\t"     3     GroupName        SQL_Latin1_General_CP1_CI_AS  
4       SQLCHAR       0       24      "\r\n"   4     ModifiedDate     ""  

Illustration of Format File


When I execute the query, I get the following error:

Msg 4863, Level 16, State 1, Line 210 Bulk load data conversion error (truncation) for row 1, column 1 (DepartmentID).

It looks like the field terminator in the format file is not being applied. Any ideas on how to import the file?

Pரதீப்
  • It seems you are providing data longer than your column supports. – mkRabbani May 16 '19 at 06:31
  • @mkRabbani No, I'm not. I have attached the Excel file I'm trying to upload, please check. – Pரதீப் May 16 '19 at 06:41
  • Hi, what is your FORMATFILE_DATA_SOURCE = 'ExternalSrcImport' definition? – Sanpas May 16 '19 at 06:41
  • Hi, I think you have to define the relevant column definitions in FORMATFILE_DATA_SOURCE (https://learn.microsoft.com/en-us/sql/relational-databases/import-export/use-a-format-file-to-bulk-import-data-sql-server?view=sql-server-2017). Please share the container/test.fmt content. – Sanpas May 16 '19 at 06:44
  • @pascalsanchez The question has the format file. – Pரதீப் May 16 '19 at 06:47
  • @Pரதீப் Yes, it's "container/test.fmt", but you didn't share its content, no? – Sanpas May 16 '19 at 06:47
  • @Pரதீப் Can you upload your 'container/testfile.xlsx' and 'container/test.fmt' files? – Sanpas May 16 '19 at 06:50
  • @pascalsanchez Buddy, the content is present in the question. Check this https://imgur.com/a/KeKo7S3 – Pரதீப் May 16 '19 at 06:51
  • @Pரதீப் Yes, that's the content of the Excel file, but I'm asking about the content of your definition (fmt) file. Otherwise, try converting your Excel file to bcp, txt, or csv, because Excel (XLSX) has some special MS characters and encoding. – Sanpas May 16 '19 at 06:54
  • @Pரதீப் Does the same occur if you save the file in `.xls` format? – Lukasz Szozda May 20 '19 at 14:28
  • @Pரதீப், can you create a "staging" table where all columns have type `NVARCHAR(4000)` and load the data there? Then, by querying it, perhaps it becomes clear which value is causing that error message (a sketch of such a table follows below). – Alexander Volok May 23 '19 at 09:22
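A minimal sketch of the staging approach suggested in the last comment, assuming the workbook is first re-saved as tab-separated text (the xlsx itself cannot be bulk-loaded); the file name testfile.txt and the table name are hypothetical, and the columns mirror the format file in the question:

-- Hypothetical staging table: every column is NVARCHAR(4000) so nothing gets truncated
CREATE TABLE dbo.Department_Staging (
    DepartmentID NVARCHAR(4000),
    Name         NVARCHAR(4000),
    GroupName    NVARCHAR(4000),
    ModifiedDate NVARCHAR(4000)
);

-- Load the data, then query it to find the value that triggers the truncation error
BULK INSERT dbo.Department_Staging
FROM 'container/testfile.txt'
WITH (DATA_SOURCE = 'ExternalSrcImport',
      FIELDTERMINATOR = '\t',
      ROWTERMINATOR = '\n',
      FIRSTROW = 2);

SELECT * FROM dbo.Department_Staging;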

1 Answer


Your format file describes the import of a tab-separated values file, but the source path refers to an xlsx file.
An xlsx file is a ZIP archive of multiple XML files, so bulk import will not be able to process it. To open it you need to use the Microsoft Jet or ACE driver; you have some examples here: using-openrowset-to-read-excel. You will need to download the file from blob storage to a local disk before processing it. You can use SQL Agent or SSIS to download it.
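For illustration, once the file has been copied to a local disk, a query along these lines can read it through the ACE provider (a sketch: the local path and sheet name are assumptions, and the ACE OLE DB provider plus 'Ad Hoc Distributed Queries' must be available on the server):

SELECT *
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',                                 -- ACE provider for xlsx
    'Excel 12.0 Xml;HDR=YES;Database=C:\import\testfile.xlsx',  -- hypothetical local path
    'SELECT * FROM [Sheet1$]'                                   -- sheet name assumed
) AS data;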

Another option would be to save your data as a CSV or tab-separated file and load it directly from blob storage.
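A sketch of that option, assuming the workbook is re-saved as a tab-separated file (here called testfile.txt, a hypothetical name) in the same container, reusing the data source and format file from the question:

SELECT *
FROM OPENROWSET(
    BULK 'container/testfile.txt',               -- tab-separated export of the workbook
    DATA_SOURCE = 'ExternalSrcImport',
    FORMATFILE = 'container/test.fmt',
    FORMATFILE_DATA_SOURCE = 'ExternalSrcImport',
    CODEPAGE = 1252,
    FIRSTROW = 2                                 -- skip the header row
) AS data;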

Piotr Palka
  • Thanks for this, but we need to use the `BULK` option to import files from blob storage. Only under the BULK option can we mention `DATA_SOURCE`. – Pரதீப் May 22 '19 at 06:59
  • To use BULK import you have to save your data as a CSV, tab-separated file, or something similar that is supported. Please also explain why you have to mention DATA_SOURCE; Jet and ACE have equivalent options. – Piotr Palka May 22 '19 at 14:32
  • `DATA_SOURCE` is the only way to import files from Azure Blob storage. – Pரதீப் May 22 '19 at 16:45
  • There are many options available. From the Microsoft docs: Users or client applications can access objects in Blob storage via HTTP/HTTPS, from anywhere in the world. Objects in Blob storage are accessible via the Azure Storage REST API, Azure PowerShell, Azure CLI, or an Azure Storage client library. Client libraries are available for a variety of languages, including .NET, Java, Node.js, Python, Go, PHP, and Ruby. – Piotr Palka May 22 '19 at 16:52
  • I'm trying to do this in SQL Server. In SQL Server we can use `OPENROWSET` or `BULK INSERT`; both require `DATA_SOURCE` to import files from Blob storage (the data source definition is sketched below). – Pரதீப் May 23 '19 at 08:20
  • Could you then consider a CLR stored procedure? With OPENROWSET or BULK INSERT you will have to use a format other than xlsx. – Piotr Palka May 23 '19 at 14:41
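For reference, the `DATA_SOURCE` discussed in these comments is typically defined along these lines (a sketch only; the credential name, storage account URL, and SAS token are placeholders):

-- Requires a database master key; the secret is the SAS token without the leading '?'
CREATE DATABASE SCOPED CREDENTIAL BlobImportCredential           -- hypothetical name
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token>';

CREATE EXTERNAL DATA SOURCE ExternalSrcImport
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storageaccount>.blob.core.windows.net/container',  -- placeholder URL
    CREDENTIAL = BlobImportCredential
);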