
Did you ever get this error when trying to bulk insert a .tsv file into the sql-server-linux Docker image:

Referenced external data source "(null)" not found.

Here's the command; the table exists and the file exists on the server:

BULK INSERT countries FROM '/import/file.tsv'
WITH (
  FIELDTERMINATOR = '\t'
);

Thank you Thomas

  • There is no SQL Server docker image for Linux unless you mean the SQL Server for Linux preview release. In which case you should specify that in the header and tags. The path you specified is invalid for Windows – Panagiotis Kanavos Dec 30 '16 at 09:45
  • This isn't a preview release, https://hub.docker.com/r/microsoft/mssql-server-linux/ or it's not specified. Nevermind ... did you ever get this error? – Tom Zayouna Dec 30 '16 at 09:57
  • `Microsoft SQL Server on Linux for Docker Engine` that *is* a preview release - actually a CTP - Customer Technology Preview. At this point the version is `SQL Server vNext CTP 1.1`. It's a stage *before* Preview. There isn't even an SO tag for it yet – Panagiotis Kanavos Dec 30 '16 at 10:02
  • Ok great, title changed, tags changed ... Did you ever encounter this error before? – Tom Zayouna Dec 30 '16 at 10:05
  • I understand, but what I don't understand is getting such errors with rather simple SQL commands, even in an extremely early version as you mentioned ... But forget it, I just installed a Win 10 VirtualBox image with SQL Server 2014 – Tom Zayouna Dec 30 '16 at 10:27
  • Why 2014? The latest version is 2016 SP1, which comes with Columnstores, compression, encryption, partitions, in-memory DB etc even in Express. That means that you get almost all features even in the smallest edition. Installing an old version is like burning money. And Developer Edition is free. – Panagiotis Kanavos Dec 30 '16 at 10:38
  • Anyway, what is `countries`? Checking the docs it looks like there are some v.Next parameters like `DATASOURCE`, `FILEFORMAT_DATASOURCE` etc that point to files stored in Azure Blob storage. Perhaps there is a bug and you need to specify `DATASOURCE=''`? – Panagiotis Kanavos Dec 30 '16 at 10:41
  • I'm using docker image mcr.microsoft.com/mssql/server:2017-latest. It doesn't seem to support bulk insert. Can you clarify which SQL Server version you are using? – Gqqnbig Nov 19 '19 at 00:28
  • Using `EXEC xp_dirtree '/', 1, 2` in my Dockerized mssql 2019 database I see files I cannot find while on the Docker image BUT I'm able to get `BULK_INSERT` to work with them (e.g., `SELECT * FROM OPENROWSET(BULK N'/sqlservr.computepool.hiv', SINGLE_BLOB) AS value`). So if you are able to figure out where the database _thinks_ the root directory is _and_ get files there it may work. I would love to know :) – eebbesen Feb 05 '20 at 18:29

4 Answers


I had a lot of trouble with BULK INSERT on Linux, where a valid path name was rejected:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/tmp/data.txt'"
Msg 12703, Level 16, State 1, Server mssql, Line 1
Referenced external data source "(null)" not found.

# ll /tmp/data.txt
-rw-r--r-- 1 root root 30M Feb 20 02:40 /tmp/data.txt

And the suggested workaround of replacing slashes with backslashes and prepending C:\ didn't work either:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM 'C:\\tmp\\data.txt'"
Msg 12703, Level 16, State 1, Server mssql, Line 1
Referenced external data source "(null)" not found.

So, I investigated a bit, and by running strace on the daemon I found this:

# strace -fp 3198 -e open
strace: Process 3198 attached with 175 threads
[pid  3202] open("/proc/self/status", O_RDONLY) = 170
[pid  3202] open("/proc/meminfo", O_RDONLY) = 170
[pid  3321] open("/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 175
[pid  3321] open("/tmp/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 175
[pid  3321] open("/tmp/data.txt", O_RDONLY) = 175
[pid  3321] open("/tmp/data.txt", O_RDONLY|O_DIRECT) = -1 EINVAL (Invalid argument)
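The EINVAL on that last open() can be reproduced outside SQL Server with dd, whose oflag=direct sets the same O_DIRECT flag (a sketch; /dev/shm is usually a tmpfs mount on Linux, while the current directory usually is not):

```shell
# Probe whether a directory's filesystem accepts an O_DIRECT open,
# the call the strace capture shows being rejected.
probe_o_direct() {
    if dd if=/dev/zero of="$1/probe.bin" bs=512 count=1 \
          oflag=direct 2>/dev/null; then
        echo "$1: O_DIRECT accepted"
    else
        echo "$1: O_DIRECT rejected"
    fi
    rm -f "$1/probe.bin"
}

probe_o_direct /dev/shm   # usually tmpfs: expect "rejected"
probe_o_direct "$PWD"     # regular filesystem: usually "accepted"
```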

The open() fails because O_DIRECT is not supported by the tmpfs filesystem. So I moved the file to the root and made it world accessible:

# ll /data.txt
-rw-rw-rw- 1 root root 30M Feb 20 02:28 /data.txt

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/data.txt'"
Msg 4860, Level 16, State 1, Server mssql, Line 1
Cannot bulk load. The file "/data.txt" does not exist or you don't have file access rights.

But this time the server doesn't even try to access the file. Moving the file to a world-accessible directory other than the root fixed it:

# sqlcmd -S localhost -d leak -U sa -Q "BULK INSERT leak FROM '/media/data.txt'"

(1000000 rows affected)
teknoraver

This is a bug in SQL Server on Linux with how we handle paths. We are still working through getting everything in SQL Server to handle Linux path syntax. I've filed a bug on this and we'll get it fixed. In the meantime, you can specify the path as C:\import\file.tsv, where C:\ is a placeholder for the root of the Linux file system (i.e. '/') and the slashes are reversed. I tested this exact scenario in a RHEL VM and it works fine.

Pro tip: this C:\ trick works anywhere that paths are passed in T-SQL, so if you run into this kind of issue for something else requiring paths, give it a try.

Bug #9380471 for Microsoft-internal reference.
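A small shell sketch of that substitution, applied to the path from the question; the BULK INSERT statement is shown as a comment since running it needs a live server:

```shell
# Sketch: rewrite the Linux path from the question into the C:\
# placeholder form described above.
linux_path='/import/file.tsv'
win_path="C:\\$(printf '%s' "${linux_path#/}" | tr '/' '\\')"
printf '%s\n' "$win_path"    # C:\import\file.tsv

# The statement would then become:
#   BULK INSERT countries FROM 'C:\import\file.tsv'
#   WITH (FIELDTERMINATOR = '\t');
```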

  • This does not work for me (using the `microsoft/mssql-server-linux` docker container and mapping the file into the volume). Is there any other way to get it working? – spinningarrow Mar 22 '17 at 09:51
  • This is still broken in docker. Neither the C:\ style path nor the *nix style / path work. It's pretty unbelievable that this has been left broken for so long. – Paul Wheeler Apr 30 '18 at 01:36
  • More details: I'm using docker for mac. I'm using the microsoft/mssql-server-linux:2017-latest docker image. I'm attaching volumes by passing `-v /host-path/AdventureWorks:/AdventureWorks` to `docker run`. I've verified that the files I'm trying to load exist within the container by exec'ing bash on the container. – Paul Wheeler Apr 30 '18 at 19:43
  • Still broken. I tried /data.txt, C:\data.txt and \data.txt with no success. – teknoraver Feb 20 '19 at 01:45

I have MSSQL running on Ubuntu 18 and solved the same issue by moving my CSV file into the root of the filesystem (/). It finally saw the CSV file and worked fine.

BULK INSERT Students FROM '/student.csv' WITH ( FORMAT='CSV');
Aaron

I ran into this problem when running the official SQL Server 2019 docker image on my Debian 10 PC. I found two solutions that do not require backslash substitution.

Solution 1: Mount a Shared Directory

Mount a shared directory upon running the container. Then copy or move the CSV file there so it can be accessed by SQL Server through the path as perceived by the docker image.

For example, I started the SQL Server container with this script:

#!/bin/bash
sudo docker run \
    -d \
    -e 'ACCEPT_EULA=Y' \
    -e 'SA_PASSWORD=my_password' \
    -p 1433:1433 \
    --name mssql2019 \
    -v /some/local/dir:/mnt/dir \
    mcr.microsoft.com/mssql/server:2019-latest   

Now copy or move data.csv into /some/local/dir (perceived by the container as /mnt/dir). It can be imported by the following command:

-- First create table with the correct schema (not shown here)

bulk insert TargetTable
from '/mnt/dir/data.csv'  /* path SEEN BY THE CONTAINER */
with (FORMAT = 'CSV', FIRSTROW = 2);  /* a common CSV with header row*/
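For reference, a minimal file of the shape this command expects; the columns are made up for illustration, and the header line is what FIRSTROW = 2 skips:

```shell
# Hypothetical sample: a header row (skipped by FIRSTROW = 2)
# followed by two data rows.
cat > /tmp/sample.csv <<'EOF'
id,country
1,Greece
2,Italy
EOF
wc -l < /tmp/sample.csv   # 3 lines
```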

Limitations

  • Mounting can only be done when the container is first run. If changes made to the SQL Server container are to be kept, one must stop the container, commit the changes and run the new image. This creates a new image, so be aware of the I/O cost.
  • The shared directory cannot be tmpfs (e.g. ramdisk). ntfs-3g and ext4 are fine.
  • Symlinking the file into the shared directory won't work.
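The tmpfs limitation can be checked up front; on Linux, stat -f reports the filesystem type backing a directory (a sketch assuming GNU coreutils):

```shell
# Report the filesystem type backing candidate host directories;
# "tmpfs" means SQL Server's O_DIRECT open would be rejected there.
for dir in /tmp /dev/shm "$PWD"; do
    printf '%s: %s\n' "$dir" "$(stat -f -c %T "$dir")"
done
```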

Solution 2: Copy CSV into Container

The file can also be copied into the container via docker cp. For example:

# mssql2019 = your container name 
$ sudo docker cp /path/to/file.csv mssql2019:/home/file.csv

Then the same bulk insert command in Solution 1 applies (except that the import path is now /home/file.csv).

Note that one can launch a shell into the container with:

$ (sudo) docker exec -u 0 -it mssql2019 bash

This makes it easy to create a custom folder before copying, or to verify that the file was indeed copied.

Bill Huang