13

Valid JSON can naturally contain the backslash character: \. When I insert such data in a SQL statement, like so:

sidharth=# create temp table foo(data json);
CREATE TABLE
sidharth=# insert into foo values( '{"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }');
INSERT 0 1

sidharth=# select * from foo;
                        data
-----------------------------------------------------
 {"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }
(1 row)

Things work fine.

But if I copy the JSON to a file and run the copy command I get:

sidharth=# \copy foo from './tests/foo' (format text);

ERROR:  invalid input syntax for type json
DETAIL:  Token "mary" is invalid.
CONTEXT:  JSON data, line 1: {"foo":"bar", "bam": "{"mary...
COPY foo, line 1, column data: "{"foo":"bar", "bam": "{"mary": "had a lamb"}" }"

Seems like Postgres is not processing the backslashes. I think this is because of http://www.postgresql.org/docs/8.3/interactive/sql-syntax-lexical.html, and I am forced to use double backslashes instead. And that works, i.e. when the file contents are:

{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }  

The copy command works. But is it correct to expect special treatment for the json data type? After all, the content above is not valid JSON.
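For reference, a sketch of the round trip with the doubled-backslash file (same temp table and path as above); the text-format load decodes \\ back to \, so the stored value comes out identical to the original INSERT:

sidharth=# \copy foo from './tests/foo' (format text)
sidharth=# select * from foo;
                        data
-----------------------------------------------------
 {"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }
(1 row)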

Sid
  • See https://stackoverflow.com/questions/52334512/postgres-copy-with-command-strips-escapes-from-json-file-also-loads-each-line-a/52355439#52355439. I think this explains how to solve this. – mark d drake Oct 01 '18 at 03:49
  • Does this answer your question? [How can I import a JSON file into PostgreSQL?](https://stackoverflow.com/questions/39224382/how-can-i-import-a-json-file-into-postgresql) – tripleee Jan 07 '21 at 08:04

2 Answers

22

http://adpgtech.blogspot.ru/2014/09/importing-json-data.html

copy the_table(jsonfield) 
from '/path/to/jsondata' 
csv quote e'\x01' delimiter e'\x02';
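
The reason this works: in csv mode COPY does no backslash processing, and choosing a quote and delimiter (here the control characters \x01 and \x02) that never appear in the data means each line of the file is read verbatim as a single json value. A minimal sketch, assuming the table from the question and a file holding one JSON document per line:

sidharth=# \copy foo(data) from './tests/foo' csv quote e'\x01' delimiter e'\x02'
sidharth=# select * from foo;
                        data
-----------------------------------------------------
 {"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }
(1 row)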
Vlad Purga
  • Please provide more context for a link you post in an answer, and for the code snippet. [How to Answer](http://stackoverflow.com/help/how-to-answer) – AgataB Sep 09 '16 at 10:15
  • 2
    This is, at the time of writing, the correct answer to the OP's question: how to read json from the filesystem *without* having to apply special escape syntax (thus rendering the json invalid). Unfortunately most users will pass over the answer due to downvotes and low SO rep. Answer could be better but it was clearly just a drive by from a non-SO user... – virtualeyes Oct 10 '16 at 06:05
6

PostgreSQL's default bulk-load format, text, is a tab-separated format. It requires backslashes to be escaped because they have special meaning, for example in the \N null placeholder.

Observe what PostgreSQL generates:

regress=> COPY foo TO stdout;
{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }

This isn't a special case for json at all; it's true of any string. Consider, for example, that a string (including json) might contain embedded tabs. Those must be escaped to prevent them from being read as field separators.

You'll need to generate your input data properly escaped. Rather than trying to use the PostgreSQL-specific text format, it'll generally be easier to use format csv with a tool that writes correct CSV, so the escaping is done for you when the file is written.
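
For example (a sketch, using an assumed ./tests/foo.csv file name): written as proper CSV, the whole field is quoted and every embedded double quote is doubled, while the backslashes pass through untouched, so COPY in csv mode reconstructs the original JSON text. The file would contain:

"{""foo"":""bar"", ""bam"": ""{\""mary\"": \""had a lamb\""}"" }"

and loading it with

sidharth=# \copy foo from './tests/foo.csv' csv

stores exactly the same value as the INSERT in the question.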

Craig Ringer
  • I don't understand the answer. Specifically why the escaping is not done by PostgreSQL? It sounds like PostgreSQL accepts a tab separated file (that could contain JSON within), rather than a purely JSON file. – Chris Stryczynski Jul 09 '17 at 13:01
  • @ChrisStryczynski That's correct. It's TSV. `COPY` isn't used for loading single fields, it's for loading *tables*. – Craig Ringer Jul 10 '17 at 01:28