
I am new to Elasticsearch and have been entering data manually up until this point. For example I've done something like this:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}'

I now have a .json file and I want to index this into Elasticsearch. I've tried something like this too, but no success:

curl -XPOST 'http://jfblouvmlxecs01:9200/test/test/1' -d lane.json

How do I import a .json file? Are there steps I need to take first to ensure the mapping is correct?

Chiel
Shawn Roller
  • Possible duplicate of [is there any way to import a json file(contains 100 documents) in elasticsearch server.?](http://stackoverflow.com/questions/20646836/is-there-any-way-to-import-a-json-filecontains-100-documents-in-elasticsearch) – shailendra pathak Dec 03 '15 at 17:57

14 Answers


The right command if you want to use a file with curl is this:

curl -XPOST 'http://jfblouvmlxecs01:9200/test/_doc/1' -d @lane.json

Elasticsearch is schemaless, therefore you don't necessarily need a mapping. If you send the JSON as it is with the default mapping, every field will be indexed and analyzed using the standard analyzer.

If you want to interact with Elasticsearch through the command line, you may want to have a look at the elasticshell which should be a little bit handier than curl.

2019-07-10: It should be noted that custom mapping types are deprecated and should not be used. I updated the type in the URL above to make it easier to see which was the index and which was the type, as having both named "test" was confusing.

javanna
  • It doesn't work for me; when I type your command the console doesn't return any data. – Konrad Dec 03 '13 at 11:16
  • @Konrad you replaced `jfblouvmlxecs01` with `localhost`, right? – Ehtesh Choudhury Jul 29 '14 at 21:42
  • What is the '@' before the file name used for? – clwen Sep 04 '14 at 20:12
  • clwen - the "@" tells curl to load the data from the json file. – Oliver Oct 22 '14 at 20:18
  • Hi, I am also new to Elasticsearch. Can anyone please guide me on where to store these .json files? – swaheed Nov 20 '14 at 11:00
  • You probably at least want to use scan/scroll so you can lock a `snapshot` of your data as you export – Evan Aug 14 '15 at 12:37
  • Where do I store the json file? – AV94 Oct 30 '15 at 06:05
  • @yogesh-darji Store it in your repo, or put it anywhere on the system where curl is running so curl can read it. curl reads the file and sends its contents to the URL; that is curl's job. – tgkprog Jan 17 '17 at 11:44
  • How do you tell which name comes after the port of elasticsearch? Why do you put `test/test1`? – HomeMade Mar 12 '17 at 15:38
  • @HomeMade First `test` is the index and second `test` is the type, finally followed by an `id` 1. See this : https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#docs-index_ . – Shubham A. Apr 20 '17 at 07:44
  • Hi, I get a response in JSON format from a URL; now I want to store that JSON from the remote URL into Elasticsearch. How can this be done? Please help. – Jasbin karki Oct 30 '18 at 16:43
  • @all I have an analyzer in my json file; now I want to execute that file. How can I do that? – Dhwanil Patel May 20 '20 at 05:43
  • On my Windows machine curl is stored at system32, and it doesn't allow me to place my file there. So how can I set up the full path to where my file is stored? – Dhwanil Patel May 20 '20 at 05:44
  • When I execute this it gives a body not found error: curl -H "Content-Type:application/json" -XPOST "http://localhost:9200/document/_setting?pretty" -d @Elasticsearch_Spacial_Character_Analyzer.json – Dhwanil Patel May 20 '20 at 05:45

Per the current docs, https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html:

If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn’t preserve newlines.

Example:

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests
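To see concretely why the flag matters, here's a minimal stdlib Python sketch (the index and document are made up for illustration) of the difference between what the two flags send:

```python
# Sketch of why plain -d breaks a bulk body: curl's -d strips newlines
# from @file input, while --data-binary sends the bytes verbatim.
bulk_body = (
    '{"index":{"_index":"test","_id":"1"}}\n'
    '{"user":"kimchy","message":"trying out Elastic Search"}\n'
)

# What --data-binary sends: newline-delimited lines, which _bulk can
# split into an action line and a source line.
assert bulk_body.count("\n") == 2

# What plain -d sends: newlines removed, so the boundary between the
# action line and the document is gone and the request is unparseable.
stripped = bulk_body.replace("\n", "")
assert "\n" not in stripped
```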
Conrado
KenH
  • Note that the _bulk load json file is not a valid json file; the syntax is provided in the _bulk API link. Also, you do not have to provide an _id as indicated in these examples; an auto-generated _id will be provided when _id is omitted. – Steve Tarver May 06 '17 at 17:26

One thing I've not seen anyone mention: for every line of the "pure" JSON file, the file must contain a preceding line specifying which index the next line belongs to.

I.e.:

{"index":{"_index":"shakespeare","_type":"act","_id":0}}
{"line_id":1,"play_name":"Henry IV","speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}

Without that, nothing works, and it won't tell you why.
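The interleaving can be sketched in a few lines of Python (the function name is illustrative, and the field names follow the shakespeare example above):

```python
import json

def to_bulk(docs, index, doc_type="act"):
    """Interleave an action metadata line before each document,
    producing the newline-delimited body the _bulk API expects."""
    lines = []
    for i, doc in enumerate(docs):
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": i}}))
        lines.append(json.dumps(doc))
    # _bulk also requires the body to end with a newline.
    return "\n".join(lines) + "\n"

docs = [{"line_id": 1, "play_name": "Henry IV", "text_entry": "ACT I"}]
body = to_bulk(docs, "shakespeare")
```

Writing `body` to a file and POSTing it with `--data-binary` gives the same shape as the shakespeare sample file.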

Greg Dougherty

We made a little tool for this type of thing: https://github.com/taskrabbit/elasticsearch-dump

Evan

I'm the author of elasticsearch_loader
I wrote ESL for this exact problem.

You can download it with pip:

pip install elasticsearch-loader

And then you will be able to load json files into elasticsearch by issuing:

elasticsearch_loader --index incidents --type incident json file1.json file2.json
MosheZada
  • This is nice! It adds the mandatory `index` line before every document. – dr0i Apr 06 '18 at 14:59
  • 2018-10-04 11:51:40.395741 ERROR attempt [1/1] got exception, it is a permanent data loss, no retry any more 2018-10-04 11:51:40.395741 WARN Chunk 0 got exception (ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10.0))) while processing – Chiel Oct 04 '18 at 09:52
  • Apart from the fact that it doesn't work, where do you specify the URL and port? – Chiel Oct 04 '18 at 09:53
  • You can visit the GitHub page or run `elasticsearch_loader --help` in order to view the full help message. You can specify the host:port with `--es-host http://hostname:port` – MosheZada Oct 07 '18 at 19:57
  • Nice. Except that `--type` becomes redundant, as Elasticsearch removes types in version 6 https://www.elastic.co/guide/en/elasticsearch/reference/6.0/removal-of-types.html – Vlad T. Mar 20 '19 at 12:25
  • According to https://www.elastic.co/guide/en/elasticsearch/reference/6.7/removal-of-types.html Types will be deprecated in APIs in Elasticsearch 7.0.0, and completely removed in 8.0.0. I've added _doc as a default type, I will add tests for version 7 – MosheZada Mar 20 '19 at 18:52

I just made sure that I am in the same directory as the json file and then simply ran this

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/product/default/_bulk?pretty --data-binary @product.json

So make sure you too are in the same directory and run it this way. Note: product/default/ in the command is specific to my environment; you can omit it or replace it with whatever is relevant to you.

Gajendra D Ambi

Adding to KenH's answer

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests

You can replace @requests with @complete_path_to_json_file

Note: the @ before the file path is important.

filhit
Ram Pratap

Just get Postman from https://www.getpostman.com/docs/environments and give it the file location with the /test/test/1/_bulk?pretty command.

Piyush Mittal

You are using

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests

If 'requests' is a json file then you have to change this to

$ curl -s -XPOST localhost:9200/_bulk --data-binary @requests.json

Now before this, if your json file is not indexed, you have to insert an index line before each line inside the json file. You can do this with jq; see this link: http://kevinmarsh.com/2014/10/23/using-jq-to-import-json-into-elasticsearch.html

Go to the Elasticsearch tutorials (for example the Shakespeare tutorial), download the sample json file used, and have a look at it. In front of each json object (each individual line) there is an index line. This is what you are looking for after using the jq command. This format is mandatory for the bulk API; plain json files won't work.

MLS

As of Elasticsearch 7.7, you have to specify the content type also:

curl -s -H "Content-Type: application/json" -XPOST localhost:9200/_bulk --data-binary @<absolute path to JSON file>
thSoft

I wrote some code to expose the Elasticsearch API via a Filesystem API.

It is useful, for example, for clean export/import of data.

I created a prototype, elasticdriver. It is based on FUSE.


Eric Leschinski
Yaroslav Gaponov
  • If you are using Elasticsearch 7.7 or above, use the command below.

    curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary @"/Users/waseem.khan/waseem/elastic/account.json"

  • The file path above is /Users/waseem.khan/waseem/elastic/account.json.

  • If you are using Elasticsearch 6.x, you can use the command below.

curl -X POST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary @"/Users/waseem.khan/waseem/elastic/account.json" -H 'Content-Type: application/json'

Note: Make sure you add one empty line at the end of your .json file, otherwise you will get the exception below.

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "The bulk request must be terminated by a newline [\n]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "The bulk request must be terminated by a newline [\n]"
  },
  "status" : 400
}
waseem khan

If you are using Ubuntu (inside VirtualBox or natively), this can be useful:

wget https://github.com/andrewvc/ee-datasets/archive/master.zip
sudo apt-get install unzip   # only if unzip is not already installed
unzip master.zip
cd ee-datasets
java -jar elastic-loader.jar http://localhost:9200 datasets/movie_db.eloader
sudarshan

If you want to import a json file into Elasticsearch and create an index, use this Python script.

import json
from elasticsearch import Elasticsearch

# Connect to the local cluster.
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

# Load the file as a JSON array and index each document with an
# incrementing id.
with open('el_dharan.json') as raw_data:
    json_docs = json.load(raw_data)

for i, json_doc in enumerate(json_docs, start=1):
    es.index(index='ind_dharan', doc_type='doc_dharan', id=i, body=json_doc)
Mahan
  • Highly not recommended with a large number of documents: since it does one insert request per document, this is incredibly unperformant. – fraank Feb 22 '22 at 12:55
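As the comment notes, one request per document is slow. The elasticsearch Python client ships a batched alternative, `elasticsearch.helpers.bulk`; a sketch of building the actions it consumes (`make_actions` is a hypothetical helper, and the commented-out lines assume a cluster at localhost:9200 plus the file and index names from the answer above):

```python
import json

def make_actions(path, index):
    """Yield one bulk-helper action per document in the JSON array file."""
    with open(path) as raw_data:
        for i, doc in enumerate(json.load(raw_data), start=1):
            yield {"_index": index, "_id": i, "_source": doc}

# With a running cluster, the whole file goes out in batched requests:
# from elasticsearch import Elasticsearch, helpers
# es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
# helpers.bulk(es, make_actions('el_dharan.json', 'ind_dharan'))
```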