28

Using Elasticsearch 5.5, I get the following error when posting this bulk request, and I cannot figure out what is wrong with the request.

"type": "illegal_argument_exception",
"reason": "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"

POST http://localhost:9200/access_log_index/access_log/_bulk

{ "index":{ "_id":11} }
{  
   "id":11,
   "tenant_id":682,
   "tenant_name":"kcc",
   "user.user_name":"k0772251",
   "access_date":"20170821",
   "access_time":"02:41:44.123+01:30",
   "operation_type":"launch_service",
   "remote_host":"qlsso.quicklaunchsso.com",
   "user_agent":"Mozilla/5.0 (Linux; Android 7.0; LGLS775 Build/NRD90U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Mobile Safari/537.36",
   "browser":"",
   "device":"",
   "application.application_id":1846,
   "application.application_name":"Desire2Learn",
   "geoip.ip":"192.95.18.163",
   "geoip.country_code":"US",
   "geoip.country_name":"United States",
   "geoip.region_code":"NJ",
   "geoip.region_name":"New Jersey",
   "geoip.city":"Newark",
   "geoip.zip_code":7102,
   "geoip.time_zone":"America/New_York",
   "geoip.latitude":40.7355,
   "geoip.longitude":-74.1741,
   "geoip.metro_code":501
}
{ "index":{"_id":12} }
{  
   "id":12,
   "tenant_id":682,
   "tenant_name":"kcc",
   "user.user_name":"k0772251",
   "access_date":"20170821",
   "access_time":"02:50:44.123+01:30",
   "operation_type":"launch_service",
   "remote_host":"qlsso.quicklaunchsso.com",
   "user_agent":"Mozilla/5.0 (Linux; Android 7.0; LGLS775 Build/NRD90U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Mobile Safari/537.36",
   "browser":"",
   "device":"",
   "application.application_id":2341,
   "application.application_name":"Gmail",
   "geoip.ip":"192.95.18.163",
   "geoip.country_code":"US",
   "geoip.country_name":"United States",
   "geoip.region_code":"NJ",
   "geoip.region_name":"New Jersey",
   "geoip.city":"Newark",
   "geoip.zip_code":7102,
   "geoip.time_zone":"America/New_York",
   "geoip.latitude":40.7355,
   "geoip.longitude":-74.1741,
   "geoip.metro_code":501
}
halfer
  • 3
    Your documents must be on a single line, no newlines are allowed within them. – Val Aug 21 '17 at 08:20
  • Please read [Under what circumstances may I add “urgent” or other similar phrases to my question, in order to obtain faster answers?](//meta.stackoverflow.com/q/326569) - the summary is that this is not an ideal way to address volunteers, and is probably counterproductive to obtaining answers. Please refrain from adding this to your questions. – halfer Aug 21 '17 at 10:17

4 Answers

48

Your resource objects have to be specified on a single line, like this:

POST /test322/type/_bulk
{ "index": {} }
{ "name": "Test1", "data": "This is my test data" }
{ "index": {} }
{ "name": "Test2", "data": "This is my test data2" }

I know this seems really unintuitive, since resources don't have to be on a single line when you create them using PUT or POST for non-bulk operations.
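
A minimal sketch of producing such a body in Python, assuming the documents are already available as dictionaries. The helper name `to_bulk_body` is hypothetical and not part of any Elasticsearch client; it relies only on `json.dumps`, which writes each object on a single line when no `indent` is given.

```python
import json

def to_bulk_body(docs):
    """Serialize (doc_id, source) pairs as newline-delimited JSON for _bulk."""
    lines = []
    for doc_id, source in docs:
        # Action/metadata line first, then the document source, one line each.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The body has to end with a newline after the last line.
    return "\n".join(lines) + "\n"

body = to_bulk_body([
    (11, {"id": 11, "tenant_id": 682, "tenant_name": "kcc"}),
    (12, {"id": 12, "tenant_id": 682, "tenant_name": "kcc"}),
])
print(body, end="")
```

The resulting string can be sent as the request body to the `_bulk` endpoint with whatever HTTP client you already use.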

Neutrino
  • 2
    This saved me a few hours at least.. Thank you ;) – Rahul Ranjan Sep 24 '18 at 07:22
  • 6
    Also, it expects a line end at the last. – anand Apr 17 '19 at 21:26
  • 2
    That doesn't match an actual [example on Elastic](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/6.x/bulk_examples.html), but then their documentation is — in places — not optimal. – Wayne Smallman Jul 27 '19 at 14:39
  • 2
    This is such a bizarre API, the whole elasticsearch API seems to be written by data scientist rather than programmers... A less intuitive api is hard to find I think... – Operator Aug 29 '21 at 16:14
  • @Operator agree about bizarre API - disagree about data scientists – jtlz2 Jun 08 '22 at 11:53
  • 2
    @Operator This particular API simply supports `ndjson` rather than straight `json` - see https://www.ndjson.org – jtlz2 Jun 09 '22 at 07:43
3

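The following line format worked very well for me: action and metadata on one line, then the resource on the next.

Note: I used the `create` action to add a resource to the dataset (`index` also works, as in the other answers), and the resource must be written inline on one line, NOT spread over new lines.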

  POST http://localhost:9200/access_log_index/access_log/_bulk
  { "create" : { "_index" : "test", "_type" : "_doc", "_id" : "11" } }
  {  "id":11, "tenant_id":682 , ... }
Enayat
2

You need to follow the bulk format for this to execute successfully. It expects the following newline-delimited JSON structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

For further reference, see this link https://www.elastic.co/guide/en/elasticsearch/reference/6.2/docs-bulk.html
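
As a rough local pre-check of that structure, the sketch below reports the first line of a bulk body that does not parse as a standalone JSON object. The function name `first_malformed_line` is hypothetical, and this is not how Elasticsearch itself validates requests, so the number it returns will not necessarily match the line number in the server's error message. It also does not check that action and source lines are correctly paired.

```python
import json

def first_malformed_line(body):
    """Return the 1-based number of the first line that is not a JSON object, or None."""
    for number, line in enumerate(body.splitlines(), start=1):
        try:
            value = json.loads(line)
        except ValueError:
            # Covers pretty-printed fragments such as a lone "{" and blank lines.
            return number
        if not isinstance(value, dict):
            return number
    return None

good = '{"index":{}}\n{"name":"Test1"}\n'
bad = '{"index":{"_id":11}}\n{\n"id":11\n}\n'
print(first_malformed_line(good))
print(first_malformed_line(bad))
```

A pretty-printed document, like the one in the question, fails this check because its opening brace sits alone on a line.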

Toby Speight
1

I was getting the error below when manipulating the data set before pushing it to Elasticsearch (in a serverless environment you sometimes cannot be certain of the order in which events are received):

 {
      "type": "illegal_argument_exception",
      "reason": "Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_BOOLEAN]"
 }

This happens when you change the sequence of entries that the Elasticsearch bulk API expects. For example, when you use 'doc_as_upsert': true, you should have the data in the following form:

{
    "update": {
        "_index": "index_name",
        "_id": "id1234"
    }
}
,
{
    "doc_as_upsert": true,
    "doc": {
        "id": "id1234",
        ...otherFields
    }
}

You cannot skip the update object in this case.
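
A minimal sketch of emitting that pair as two bulk lines in Python. The helper name `upsert_lines` is hypothetical; the index name and id are the placeholders from the example above, and on older versions a `_type` may also be needed in the metadata or in the URL.

```python
import json

def upsert_lines(index_name, doc_id, doc):
    """Return the action line and the payload line for one upsert, in order."""
    action = {"update": {"_index": index_name, "_id": doc_id}}
    payload = {"doc_as_upsert": True, "doc": doc}
    # The update action line must come first; the payload line follows it.
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"

pair = upsert_lines("index_name", "id1234", {"id": "id1234"})
print(pair, end="")
```

Keeping both lines inside one helper makes it harder to drop or reorder the action line when the dataset is filtered or sorted before sending.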

I hope this will be helpful for those who manipulate the dataset before it goes to ES.

rushabh_trivedi