
I want to use the Python 3 module urllib to access an Elasticsearch database at localhost:9200. My script gets a valid request (generated by Kibana) piped to STDIN in JSON format.

Here is what I did:

import json
import sys
import urllib.parse
import urllib.request

er = json.load(sys.stdin)
data = urllib.parse.urlencode(er)
data = data.encode('ascii')
uri = urllib.request.Request('http://localhost:9200/_search', data)
with urllib.request.urlopen(uri) as response:
    response.read()

(I understand that my response.read() doesn't make much sense by itself, but I just wanted to keep it simple.)

When I execute the script, I get an

HTTP Error 400: Bad request

I am very sure that the JSON data I'm piping to the script is correct, since I had it printed and fed it via curl to Elasticsearch, and got back the documents I expected to get back.

Any ideas where I went wrong? Am I using urllib correctly? Do I maybe mess up the JSON data in the urlencode line? Am I querying Elasticsearch correctly?

Thanks for your help.

eins6180
  • You probably need to specify a content type; see https://docs.python.org/3/library/urllib.request.html#urllib.request.Request. If you don't specify a Content-Type, it defaults to application/x-www-form-urlencoded, which isn't what you sent. If you don't mind using an external library, requests (http://docs.python-requests.org/en/master/) makes this a little easier. – Corley Brigman Jul 06 '17 at 17:12
  • Could you provide an example of the data object that you pass to Elasticsearch? By the way, I use the requests library to query ES. It's super straightforward. Just curious: why use Kibana to create the payload (data), and what do you intend to do with the response once you get past the 400? – jlaur Jul 06 '17 at 20:37
  • @CorleyBrigman: I wish I could use the requests library. Unfortunately, I am working in a high-security environment and they are very reluctant to install anything more than what is strictly needed. – eins6180 Jul 07 '17 at 03:22
  • @jlaur: The data is normally not created with Kibana, I just did it for testing purposes. And I don't know what they plan to do with the extracted data (my goal is to simply extract it from the shell via this script, process it a little further, and that's it). – eins6180 Jul 07 '17 at 03:26
  • I would start by adding `headers={'Content-Type': 'application/json'}` to your request then. I think the error is just that you are passing JSON, but because you passed data with no header, the content type is set to 'application/x-www-form-urlencoded' instead, and it doesn't match. – Corley Brigman Jul 07 '17 at 14:39
  • Take a look at the code examples in the Elasticsearch documentation for various operations. There you will find a `COPY AS CURL` option below each code sample. That gives you the curl request for that operation, which shows the headers the request needs. I suggest doing this for whichever operation you need and replicating it with the requests or urllib library (requests is much better, with RESTful interactions built in). Or just convince them to use the Python Elasticsearch client (less pain while implementing, if you can install it). – Divij Sehgal Jul 10 '17 at 09:30
  • https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html - Copy/Paste `COPY AS CURL` on a notepad to see all that is needed for the request. – Divij Sehgal Jul 10 '17 at 09:31
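The Content-Type point raised in the comments can be checked without a server. The following is a minimal sketch using only the standard library; the `query` dictionary is a made-up stand-in for whatever `json.load(sys.stdin)` returns. It shows that `urllib.parse.urlencode` produces a form-encoded string rather than JSON, while `json.dumps` produces a JSON document.

```python
import json
import urllib.parse

# Hypothetical query body, standing in for what json.load(sys.stdin) returns.
query = {"query": {"match_all": {}}, "size": 1}

# What the script in the question sends: key=value pairs, not JSON.
form_body = urllib.parse.urlencode(query).encode("ascii")

# What a JSON-speaking endpoint expects: the serialized JSON document.
json_body = json.dumps(query).encode("utf-8")

print(form_body)
print(json_body)
```

Printing both bodies side by side makes it easy to compare them with what curl sends for the same query.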

2 Answers


With requests you can do one of two things:

1) Either you create the string representation of the json object yourself and send it off like so:

payload = {'param': 'value'}
response = requests.post(url, data=json.dumps(payload))

2) Or you have requests do it for you like so:

payload = {'param': 'value'}
response = requests.post(url, json=payload)

Which one to use depends on what actually arrives on sys.stdin. Since Kibana generated the request for Elasticsearch, it is probably already the string representation of a JSON object (the equivalent of calling json.dumps on a dictionary), but you might have to adjust a bit depending on that input.

My guess is that your code could work by just doing so:

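import sys
import requests

# stdin already holds the JSON text, so send it unchanged and label it as JSON
payload = sys.stdin.read().encode('utf-8')
response = requests.post('http://localhost:9200/_search', data=payload,
                         headers={'Content-Type': 'application/json'})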

And if you then want to do some work with it in Python, requests has built-in support for this too. You just call this:

json_response = response.json()

Hope this helps you get on the right track. For further reading on json.dumps/loads, this answer has some good stuff on it.
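If requests cannot be installed, the same decoding step can be done with the standard library alone. This is only a sketch: the `raw` bytes below are a made-up, heavily trimmed response body standing in for what `response.read()` would return from a search request.

```python
import json

# Hypothetical, trimmed-down response body, for illustration only.
raw = b'{"hits": {"hits": [{"_id": "1", "_source": {"msg": "hello"}}]}}'

# Rough stdlib equivalent of requests' response.json()
result = json.loads(raw.decode("utf-8"))

# Matching documents sit under hits.hits; keep only their source fields.
sources = [hit["_source"] for hit in result["hits"]["hits"]]
print(sources)
```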

jlaur
  • Thanks! If I fail to get my script working with urllib, I will try to convince them to install requests. But I'm very sceptical that they will follow this suggestion. – eins6180 Jul 07 '17 at 03:28
  • Aaah. Check out this SO question on how to do a POST request with json payload using urllib then: https://stackoverflow.com/a/4998300/8240959 and this one https://stackoverflow.com/a/9746432/8240959 – jlaur Jul 07 '17 at 04:37

For anyone who doesn't want to use requests (for example, if you're using IronPython, where it's not supported):

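# Python 2 / IronPython only: urllib2 does not exist in Python 3
import urllib2
import json

# data is the query as a dictionary; json.dumps turns it into the request body
req = urllib2.Request(url, json.dumps(data), headers={'Content-Type': 'application/json'})
response = urllib2.urlopen(req)

Here 'url' can be something like this (the example below searches a single index):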

http://<elasticsearch-ip>:9200/<index-name>/_search/
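Since the question is about Python 3, where `urllib2` was folded into `urllib.request`, the same idea might look roughly like this. It is a sketch rather than a tested Elasticsearch client: the helper names `build_search_request` and `run_search` are made up for illustration, and the URL and query at the bottom are placeholders.

```python
import json
import urllib.request


def build_search_request(url, query):
    """Build a POST request carrying `query` as a JSON body (hypothetical helper)."""
    body = json.dumps(query).encode("utf-8")  # urllib in Python 3 needs bytes
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(url, query):
    """Send the request and decode the JSON reply (needs a reachable server)."""
    with urllib.request.urlopen(build_search_request(url, query)) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder URL and query, for illustration only; nothing is sent here.
req = build_search_request("http://localhost:9200/_search",
                           {"query": {"match_all": {}}})
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it possible to inspect the method, headers, and body before anything goes over the wire, which helps when tracking down a 400 response.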
Ronen Ness