82

I'm trying to GET an URL of the following format using requests.get() in python:

http://api.example.com/export/?format=json&key=site:dummy+type:example+group:wheel

#!/usr/local/bin/python

import requests

print(requests.__versiom__)
url = 'http://api.example.com/export/'
payload = {'format': 'json', 'key': 'site:dummy+type:example+group:wheel'}
r = requests.get(url, params=payload)
print(r.url)

However, the URL gets percent encoded and I don't get the expected response.

2.2.1
http://api.example.com/export/?key=site%3Adummy%2Btype%3Aexample%2Bgroup%3Awheel&format=json

This works if I pass the URL directly:

url = http://api.example.com/export/?format=json&key=site:dummy+type:example+group:wheel
r = requests.get(url)

Is there some way to pass the the parameters in their original form - without percent encoding?

Thanks!

Satyen Rai
  • 1,493
  • 1
  • 12
  • 19
  • 1
    It is a [standard](http://en.wikipedia.org/wiki/Percent-encoding). What is wrong with it? – alecxe May 06 '14 at 13:58
  • 6
    @alecxe: The site I'm querying doesn't seem to work with percent encoded URLs and I get unexpected response. – Satyen Rai May 06 '14 at 14:18
  • 3
    I got this problem with Google Maps API and comma in `location=43.585278,39.720278` and I didn't find solution. – furas May 06 '14 at 14:28
  • Ran into the same problem as the OP. A json api I am "forced" to use doesn't like the url passed to it to be encoded. Had to build a string. Didn't know about the safe=':+' option listed below. – waltmagic Dec 18 '22 at 07:41

7 Answers7

94

It is not good solution but you can use directly string:

r = requests.get(url, params='format=json&key=site:dummy+type:example+group:wheel')

BTW:

Code which convert payload to this string

payload = {
    'format': 'json', 
    'key': 'site:dummy+type:example+group:wheel'
}

payload_str = "&".join("%s=%s" % (k,v) for k,v in payload.items())
# 'format=json&key=site:dummy+type:example+group:wheel'

r = requests.get(url, params=payload_str)

EDIT (2020):

You can also use urllib.parse.urlencode(...) with parameter safe=':+' to create string without converting chars :+ .

As I know requests also use urllib.parse.urlencode(...) for this but without safe=.

import requests
import urllib.parse

payload = {
    'format': 'json', 
    'key': 'site:dummy+type:example+group:wheel'
}

payload_str = urllib.parse.urlencode(payload, safe=':+')
# 'format=json&key=site:dummy+type:example+group:wheel'

url = 'https://httpbin.org/get'

r = requests.get(url, params=payload_str)

print(r.text)

I used page https://httpbin.org/get to test it.

furas
  • 134,197
  • 12
  • 106
  • 148
  • Thanks, That's what I'm currently doing to make it work. I'm looking for a solution similar to the (obsolete) one described [here](https://groups.google.com/forum/#!topic/scraperwiki/spEFwwzxrQA). Thanks anyway! – Satyen Rai May 06 '14 at 15:18
  • I was looking for better solution (similar to the obsolete one) in requests source code but I didn't find it. – furas May 06 '14 at 15:28
  • 1
    worked for me. seemingly not great, but gets the job done. i thought there might be some easier solution by adjusting the encoding within the `requests` object. – ryantuck Feb 18 '15 at 22:30
  • I use "%XX" where XX are hex digits. Sending strings for params works until I try to send something larger than 2F, at which point I get an "Invalid control character" error – retsigam Aug 15 '18 at 22:19
  • `urllib.parse.urlencode` is not ignoring curly braces during parsing. `self.response = requests.get(SteamQuery.queries[self.query_type], params=urllib.parse.urlencode(self.query_params,safe=":{}[]"))` `input_json=%7Bappids_filter:[892970]%7D` – user1023102 Feb 26 '21 at 03:10
  • For me, setting a string as params causes an error due to the ```&``` symbols. I need to use ```requests.get()``` with the base url, some parameters, and some (user, pw) argument to ```auth```. The result is ```Warning {"success":false,"code":400,"message":"Parameter ... of value ... violated a constraint... value, does not match requirements ([0-9]+,?)+``` – Manuel Popp May 10 '23 at 09:45
14

In case someone else comes across this in the future, you can subclass requests.Session, override the send method, and alter the raw url, to fix percent encodings and the like. Corrections to the below are welcome.

import requests, urllib

class NoQuotedCommasSession(requests.Session):
    def send(self, *a, **kw):
        # a[0] is prepared request
        a[0].url = a[0].url.replace(urllib.parse.quote(","), ",")
        return requests.Session.send(self, *a, **kw)

s = NoQuotedCommasSession()
s.get("http://somesite.com/an,url,with,commas,that,won't,be,encoded.")
Tim Ludwinski
  • 2,704
  • 30
  • 34
  • I know this wasn't in the OP's question but this doesn't work for the path portion of the URL (at the time of this comment). – Tim Ludwinski Apr 12 '21 at 20:20
  • 2
    In modern versions of requests, you actually also are gonna have to patch `urllib3`; it performs its own encoding. `requests.urllib3.util.url.PATH_CHARS.add(',')`. This starts to get into "more hacky than it's probably worth" territory, but if you _REALLY_ need it... here it is – ollien Nov 05 '21 at 15:31
13

The answers above didn't work for me.

I was trying to do a get request where the parameter contained a pipe, but python requests would also percent encode the pipe. So instead i used urlopen:

# python3
from urllib.request import urlopen

base_url = 'http://www.example.com/search?'
query = 'date_range=2017-01-01|2017-03-01'
url = base_url + query

response = urlopen(url)
data = response.read()
# response data valid

print(response.url)
# output: 'http://www.example.com/search?date_range=2017-01-01|2017-03-01'
kujosHeist
  • 810
  • 1
  • 11
  • 25
12

The solution, as designed, is to pass the URL directly.

Kenneth Reitz
  • 8,559
  • 4
  • 31
  • 34
  • 1
    The idea behind using the payload dictionary to keep the actual code somewhat cleaner - as suggested [here](http://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls). – Satyen Rai May 06 '14 at 15:12
  • 8
    I found this old comment by @Darkstar kind of funny as the answer he's responding to is by the author of `requests`. – Dustin Wyatt Jul 14 '16 at 16:53
  • @DustinWyatt Wow! I don't know how I missed that! – Satyen Rai Jul 14 '16 at 16:57
  • 1
    This is the most straightforward and verified working solution. Ditch the payload dictionary and slap all those parameters right into the url. – Kurt Oct 22 '20 at 03:12
  • 5
    No this will not work, `requests` of latest version will encode the characters even if you pass the URL directly. – oeter Nov 25 '21 at 12:36
  • @oeter: I don't find this to be true. See [toy example](https://imgur.com/a/sahmIWD). – Hendy Apr 11 '23 at 00:57
4

All above solutions don't seem to work anymore from requests version 2.26 on. The suggested solution from the GitHub repo seems to be using a work around with a PreparedRequest.

The following worked for me. Make sure the URL is resolvable, so don't use 'this-is-not-a-domain.com'.

import requests

base_url = 'https://www.example.com/search'
query = '?format=json&key=site:dummy+type:example+group:wheel'

s = requests.Session()
req = requests.Request('GET', base_url)
p = req.prepare()
p.url += query
resp = s.send(p)
print(resp.request.url)

Source: https://github.com/psf/requests/issues/5964#issuecomment-949013046

LGG
  • 101
  • 1
  • 5
0

Please have a look at the 1st option in this github link. You can ignore the urlibpart which means prep.url = url instead of prep.url = url + qry

Sandeep Kanabar
  • 1,264
  • 17
  • 33
0

Requests and Urllib3 don't allow raw URLs anymore!

  • Since those libraries enforce RFC3986 url encoding implicitly.
  • Previously available workarounds are now defunct:
    • Even the prepped request workaround doesn't work anymore!

=> To perform raw path requests another library must be used like "http".

Raw Url Request Wrapper

  • Wrapping the http library provides similar request methods but will not implicitly modify urls, thus allows to send urls as is
  • Code:
import http.client
from urllib.parse import urlparse

class HttpWrapper:

    def parse_url(self, url):
        p_url = urlparse(url)
        host = p_url.hostname
        port = p_url.port
        path = p_url.path
        assert host
        assert port
        
        return host, port, path

    def request(self, url, method, headers={}, data=None):
        # Parse url
        host, port, path = self.parse_url(url)

        # Perform request
        try:
            conn = http.client.HTTPConnection(host, port)
            conn.request(method, path, data, headers)
            response = conn.getresponse()
            return response
        finally:
            conn.close()
    
    def get(self, url, headers):
        return self.request(url, "GET", headers)

    def post(self, url, headers, data):
        return self.request(url, "POST", headers, data)

    def put(self, url, headers, data):
        return self.request(url, "PUT", headers, data)
    
    def delete(self, url, headers):
        return self.request(url, "DELETE", headers)
SaturnIC
  • 17
  • 4