10

When a multi-word search term is entered on eBay, the resulting URL looks something like this (for example, "demarini cf5 cf12"):

http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=demarini%20cf5%20cf12

I wish to construct this URL in Python so it can be accessed directly. So it's a case of concatenating the base URL:

http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw=

... with the search term. Right now, I am adding the %20 for the spaces explicitly, thus:

baseUrl = 'http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw='
searchTerm = 'demarini cf5 cf12'
searchTerm = ('%20').join(searchTerm.split(' '))
finalUrl = baseUrl + searchTerm

What is a more formal way of doing this in Python? I believe the name for this sort of task is URL encoding?

Pyderman
  • 1
    You probably want `urllib.quote()`. – larsks Sep 24 '15 at 13:11
  • 1
    The *proper* way of encoding a space in the *query string* of a URL is the `+` sign. See [Wikipedia](https://en.wikipedia.org/wiki/Percent-encoding#The_application.2Fx-www-form-urlencoded_type) and the [HTML specification](http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1). As such `urllib.quote_plus()` should be used instead when encoding just one key or value, or use `urllib.urlencode()` when you have a dictionary or sequence of key-value pairs. – Martijn Pieters Sep 24 '15 at 13:14
  • @MartijnPieters Thanks for the comprehensive answer as always, Martijn. – Pyderman Sep 24 '15 at 13:26
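
As a rough illustration of the `urlencode()` suggestion in the comments, here is a minimal Python 3 sketch (in Python 3 the function lives in `urllib.parse`; the comments refer to the Python 2 name `urllib.urlencode()`). The parameter names are copied from the URL in the question; the variable names `base_url`, `params`, `query` and `final_url` are illustrative only.

```python
from urllib.parse import urlencode

# Base URL without the query string (parameter names taken from the question's URL).
base_url = 'http://www.ebay.com/sch/i.html'

# A sequence of key-value pairs keeps the parameter order explicit.
params = [
    ('_from', 'R40'),
    ('_sacat', 0),
    ('_nkw', 'demarini cf5 cf12'),
]

# urlencode() quotes each key and value (spaces become '+') and joins them with '&'.
query = urlencode(params)
final_url = base_url + '?' + query
print(final_url)
```

This avoids hand-assembling the `key=value&...` part and encodes every parameter consistently, not just the search term.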

1 Answer

16

Use the urllib library (Python 3):

import urllib.parse
finalUrl = baseUrl + urllib.parse.quote(searchTerm)

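You can use quote_plus() to encode spaces as + instead of %20. To undo the encoding, use unquote() (or unquote_plus() if the string was encoded with quote_plus(), since unquote() leaves + signs untouched):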

urllib.parse.unquote(str)

In Python 2, use urllib.quote() and urllib.unquote() respectively.
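
For comparison, a small Python 3 sketch of the two encodings and their round trips, using the search term from the question. The variable names (`search_term`, `encoded_pct`, `encoded_plus`, `base_url`) are illustrative and not part of any API.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

base_url = 'http://www.ebay.com/sch/i.html?_from=R40&_sacat=0&_nkw='
search_term = 'demarini cf5 cf12'

# quote() percent-encodes spaces as %20.
encoded_pct = quote(search_term)

# quote_plus() encodes spaces as '+', the form-style encoding for query strings.
encoded_plus = quote_plus(search_term)

# Each encoding is reversed by its matching decoder.
decoded_pct = unquote(encoded_pct)
decoded_plus = unquote_plus(encoded_plus)

# Note: plain unquote() does not turn '+' back into a space.
not_decoded = unquote(encoded_plus)

final_url = base_url + encoded_pct
print(encoded_pct, encoded_plus, final_url, sep='\n')
```

Either encoded form can be appended to the base URL; which one a given site expects in its query string is worth checking against a URL the site itself generates.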

Dima Lituiev
Jaysheel Utekar