In my Scrapy crawler, I am trying to get back the URL that was passed to the spider in start_urls.

The basic code looks like this:

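import scrapy


class BasicSpider(scrapy.Spider):
    name = 'basic'
    # Scrapy reads start_urls (not star_urls), and each URL must be a quoted string
    start_urls = ['https://abc/NachfÃ¼lltinte-Permanent']

    def parse(self, response):
        if response.status == 200:
            current_url_http_code = response.status
            # The request URL comes back with non-ASCII characters percent-encoded
            current_url = response.request.url
            print(current_url)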

The output of current_url is https://abc/Nachf%C3%83%C2%BClltinte-Permanent

Somehow I want to get both the original https://abc/NachfÃ¼lltinte-Permanent and the percent-encoded https://abc/Nachf%C3%83%C2%BClltinte-Permanent

1 Answer

What you need is urldecode/urlencode/quote/unquote. Overall, your question is answered here. There is still a headache with UTF-8 and Python 2 byte strings, though.
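
As a minimal sketch of the quote/unquote approach, assuming Python 3 (where both functions live in urllib.parse; on Python 2 they are urllib.quote and urllib.unquote and operate on byte strings), the percent-encoded URL from the question can be decoded and re-encoded as follows. The variable names are made up for illustration, and safe=':/' is my assumption about which characters should stay unescaped:

```python
from urllib.parse import quote, unquote

# Percent-encoded URL as printed by the spider in the question
encoded_url = 'https://abc/Nachf%C3%83%C2%BClltinte-Permanent'

# unquote() turns the %XX escapes back into characters (decoded as UTF-8 by default)
decoded_url = unquote(encoded_url)

# quote() percent-encodes again; ':' and '/' are kept so the scheme and path separators survive
re_encoded_url = quote(decoded_url, safe=':/')

print(decoded_url)
print(re_encoded_url)
```

Inside parse() the same idea would be unquote(response.request.url) for the readable form, while response.request.url itself already holds the encoded form.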

Michael Savchenko