5

I have a very strange problem with a simple HTTP GET request in Go.

Every request from Go to https://www.alltron.ch/json/searchSuggestion?searchTerm=notebook takes about 6-8 seconds (!)

If the same request is fired in Chrome, with Postman, or with PowerShell, it takes less than a second.

Does somebody have a clue why this happens?

My Code:

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    client := &http.Client{}

    req, _ := http.NewRequest("GET", "https://www.alltron.ch/json/searchSuggestion?searchTerm=notebook", nil)

    response, err := client.Do(req)
    if err != nil {
        log.Fatalf("Error on request. %v", err)
    }
    defer response.Body.Close()

    body, err := ioutil.ReadAll(response.Body)
    if err != nil {
        log.Fatalf("Couldn't get response body. %v", err)
    }

    fmt.Print(string(body))
}
pitw
    Your code does not show any time measurements. How did you determine that it takes that long? Did you try setting the appropriate headers (e.g. Accept, User-Agent, etc.)? – Volker Jan 25 '19 at 11:08
  • 2
    You can use the [httptrace package](https://golang.org/pkg/net/http/httptrace/) to figure out which part of the request is slow (see the sketch after these comments). – Peter Jan 25 '19 at 11:18
  • 1
    Your request from Chrome could be handled from a cache. Make sure you have [disabled cache](https://stackoverflow.com/questions/5690269/disabling-chrome-cache-for-website-development) before you make a comparison. – zdebra Jan 25 '19 at 11:21
  • 2
    Just to add some more detail: a trace in Wireshark clearly shows that the problem is the server answering only after some time. Also, the same problem happens with curl and Perl's LWP::Simple. Given that this site is behind the Akamai CDN and Akamai offers anti-bot measures, my guess is that this is an explicit slowdown done by the site when it detects a client which does not look like a typical browser. This is likely done as protection against automated scraping of information. – Steffen Ullrich Jan 25 '19 at 11:43
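
Following up on the httptrace suggestion above, here is a minimal sketch of how such a trace could look for the URL from the question. The hooks are from the standard net/http/httptrace package; the printed labels are only illustrative. If the comment about a server-side slowdown is right, only the gap before the first response byte should be large.

package main

import (
    "crypto/tls"
    "fmt"
    "net/http"
    "net/http/httptrace"
    "time"
)

func main() {
    req, err := http.NewRequest("GET", "https://www.alltron.ch/json/searchSuggestion?searchTerm=notebook", nil)
    if err != nil {
        panic(err)
    }

    start := time.Now()
    trace := &httptrace.ClientTrace{
        // Each hook prints how long after the start of the request it fired.
        DNSDone: func(httptrace.DNSDoneInfo) {
            fmt.Println("DNS done:", time.Since(start))
        },
        ConnectDone: func(network, addr string, err error) {
            fmt.Println("TCP connect done:", time.Since(start))
        },
        TLSHandshakeDone: func(tls.ConnectionState, error) {
            fmt.Println("TLS handshake done:", time.Since(start))
        },
        GotFirstResponseByte: func() {
            fmt.Println("first response byte:", time.Since(start))
        },
    }
    req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

    // RoundTrip sends the request without redirect handling; enough for timing.
    if _, err := http.DefaultTransport.RoundTrip(req); err != nil {
        panic(err)
    }
}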

2 Answers

10

The site you are trying to access is behind the Akamai CDN:

$ dig www.alltron.ch 
...
www.alltron.ch.         152     IN      CNAME   competec.botmanager.edgekey.net.
competec.botmanager.edgekey.net. 7052 IN CNAME  e9179.f.akamaiedge.net.
e9179.f.akamaiedge.net. 162     IN      A       2.20.176.40

Akamai offers its customers detection of web clients which are not browsers, so that customers can keep bots away or slow them down.

As can be seen from Strange CURL issue with a particular website SSL certificate and Scraping attempts getting 403 error, this kind of detection mainly cares about having an Accept-Language header, a Connection header with the value Keep-Alive, and a User-Agent which matches Mozilla/....

This means the following code changes result in an immediate response:

req, _ := http.NewRequest("GET", "https://www.alltron.ch/json/searchSuggestion?searchTerm=notebook", nil)
req.Header.Set("Connection","Keep-Alive")
req.Header.Set("Accept-Language","en-US")
req.Header.Set("User-Agent","Mozilla/5.0")

Still, the site obviously does not like bots; you should respect that and not stress the site too much (for example with heavy information scraping). Also, the bot detection done by Akamai might change without notice, i.e. even if this code fixes the problem now, it might no longer work in the future. Such changes are especially likely if many clients bypass the bot detection.
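
For completeness, here is the question's program with just these three headers added (error handling tightened, otherwise unchanged). As said above, Akamai's detection may change, so there is no guarantee this keeps working.

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    req, err := http.NewRequest("GET", "https://www.alltron.ch/json/searchSuggestion?searchTerm=notebook", nil)
    if err != nil {
        log.Fatalf("Error creating request. %v", err)
    }

    // Headers the bot detection appears to look for (see above).
    req.Header.Set("Connection", "Keep-Alive")
    req.Header.Set("Accept-Language", "en-US")
    req.Header.Set("User-Agent", "Mozilla/5.0")

    response, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatalf("Error on request. %v", err)
    }
    defer response.Body.Close()

    body, err := ioutil.ReadAll(response.Body)
    if err != nil {
        log.Fatalf("Couldn't get response body. %v", err)
    }
    fmt.Print(string(body))
}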

Steffen Ullrich
-2

Try disabling the cache in Chrome and compare with the Go request.

  • Thanks for the ideas. The solution is related to the suggestion from @Volker. The relevant header was Accept ==> text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 – pitw Jan 25 '19 at 15:09
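
To apply pitw's finding in the question's code, a single extra header line is enough; the value below is copied verbatim from the comment above, not verified independently.

req.Header.Set("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")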