
I am learning the Boost.Beast library. I send a request and receive this response:

HTTP/1.1 301 Moved Permanently
Cache-Control: public
Content-Type: text/html; charset=UTF-8
Location: https://www.example.com/target/xxx/

I then send a second request with this Location value set as a header field, but I receive a Bad Request response.

How can I follow the redirection? Is there an example?

This is my code:

boost::asio::io_service ios;
tcp::resolver resolver{ios};
tcp::socket socket{ios};
auto const lookup = resolver.resolve( tcp::resolver::query(host, port) );
boost::asio::connect(socket, lookup);

// Set up an HTTP GET request message
http::request<http::string_body> req{http::verb::get, target, 11};
req.set(http::field::host, host);
req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

// Send the HTTP request to the remote host
http::write(socket, req);

// This buffer is used for reading and must be persisted
boost::beast::flat_buffer buffer;

// Declare a container to hold the response
http::response<http::dynamic_body> res;

// Receive the HTTP response
http::read(socket, buffer, res);

if( res.base().result_int() == 301 ) {
   req.set(http::field::location, res.base()["Location"]);
   http::write(socket, req);
   boost::beast::flat_buffer buffer1;
   http::read(socket, buffer1, res);
}
std::cout << req << std::endl;
std::cout << res << std::endl;

Thanks

sehe
Juan Solo

1 Answer


When you redirect, you cannot just "replace" a location on the existing request. You cannot even use the same socket, except in the rare cases when the redirected target is on the same TCP endpoint.

Because the host name, protocol, and path may all have changed, you have to parse the Location value into its scheme, host, and path parts. Then you must resolve the host again and make sure the Host header carries the new host name.
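To illustrate the parsing step without an external dependency, here is a minimal sketch using only the standard library. `UrlParts` and `split_url` are hypothetical names, not part of Beast or network::uri, and the sketch assumes an absolute URL without userinfo or IPv6 literals; a real URI library handles far more cases.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type for illustration only.
struct UrlParts {
    std::string scheme;  // e.g. "http" or "https"
    std::string host;    // e.g. "www.example.com"
    std::string port;    // explicit port, or the scheme's default
    std::string target;  // path plus query; "/" when the URL has none
};

// Splits an absolute URL of the form scheme://host[:port][/path][?query][#fragment].
// Returns false for anything it does not understand (for example a relative Location).
bool split_url(std::string const& url, UrlParts& out) {
    auto const npos = std::string::npos;

    auto const scheme_end = url.find("://");
    if (scheme_end == npos || scheme_end == 0) return false;
    out.scheme = url.substr(0, scheme_end);

    auto const authority_begin = scheme_end + 3;
    auto const authority_end = url.find_first_of("/?#", authority_begin);
    std::string const authority = url.substr(
        authority_begin,
        authority_end == npos ? npos : authority_end - authority_begin);

    auto const colon = authority.find(':');
    out.host = authority.substr(0, colon);
    if (out.host.empty()) return false;

    // Fall back to the well-known port when none is given explicitly.
    std::string const explicit_port =
        colon == npos ? std::string() : authority.substr(colon + 1);
    out.port = !explicit_port.empty()
        ? explicit_port
        : std::string(out.scheme == "https" ? "443" : "80");

    // The request target is everything after the authority, minus the fragment.
    if (authority_end == npos) {
        out.target = "/";
    } else {
        auto const fragment = url.find('#', authority_end);
        out.target = url.substr(
            authority_end,
            fragment == npos ? npos : fragment - authority_end);
        if (out.target.empty() || out.target[0] != '/') out.target.insert(0, "/");
    }
    return true;
}
```

For the Location header in the question, this gives scheme `https`, host `www.example.com`, port `443`, and target `/target/xxx/`. The host and port go to the resolver, the host goes into the Host header, and the target becomes the request target.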

Here's a sample that requests the Boost License at the "wrong" URL http://boost.org/users/license.html, which promptly redirects to http://www.boost.org/users/license.html.

NOTE I've used network::uri to do the URI parsing for us: https://github.com/reBass/uri

Demo

#include <iostream>
#include <boost/beast.hpp>
#include <boost/beast/http.hpp>
#include <network/uri.hpp>
#include <boost/asio.hpp>
#include <string>

using boost::asio::ip::tcp;
namespace http = boost::beast::http;

struct Requester {
    void do_request(std::string const& url) {
        network::uri u{url};
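        // Resolve the host; the URI scheme ("http") doubles as the service name, which selects the port
        auto const lookup = resolver_.resolve( tcp::resolver::query(u.host().to_string(), u.scheme().to_string()) );

        // Open a fresh connection for every request, since a redirect may point at a different endpoint
        tcp::socket socket{ios};
        boost::asio::connect(socket, lookup);

        // Set up an HTTP GET request message
        http::request<http::string_body> req{http::verb::get, u.path().to_string(), 11};
        req.keep_alive(true);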

        req.set(http::field::host, u.host().to_string());
        req.set(http::field::user_agent, BOOST_BEAST_VERSION_STRING);

        std::cout << "Target: " << url << "\n";
        std::cout << req << "\n";

        http::write(socket, req);
        boost::beast::flat_buffer buffer;
        http::response<http::dynamic_body> res;
        http::read(socket, buffer, res);

        switch(res.base().result_int()) {
            case 301: 
                std::cout << "Redirecting.....\n";
                do_request(res.base()["Location"].to_string());
                break;
            case 200:
                std::cout << res << "\n";
                break;
            default:
                std::cout << "Unexpected HTTP status " << res.result_int() << "\n";
                break;
        }
    }
  private:
    boost::asio::io_service ios;
    tcp::resolver resolver_{ios};
};

int main() {
    try {
        Requester requester;
        requester.do_request("http://boost.org/users/license.html"); // redirects to http://www.boost.org/...
    } catch(std::exception const& e) {
        std::cerr << "Exception: " << e.what() << "\n";
    }
}

This prints:

Target: http://boost.org/users/license.html
GET /users/license.html HTTP/1.1
Host: boost.org
User-Agent: Boost.Beast/109


Redirecting.....
Target: http://www.boost.org/users/license.html
GET /users/license.html HTTP/1.1
Host: www.boost.org
User-Agent: Boost.Beast/109


HTTP/1.1 200 OK
Date: Sun, 27 Aug 2017 22:25:20 GMT
Server: Apache/2.2.15 (CentOS)
Accept-Ranges: bytes
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

90fd
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <title>Boost Software License</title>
  <meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
  <link rel="icon" href="/favicon.ico" type="image/ico" />
  <link rel="stylesheet" type="text/css" href="../style-v2/section-boost.css" />
  <!--[if IE 7]> <style type="text/css"> body { behavior: url(/style-v2/csshover3.htc); } </style> <![endif]-->
</head><!--
Note: Editing website content is documented at:
http://www.boost.org/development/website_updating.html
-->

<body>
    ENTIRE LICENSE BODY SNIPPED
</body>
</html>

0
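The demo above follows only status 301 and recurses without a limit, so two URLs that redirect to each other would never terminate. A minimal sketch of a guard is shown here; `is_redirect`, `should_follow`, and the cap of 10 hops are hypothetical choices for illustration, not something Beast provides.

```cpp
#include <cassert>

// Status codes that redirect via a Location header
// (301, 302, 303, 307 from RFC 7231 section 6.4; 308 from RFC 7538).
bool is_redirect(unsigned status) {
    return status == 301 || status == 302 || status == 303 ||
           status == 307 || status == 308;
}

// Hypothetical cap on the number of hops; the value is an arbitrary assumption.
constexpr int max_redirects = 10;

// Decides whether the client should issue another request for this response.
bool should_follow(unsigned status, int hops_so_far) {
    return is_redirect(status) && hops_so_far < max_redirects;
}
```

In the demo, `do_request` could take a hop counter as a second parameter and call `should_follow(res.result_int(), hops)` instead of matching on 301 alone.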
sehe
  • I've left using an SSL socket for HTTPS as an exercise for the reader, as the conceptual problems with what constitutes an HTTP redirect seem to take precedence. – sehe Aug 27 '17 at 22:30
  • In my case, I have to use an SSL socket because the resource is available over HTTPS. I understand that it only works if the web server accepts HTTP and HTTPS requests on the same port. – Juan Solo Aug 28 '17 at 08:28
  • Of course not. Web servers never do. Typically they run on port 80 vs. 443. That's why the resolve step uses the URI scheme. What's left is to start an SSL socket. The samples should help you get started. – sehe Aug 28 '17 at 08:44
  • Sorry, I did not explain myself well. If the endpoint is the same for HTTP and HTTPS, I can make the request for the same connection, right? Because the resource is available using both protocols. – Juan Solo Aug 28 '17 at 09:34
  • I didn't answer clearly enough then. _"**Q.** If the endpoint is the same for HTTP and HTTPS"_ _"**A.** That's impossible"_. HTTPS connections _start_ with an SSL handshake, which means HTTP is IMPOSSIBLE at that end point. End of story. You cannot reuse the same connection on any standard webserver. Usually, at least the port is different, like I said. – sehe Aug 28 '17 at 09:38
  • Okay, I understand that now. I misread an example (advanced server flex) included in the library and thought it was possible. – Juan Solo Aug 28 '17 at 10:16
  • Actually, it is completely possible to have both HTTP and HTTP/S on the same port. A couple of the Beast examples show how this is possible, see the "HTTP, flex" server and the "Advanced, flex" server here: http://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/examples.html This is accomplished using the "SSL detector" operation, which is described in the documentation: http://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/using_io/example_detect_ssl.html – Vinnie Falco Aug 28 '17 at 13:39
  • @VinnieFalco wow. Another thing learned. I'll look into it. Regardless, OP doesn't appear to be implementing server side. Also they will still have to account for SSL handshake+stream, am I right? – sehe Aug 28 '17 at 14:19
  • @VinnieFalco May I suggest that you create a boost-beast tag? – Jean Davy Aug 29 '17 at 08:46
  • @sehe Yes, if the program is redirected from a plain port to an SSL port then it will be required to run a different piece of code which uses `asio::ssl::stream` and performs the SSL handshake before making the HTTP request. – Vinnie Falco Sep 03 '17 at 14:31
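The comments above come down to two decisions a client makes on every hop: whether the next request needs TLS, and whether the existing connection can be reused. A minimal standard-library sketch of those decisions follows; `Origin`, `needs_tls`, and `can_reuse_connection` are hypothetical names, and the comparison is deliberately simplified (exact string match, no case folding of host names).

```cpp
#include <cassert>
#include <string>

// Hypothetical description of where a request goes.
struct Origin {
    std::string scheme;  // "http" or "https"
    std::string host;
    std::string port;
};

// An https target needs a TLS handshake before any HTTP bytes are sent.
// With Beast, that branch would wrap the socket in
// boost::asio::ssl::stream<tcp::socket> and call
// handshake(boost::asio::ssl::stream_base::client) before http::write.
bool needs_tls(Origin const& o) {
    return o.scheme == "https";
}

// A connection is only worth reusing when scheme, host, and port all match.
// A redirect from http to https therefore always means a new connection.
bool can_reuse_connection(Origin const& current, Origin const& next) {
    return current.scheme == next.scheme &&
           current.host == next.host &&
           current.port == next.port;
}
```

For the redirect in the question, the target scheme is `https`, so the follow-up request needs a new connection on port 443 with a completed handshake, regardless of what the first connection used.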