diff --git a/doc/0_main.qbk b/doc/0_main.qbk index 9d03c9a1..f62407cb 100644 --- a/doc/0_main.qbk +++ b/doc/0_main.qbk @@ -57,9 +57,11 @@ [def __basic_fields__ [link beast.ref.http__basic_fields `basic_fields`]] [def __basic_multi_buffer__ [link beast.ref.basic_multi_buffer `basic_multi_buffer`]] [def __basic_parser__ [link beast.ref.http__basic_parser `basic_parser`]] +[def __buffer_body__ [link beast.ref.http__buffer_body `buffer_body`]] [def __fields__ [link beast.ref.http__fields `fields`]] [def __flat_buffer__ [link beast.ref.flat_buffer `flat_buffer`]] [def __header__ [link beast.ref.http__header `header`]] +[def __header_parser__ [link beast.ref.http__header_parser `header_parser`]] [def __message__ [link beast.ref.http__message `message`]] [def __message_parser__ [link beast.ref.http__message_parser `message_parser`]] [def __multi_buffer__ [link beast.ref.multi_buffer `multi_buffer`]] @@ -85,11 +87,6 @@ asynchronous model of __Asio__. ][ How to use the basic algorithms in your applications. ]] - [[ - [link beast.adv_http Advanced HTTP] - ][ - A discussion of the advanced interfaces. - ]] [[ [link beast.websocket Using WebSocket] ][ @@ -121,10 +118,9 @@ asynchronous model of __Asio__. 
[include 1_overview.qbk] [include 2_core.qbk] [include 3_0_http.qbk] -[include 4_0_adv_http.qbk] -[include 5_websocket.qbk] -[include 6_examples.qbk] -[include 7_0_design.qbk] +[include 4_websocket.qbk] +[include 5_examples.qbk] +[include 6_0_design.qbk] [section:ref Reference] [xinclude quickref.xml] diff --git a/doc/3_0_http.qbk b/doc/3_0_http.qbk index 635cb467..687476a1 100644 --- a/doc/3_0_http.qbk +++ b/doc/3_0_http.qbk @@ -11,13 +11,19 @@ HTTP Primer Message Containers - Stream Operations + Message Stream Operations + Serializer Stream Operations + Parser Stream Operations + Buffer-Oriented Serializing + Buffer-Oriented Parsing + Custom Parsers + Custom Body Types '''] -This library offers programmers simple and performant models of HTTP -messages and their associated operations including synchronous and -asynchronous parsing and serialization of messages in the HTTP/1 wire +This library offers programmers simple and performant models of HTTP messages +and their associated operations including synchronous, asynchronous, and +buffer-oriented parsing and serialization of messages in the HTTP/1 wire format using __Asio__. Specifically, the library provides: [variablelist @@ -32,10 +38,12 @@ format using __Asio__. Specifically, the library provides: [ The functions [link beast.ref.http__read `read`], + [link beast.ref.http__read_header `read_header`], [link beast.ref.http__read_some `read_some`], - [link beast.ref.http__async_read `async_read`], and + [link beast.ref.http__async_read `async_read`], + [link beast.ref.http__async_read_header `async_read_header`], and [link beast.ref.http__async_read_some `async_read_some`] - read a __message__ from a + read HTTP/1 message data from a [link beast.ref.streams stream]. ] ][ @@ -43,10 +51,12 @@ format using __Asio__. 
Specifically, the library provides: [ The functions [link beast.ref.http__write `write`], + [link beast.ref.http__write_header `write_header`], [link beast.ref.http__write_some `write_some`], - [link beast.ref.http__async_write `async_write`], and + [link beast.ref.http__async_write `async_write`], + [link beast.ref.http__async_write_header `async_write_header`], and [link beast.ref.http__async_write_some `async_write_some`] - write a __message__ to a + write HTTP/1 message data to a [link beast.ref.streams stream]. ] ][ @@ -79,5 +89,11 @@ format using __Asio__. Specifically, the library provides: [include 3_1_primer.qbk] [include 3_2_message.qbk] [include 3_3_streams.qbk] +[include 3_4_serializer_streams.qbk] +[include 3_5_parser_streams.qbk] +[include 3_6_serializer_buffers.qbk] +[include 3_7_parser_buffers.qbk] +[include 3_8_custom_parsers.qbk] +[include 3_9_custom_body.qbk] [endsect] diff --git a/doc/3_2_message.qbk b/doc/3_2_message.qbk index 753d9a13..0005d07d 100644 --- a/doc/3_2_message.qbk +++ b/doc/3_2_message.qbk @@ -26,18 +26,11 @@ accept any message, or can use partial specialization to accept just requests or responses. The default __fields__ is a provided associative container using the standard allocator and supporting modification and inspection of fields. As per __rfc7230__, a non-case-sensitive comparison -is used for field names. User defined types for fields are possible. This -is discussed in -[link beast.adv_http.fields Advanced Fields]. +is used for field names. User defined types for fields are possible. The `Body` type determines the type of the container used to represent the body as well as the algorithms for transferring buffers to and from the -the container. The library comes with a collection of common body types, -described in -[link beast.http.message.body Body Types]. -As with fields, user defined body types are possible. This is described in -[link beast.adv_http.body Advanced Body]. - - +the container. 
The library comes with a collection of common body types. +As with fields, user defined body types are possible. Sometimes it is desired to only work with a header. Beast provides a single class template __header__ to model HTTP/1 and HTTP/2 headers: @@ -88,9 +81,10 @@ the __Body__ requirements: [[ [link beast.ref.http__buffer_body `buffer_body`] ][ - A body with `value_type` holding a __ConstBufferSequence__ holding - caller provided buffers which must be updated during incremental - serialization. Messages with this body type only support serialization. + A body whose `value_type` holds a raw pointer and size to a + caller-provided buffer. This allows for serialization of body data + coming from external sources, and incremental parsing of message + body content using a fixed size buffer. ]] [[ [link beast.ref.http__dynamic_body `dynamic_body`] diff --git a/doc/3_3_streams.qbk b/doc/3_3_streams.qbk index 92c3ed55..1d3b49f0 100644 --- a/doc/3_3_streams.qbk +++ b/doc/3_3_streams.qbk @@ -5,7 +5,7 @@ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ] -[section:streams Stream Operations] +[section:streams Message Stream Operations] Beast provides synchronous and asynchronous algorithms to serialize and parse HTTP/1 wire format messages on streams. These functions form the @@ -19,16 +19,16 @@ requiring no separately managed state objects: ][ Parse a __message__ from a __SyncReadStream__. ]] -[[ - [link beast.ref.http__write.overload1 [*write]] -][ - Serialize a __message__ to a __SyncWriteStream__. -]] [[ [link beast.ref.http__async_read.overload2 [*async_read]] ][ Parse a __message__ from an __AsyncReadStream__. ]] +[[ + [link beast.ref.http__write.overload1 [*write]] +][ + Serialize a __message__ to a __SyncWriteStream__. +]] [[ [link beast.ref.http__async_write [*async_write]] ][ @@ -36,11 +36,54 @@ requiring no separately managed state objects: ]] ] -Synchronous stream operations come in two varieties. 
One which throws +All synchronous stream operations come in two varieties. One which throws an exception upon error, and another which accepts as the last parameter an argument of type [link beast.ref.error_code `error_code&`]. If an error occurs this argument will be set to contain the error code. +[heading Reading] + +Because a serialized header is not length-prefixed, algorithms which parse +messages from a stream may read past the end of a message for efficiency. +To hold this surplus data, all stream read operations use a passed-in +__DynamicBuffer__ which persists between calls. Each read operation may +consume bytes remaining in the buffer, and leave behind new bytes. In this +example we declare the buffer and a message variable, then read a complete +HTTP request synchronously: +``` + flat_buffer buffer; // (The parser is optimized for flat buffers) + request req; + read(sock, buffer, req); +``` + +In this example we used the __flat_buffer__. The parser in Beast is +optimized for structured HTTP data located in a single contiguous memory +buffer ("flat buffer"). Any dynamic buffer will work with reads. However, +when not using a flat buffer the implementation may perform an additional +memory allocation to restructure the input into a single buffer. + +[tip + User-defined implementations of __DynamicBuffer__ may avoid additional + parser memory allocation, if those implementations guarantee that + returned buffer sequences will always have length one. +] + +Messages may also be read asynchronously. When performing asynchronous +stream read operations, the buffer and message variables must remain +valid until the operation has completed. Beast asynchronous initiation +functions use Asio's completion handler model. Here we read a message +asynchronously. 
When the operation completes, the message from the error +code indicating the result is printed: +``` + flat_buffer buffer; + response res; + async_read(sock, buffer, res, + [&](error_code ec) + { + std::cerr << ec.message() << std::endl; + }); +``` + [heading Writing] A set of free functions allow serialization of an entire HTTP message to @@ -95,46 +138,4 @@ void send_async(response const& res) } ``` -[heading Reading] - -Because a serialized header is not length-prefixed, algorithms which parse -messages from a stream may read past the end of a message for efficiency. -To hold this surplus data, all stream read operations use a passed-in -__DynamicBuffer__. Each read operation may consume bytes remaining in the -buffer, and leave behind new bytes. In this example we declare the buffer -and a message variable, then read a complete HTTP request synchronously: -``` - flat_buffer buffer; // (The parser is optimized for flat buffers) - request req; - read(sock, buffer, req); -``` - -In this example we used the __flat_buffer__. The parser in Beast is -optimized for structured HTTP data located in a single contiguous memory -buffer ("flat buffer"). Any dynamic buffer will work with reads. However, -when not using a flat buffer the implementation may perform an additional -memory allocation to restructure the input into a single buffer. - -[tip - User-defined implementations of __DynamicBuffer__ may avoid additional - parser memory allocation, if those implementations guarantee that - returned buffer sequences will always have length one. -] - -Messages may also be read asynchronously. When performing asynchronous -stream read operations, the buffer and message variables must remain -valid until the operation has completed. Beast asynchronous initiation -functions use Asio's completion handler model. Here we read a message -asynchronously. 
When the operation completes the message in the error -code indicating the result is printed: -``` - flat_buffer buffer; - response res; - async_read(sock, buffer, - [&](error_code ec) - { - std::cerr << ec.message() << std::endl; - }); -``` - [endsect] diff --git a/doc/3_4_serializer_streams.qbk b/doc/3_4_serializer_streams.qbk new file mode 100644 index 00000000..e94d2f3d --- /dev/null +++ b/doc/3_4_serializer_streams.qbk @@ -0,0 +1,306 @@ +[/ + Copyright (c) 2013-2017 Vinnie Falco (vinnie dot falco at gmail dot com) + + Distributed under the Boost Software License, Version 1.0. (See accompanying + file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +] + +[section:serializer_streams Serializer Stream Operations] + +Algorithms for sending entire messages to streams are intended for light +duty use-cases such as simple clients and low utilization servers. +Sophisticated algorithms will need to do more: + +* Send the message header first. + +* Send a message incrementally: bounded work in each I/O cycle. + +* Use a custom chunk decorator or allocator when sending messages. + +* Use a series of caller-provided buffers to represent the body. + +All of these operations require callers to manage the lifetime of state +information associated with the operation, by constructing a __serializer__ +object with the message to be sent. The serializer type has this declaration: +``` +template< + bool isRequest, + class Body, + class Fields, + class Decorator = empty_decorator, + class Allocator = std::allocator +> +class serializer; +``` + +The choices for template types must match the message passed on construction. +This code creates an HTTP response and the corresponding serializer: +``` + response res; + ... + serializer sr{res}; +``` +The convenience function +[link beast.ref.http__make_serializer `make_serializer`] +is provided to avoid repetition of template argument types. The declaration +for `sr` in the code above may be written as: +``` + ... 
+ auto sr = make_serializer(res); +``` + +The stream operations which work on serializers are: + +[table Serializer Stream Operations +[[Name][Description]] +[[ + [link beast.ref.http__write_some.overload1 [*write_some]] +][ + Send some __serializer__ buffer data to a __SyncWriteStream__. +]] +[[ + [link beast.ref.http__async_write_some [*async_write_some]] +][ + Send some __serializer__ buffer data asynchronously to an __AsyncWriteStream__. +]] +[[ + [link beast.ref.http__write_header.overload1 [*write_header]] +][ + Send only the header from a __serializer__ to a __SyncWriteStream__. +]] +[[ + [link beast.ref.http__async_write_header [*async_write_header]] +][ + Send only the header from a __serializer__ asynchronously to an __AsyncWriteStream__. +]] +[[ + [link beast.ref.http__write.overload1 [*write]] +][ + Send everything in a __serializer__ to a __SyncWriteStream__. +]] +[[ + [link beast.ref.http__async_write.overload1 [*async_write]] +][ + Send everything in a __serializer__ asynchronously to an __AsyncWriteStream__. +]] +] + +Here is an example of using a serializer to send a message on a stream +synchronously. This performs the same operation as calling `write(stream, m)`: + +``` +template +void send(SyncWriteStream& stream, message const& m) +{ + static_assert(is_sync_write_stream::value, + "SyncWriteStream requirements not met"); + serializer sr{m}; + do + { + write_some(stream, sr); + } + while(! sr.is_done()); +} +``` + +[heading Example: Expect 100-continue] + +The Expect field with the value "100-continue" in a request is special. It +indicates that after sending the message header, a client desires an +immediate informational response before sending the message body, which +presumably may be large or expensive to compute. This behavior is described in +[@https://tools.ietf.org/html/rfc7231#section-5.1.1 rfc7231 section 5.1.1]. 
+Invoking the 100-continue behavior is implemented easily in a client by +constructing a __serializer__ to send the header first, then receiving +the server response, and finally conditionally send the body using the same +serializer instance. A synchronous, simplified version (no timeout) of +this client action looks like this: +``` +/** Send a request with Expect: 100-continue + + This function will send a request with the Expect: 100-continue + field by first sending the header, then waiting for a successful + response from the server before continuing to send the body. If + a non-successful server response is received, the function + returns immediately. + + @param stream The remote HTTP server stream. + + @param buffer The buffer used for reading. + + @param req The request to send. This function modifies the object: + the Expect header field is inserted into the message if it does + not already exist, and set to "100-continue". + + @param ec Set to the error, if any occurred. +*/ +template< + class SyncStream, + class DynamicBuffer, + class Body, class Fields> +void +send_expect_100_continue( + SyncStream& stream, + DynamicBuffer& buffer, + request& req) +{ + static_assert(is_sync_stream::value, + "SyncStream requirements not met"); + + static_assert(is_dynamic_buffer::value, + "DynamicBuffer requirements not met"); + + // Insert or replace the Expect field + req.fields.replace("Expect", "100-continue"); + + // Create the serializer + auto sr = make_serializer(req); + + // Send just the header + write_header(stream, sr); + + // Read the response from the server. + // A robust client could set a timeout here. + { + response res; + read(stream, buffer, res); + if(res.status != 100) + { + // The server indicated that it will not + // accept the request, so skip sending the body. 
+ return; + } + } + + // Server is OK with the request, send the body + write(stream, sr); +} +``` + +[heading Example: Using Manual Buffers] + +Sometimes it is necessary to send a message whose body is not conveniently +described by a single container. For example, when implementing an HTTP relay +function a robust implementation needs to present body buffers individually +as they become available from the downstream host. These buffers should be +fixed in size, otherwise creating the unnecessary and inefficient burden of +reading the complete message body before forwarding it to the upstream host. + +To enable these use-cases, the body type __buffer_body__ is provided. This +body uses a caller-provided pointer and size instead of an owned container. +To use this body, instantiate an instance of the serializer and fill in +the pointer and size fields before calling a stream write function. + +This example reads from a child process and sends the output back in an +HTTP response. The output of the process is sent as it becomes available: +``` +/** Send the output of a child process as an HTTP response. + + The output of the child process comes from a @b SyncReadStream. Data + will be sent continuously as it is produced, without the requirement + that the entire process output is buffered before being sent. The + response will use the chunked transfer encoding. + + @param input A stream to read the child process output from. + + @param output A stream to write the HTTP response to. + + @param ec Set to the error, if any occurred. +*/ +template< + class SyncReadStream, + class SyncWriteStream> +void +send_cgi_response( + SyncReadStream& input, + SyncWriteStream& output, + error_code& ec) +{ + static_assert(is_sync_read_stream::value, + "SyncReadStream requirements not met"); + + static_assert(is_sync_write_stream::value, + "SyncWriteStream requirements not met"); + + using boost::asio::buffer_cast; + using boost::asio::buffer_size; + + // Set up the response. 
We use the buffer_body type, + // allowing serialization to use manually provided buffers. + message res; + + res.status = 200; + res.version = 11; + res.fields.insert("Server", "Beast"); + res.fields.insert("Transfer-Encoding", "chunked"); + + // No data yet, but we set more = true to indicate + // that it might be coming later. Otherwise + // serializer::is_done would return true right after + // sending the header. + res.body.data = nullptr; + res.body.more = true; + + // Create the serializer. We set the split option to + // produce the header immediately without also trying + // to acquire buffers from the body (which would return + // the error http::need_buffer because we set `data` + // to `nullptr` above). + auto sr = make_serializer(res); + + // Send the header immediately. + write_header(output, sr, ec); + if(ec) + return; + + // Alternate between reading from the child process + // and sending all the process output until there + // is no more output. + do + { + // Read a buffer from the child process + char buffer[2048]; + auto bytes_transferred = input.read_some( + boost::asio::buffer(buffer, sizeof(buffer)), ec); + if(ec == boost::asio::error::eof) + { + ec = {}; + + // `nullptr` indicates there is no buffer + res.body.data = nullptr; + + // `false` means no more data is coming + res.body.more = false; + } + else + { + if(ec) + return; + + // Point to our buffer with the bytes that + // we received, and indicate that there may + // be some more data coming + res.body.data = buffer; + res.body.size = bytes_transferred; + res.body.more = true; + } + + // Write everything in the body buffer + write(output, sr, ec); + + // This error is returned by buffer_body during + // serialization when it is done sending the data + // provided and needs another buffer. + if(ec == error::need_buffer) + { + ec = {}; + continue; + } + if(ec) + return; + } + while(! 
sr.is_done()); +} +``` + +[endsect] diff --git a/doc/3_5_parser_streams.qbk b/doc/3_5_parser_streams.qbk new file mode 100644 index 00000000..9966de4a --- /dev/null +++ b/doc/3_5_parser_streams.qbk @@ -0,0 +1,332 @@ +[/ + Copyright (c) 2013-2017 Vinnie Falco (vinnie dot falco at gmail dot com) + + Distributed under the Boost Software License, Version 1.0. (See accompanying + file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +] + +[section:parser_streams Parser Stream Operations] + +Algorithms for receiving entire messages from streams are helpful for simple +use-cases. Sophisticated algorithms will need to do more: + +* Receive the message header first. + +* Receive a message incrementally: bounded work in each I/O cycle. + +* Receive an arbitrarily-sized body using a fixed-size buffer. + +* Defer the commitment to a __Body__ type until after reading the header. + +All of these operations require callers to manage the lifetime of state +information associated with the operation, by constructing a class derived +from __basic_parser__. Beast comes with two instances of parsers, and user +defined types deriving from the basic parser are possible: + +[table Parser Implementations +[[Name][Description]] +[[ + __message_parser__ +][ + ``` + /// A parser for a message + template< + bool isRequest, // `true` to parse an HTTP request + class Body, // The Body type for the resulting message + class Fields> // The type of container representing the fields + class message_parser + : public basic_parser<...>; + ``` +]] +[[ + __header_parser__ +][ + ``` + /// A parser for a header + template< + bool isRequest, // `true` to parse an HTTP request + class Fields> // The type of container representing the fields + class header_parser + : public basic_parser<...>; + ``` +]] +] + +[note + The __basic_parser__ and classes derived from it handle octet streams + serialized in the HTTP/1 format described in __rfc7230__. 
+] + +The stream operations which work on parsers are: + +[table Parser Stream Operations +[[Name][Description]] +[[ + [link beast.ref.http__read_some.overload1 [*read_some]] +][ + Read some octets into a parser from a __SyncReadStream__. +]] +[[ + [link beast.ref.http__async_read_some [*async_read_some]] +][ + Read some octets into a parser asynchronously from an __AsyncReadStream__. +]] +[[ + [link beast.ref.http__read_header.overload1 [*read_header]] +][ + Read only the header octets into a parser from a __SyncReadStream__. +]] +[[ + [link beast.ref.http__async_read_header [*async_read_header]] +][ + Read only the header octets into a parser asynchronously from an __AsyncReadStream__. +]] +[[ + [link beast.ref.http__read.overload1 [*read]] +][ + Read everything into a parser from a __SyncReadStream__. +]] +[[ + [link beast.ref.http__async_read.overload1 [*async_read]] +][ + Read everything into a parser asynchronously from an __AsyncReadStream__. +]] +] + +As with the stream parse algorithms which operate on entire messages, stream +operations for parsers require a passed-in __DynamicBuffer__ which persists +between calls to hold unused octets from the stream. The basic parser +implementation is optimized for the case where this dynamic buffer stores +its input sequence in a single contiguous memory buffer. It is advised to +use an instance of __flat_buffer__ for this purpose, although a user defined +instance of __DynamicBuffer__ which produces input sequences of length one +is also suitable. + +The provided parsers use a "captive object" model, acting as a container for +the __message__ or __header__ produced as a result of parsing. The caller +accesses the contained object, and depending on the types used to instantiate +the parser, it may be possible to acquire ownership of the header or message +captive object and destroy the parser. 
In this example we read an HTTP +response with a string body using a parser, then print the response: +``` +template +void print_response(SyncReadStream& stream) +{ + static_assert(is_sync_read_stream::value, + "SyncReadStream requirements not met"); + + // Declare a parser for an HTTP response + response_parser parser; + + // Read the entire message + read(stream, parser); + + // Now print the message + std::cout << parser.get() << std::endl; +} +``` + +[heading Example: 100-continue] + +The Expect field with the value "100-continue" in a request is special. It +indicates that after sending the message header, a client desires an +immediate informational response before sending the message body, which +presumably may be large or expensive to compute. This behavior is described in +[@https://tools.ietf.org/html/rfc7231#section-5.1.1 rfc7231 section 5.1.1]. +Handling the Expect field can be implemented easily in a server by constructing +a __message_parser__ to read the header first, then sending an informational +HTTP response, and finally reading the body using the same parser instance. A +synchronous version of this server action looks like this: +``` +/** Receive a request, handling Expect: 100-continue if present. + + This function will read a request from the specified stream. + If the request contains the Expect: 100-continue field, a + status response will be delivered. + + @param stream The remote HTTP client stream. + + @param buffer The buffer used for reading. + + @param ec Set to the error, if any occurred. 
+*/ +template< + class SyncStream, + class DynamicBuffer> +void +receive_expect_100_continue( + SyncStream& stream, + DynamicBuffer& buffer, + error_code& ec) +{ + static_assert(is_sync_stream::value, + "SyncStream requirements not met"); + + static_assert(is_dynamic_buffer::value, + "DynamicBuffer requirements not met"); + + // Declare a parser for a request with a string body + request_parser parser; + + // Read the header + read_header(stream, buffer, parser, ec); + if(ec) + return; + + // Check for the Expect field value + if(parser.get().fields["Expect"] == "100-continue") + { + // send 100 response + response res; + res.version = 11; + res.status = 100; + res.reason("Continue"); + res.fields.insert("Server", "test"); + write(stream, res, ec); + if(ec) + return; + } + + // Read the rest of the message. + // + // We use parser.base() to return a basic_parser&, to avoid an + // ambiguous function error (from boost::asio::read). Another + // solution is to qualify the call, e.g. `beast::http::read` + // + read(stream, buffer, parser.base(), ec); +} +``` + +[heading Example: HTTP Relay] + +An HTTP proxy acts as a relay between client and server. The proxy reads a +request from the client and sends it to the server, possibly adjusting some +of the headers and representation of the body along the way. Then, the +proxy reads a response from the server and sends it back to the client, +also with the possibility of changing the headers and body representation. + +The example that follows implements a synchronous HTTP relay. It uses a +fixed size buffer, to avoid reading in the entire body so that the upstream +connection sees a header without unnecessary latency. This example brings +together all of the concepts discussed so far, it uses both a __serializer__ +and a __message_parser__ to achieve its goal: +``` +/** Relay an HTTP message. 
+ + This function efficiently relays an HTTP message from a downstream + client to an upstream server, or from an upstream server to a + downstream client. After the message header is read from the input, + a user provided transformation function is invoked which may change + the contents of the header before forwarding to the output. This may + be used to adjust fields such as Server, or proxy fields. + + @param output The stream to write to. + + @param input The stream to read from. + + @param buffer The buffer to use for the input. + + @param transform The header transformation to apply. The function will + be called with this signature: + @code + void transform( + header&, // The header to transform + error_code&); // Set to the error, if any + @endcode + + @param ec Set to the error, if any occurred. + + @tparam isRequest `true` to relay a request. + + @tparam Fields The type of fields to use for the message. +*/ +template< + bool isRequest, + class Fields = fields, + class SyncWriteStream, + class SyncReadStream, + class DynamicBuffer, + class Transform> +void +relay( + SyncWriteStream& output, + SyncReadStream& input, + DynamicBuffer& buffer, + error_code& ec, + Transform&& transform) +{ + static_assert(is_sync_write_stream::value, + "SyncWriteStream requirements not met"); + + static_assert(is_sync_read_stream::value, + "SyncReadStream requirements not met"); + + // A small buffer for relaying the body piece by piece + char buf[2048]; + + // Create a parser with a buffer body to read from the input. + message_parser p; + + // Create a serializer from the message contained in the parser. + serializer sr{p.get()}; + + // Read just the header from the input + read_header(input, buffer, p, ec); + if(ec) + return; + + // Apply the caller's header transformation + // base() returns a reference to the header portion of the message. 
+ transform(p.get().base(), ec); + if(ec) + return; + + // Send the transformed message to the output + write_header(output, sr, ec); + if(ec) + return; + + // Loop over the input and transfer it to the output + do + { + if(! p.is_done()) + { + // Set up the body for writing into our small buffer + p.get().body.data = buf; + p.get().body.size = sizeof(buf); + + // Read as much as we can + read(input, buffer, p, ec); + + // This error is returned when buffer_body uses up the buffer + if(ec == error::need_buffer) + ec = {}; + if(ec) + return; + + // Set up the body for reading. + // This is how much was parsed: + p.get().body.size = sizeof(buf) - p.get().body.size; + p.get().body.data = buf; + p.get().body.more = ! p.is_done(); + } + else + { + p.get().body.data = nullptr; + p.get().body.size = 0; + } + + // Write everything in the buffer (which might be empty) + write(output, sr, ec); + + // This error is returned when buffer_body uses up the buffer + if(ec == error::need_buffer) + ec = {}; + if(ec) + return; + } + while(! p.is_done() && ! sr.is_done()); +} +``` + +[endsect] diff --git a/doc/4_0_adv_http.qbk b/doc/3_6_serializer_buffers.qbk similarity index 55% rename from doc/4_0_adv_http.qbk rename to doc/3_6_serializer_buffers.qbk index 1541b4ab..bcccd3c8 100644 --- a/doc/4_0_adv_http.qbk +++ b/doc/3_6_serializer_buffers.qbk @@ -5,58 +5,20 @@ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ] -[section:adv_http Advanced HTTP] +[section:serializer_buffers Buffer-Oriented Serializing] -[block ''' - - Serialization - Parsing - Field Containers - Body Types - -'''] +In extreme cases, users may wish to create an instance of __serializer__ +and invoke its methods directly instead of using the provided stream +algorithms. This could be useful for implementing algorithms on streams +whose asynchronous interface does not conform to __AsyncStream__. For +example, a +[@https://github.com/libuv/libuv *libuv* socket]. 
-The basic interfaces for reading and writing complete messages are -simple to use and convenient, but may not serve the needs of advanced -use cases, including: +The serializer interface is interactive; the caller invokes it repeatedly to +produce buffers until all of the buffers have been generated. Then the +serializer is destroyed. -* Parsing a message from caller-provided buffers - -* Serializing a message to caller-provided buffers - -* Reading a message from a stream incrementally - -* Writing a message to a stream incrementally - -In some cases, users may wish to provide their own implementation for -the fields container and the body type. In the advanced sections we -discuss the buffer oriented message algorithms, incremental stream -algorithms, and customization points for messages. - - - -[section:serialize Serialization (Advanced HTTP)] - -Beast uses the __serializer__ class internally to generate a sequence of -buffers corresponding to the serialized representation of a __message__. -The basic algorithms send the entire set of buffers at once while the -incremental algorithms allow the caller to write a bounded amount to -the stream at each iteration. In between calls to generate buffers or -write buffers to the stream, the internal state of the serialization -must be saved. A __serializer__ initializes this internal state and -stores it in between calls to produce buffers, until all the message -buffers have been produced. Afterwards, the state object may be destroyed. - -The serializer is a class template constructed from an existing -message. The template types used to instantiate the serializer must -match the types in the message. 
Here we declare a request and construct -an accompanying serializer: -``` -request req; -serializer sr{req}; -``` - -The buffers are produced by first calling +After the serializer is created, the buffers are produced by first calling [link beast.ref.http__serializer.get `serializer::get`] to obtain a buffer sequence, and then calling [link beast.ref.http__serializer.consume `serializer::consume`] @@ -128,56 +90,6 @@ void print(message const& m) } ``` -[heading Stream Operations] - -When working with streams, the use of an explicitly declared serializer -allows control over the amount of network activity performed in each -system call. This allows for better application level flow control and -predictable timeouts. These advanced stream write operations work on -an object of type __serializer__ which has already been constructed, -and which must remain valid until the operation is complete: - -[table Advanced Streaming -[[Name][Description]] -[[ - [link beast.ref.http__write_some.overload1 [*write_some]] -][ - Send __serializer__ buffer data to a __SyncWriteStream__. -]] -[[ - [link beast.ref.http__write_header.overload1 [*write_header]] -][ - Send an entire header from a __serializer__ to a __SyncWriteStream__. -]] -[[ - [link beast.ref.http__async_write_some [*async_write_some]] -][ - Send some __serializer__ buffer data to an __AsyncWriteStream__. -]] -[[ - [link beast.ref.http__async_write_header [*async_write_header]] -][ - Send an entire header from a __serializer__ to a __AsyncWriteStream__. -]] -] - -Here is an example which synchronously sends a message on a stream using -a serializer: -``` -template -void send(SyncWriteStream& stream, message const& m) -{ - static_assert(is_sync_write_stream::value, - "SyncWriteStream requirements not met"); - serializer sr{m}; - do - { - write_some(stream, sr); - } - while(! 
sr.is_done()); -} -``` - [heading Split Serialization] In some cases, such as the handling of the @@ -286,64 +198,3 @@ struct decorator ``` [endsect] - - - -[section:parsing Parsing (Advanced HTTP)] - -[endsect] - - - -[section:fields Field Containers (Advanced HTTP)] - -[endsect] - - - -[section:body Body Types (Advanced HTTP)] - -User-defined types are possible for the message body, where the type meets the -[link beast.ref.Body [*`Body`]] requirements. This simplified class declaration -shows the customization points available to user-defined body types: - -[$images/body.png [width 525px] [height 190px]] - -The meanin of the nested types is as follows - -[table Body Type Members -[[Name][Description]] -[ - [`value_type`] - [ - Determines the type of the - [link beast.ref.http__message.body `message::body`] member. If this - type defines default construction, move, copy, or swap, then message objects - declared with this [*Body] will have those operations defined. - ] -][ - [`body_writer`] - [ - An optional nested type meeting the requirements of - [link beast.ref.BodyWriter [*BodyWriter]]. If present, this defines the - algorithm used to transfer parsed octets into buffers representing the - body. - ] -][ - [`body_reader`] - [ - An optional nested type meeting the requirements of - [link beast.ref.BodyReader [*BodyReader]]. If present, this defines - the algorithm used to obtain buffers representing a body of this type. - ] -] -] - -The examples included with this library provide a [*Body] implementation that -serializing message bodies that come from a file. - -[endsect] - - - -[endsect] diff --git a/doc/3_7_parser_buffers.qbk b/doc/3_7_parser_buffers.qbk new file mode 100644 index 00000000..a90da4d4 --- /dev/null +++ b/doc/3_7_parser_buffers.qbk @@ -0,0 +1,10 @@ +[/ + Copyright (c) 2013-2017 Vinnie Falco (vinnie dot falco at gmail dot com) + + Distributed under the Boost Software License, Version 1.0. 
(See accompanying + file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +] + +[section:parser_buffers Buffer-Oriented Parsing] + +[endsect] diff --git a/doc/3_8_custom_parsers.qbk b/doc/3_8_custom_parsers.qbk new file mode 100644 index 00000000..3858a582 --- /dev/null +++ b/doc/3_8_custom_parsers.qbk @@ -0,0 +1,10 @@ +[/ + Copyright (c) 2013-2017 Vinnie Falco (vinnie dot falco at gmail dot com) + + Distributed under the Boost Software License, Version 1.0. (See accompanying + file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +] + +[section:custom_parsers Custom Parsers] + +[endsect] diff --git a/doc/3_9_custom_body.qbk b/doc/3_9_custom_body.qbk new file mode 100644 index 00000000..2f3f1f3b --- /dev/null +++ b/doc/3_9_custom_body.qbk @@ -0,0 +1,49 @@ +[/ + Copyright (c) 2013-2017 Vinnie Falco (vinnie dot falco at gmail dot com) + + Distributed under the Boost Software License, Version 1.0. (See accompanying + file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) +] + +[section:custom_body Custom Body Types] + +User-defined types are possible for the message body, where the type meets the +[link beast.ref.Body [*`Body`]] requirements. This simplified class declaration +shows the customization points available to user-defined body types: + +[$images/body.png [width 525px] [height 190px]] + +The meaning of the nested types is as follows + +[table Body Type Members +[[Name][Description]] +[ + [`value_type`] + [ + Determines the type of the + [link beast.ref.http__message.body `message::body`] member. If this + type defines default construction, move, copy, or swap, then message objects + declared with this [*Body] will have those operations defined. + ] +][ + [`body_writer`] + [ + An optional nested type meeting the requirements of + [link beast.ref.BodyWriter [*BodyWriter]]. If present, this defines the + algorithm used to transfer parsed octets into buffers representing the + body. 
+ ] +][ + [`body_reader`] + [ + An optional nested type meeting the requirements of + [link beast.ref.BodyReader [*BodyReader]]. If present, this defines + the algorithm used to obtain buffers representing a body of this type. + ] +] +] + +The examples included with this library provide a [*Body] implementation that +serializes message bodies that come from a file. + +[endsect] diff --git a/doc/5_websocket.qbk b/doc/4_websocket.qbk similarity index 100% rename from doc/5_websocket.qbk rename to doc/4_websocket.qbk diff --git a/doc/6_examples.qbk b/doc/5_examples.qbk similarity index 100% rename from doc/6_examples.qbk rename to doc/5_examples.qbk diff --git a/doc/7_0_design.qbk b/doc/6_0_design.qbk similarity index 95% rename from doc/7_0_design.qbk rename to doc/6_0_design.qbk index a024f7db..619bb31c 100644 --- a/doc/7_0_design.qbk +++ b/doc/6_0_design.qbk @@ -59,9 +59,9 @@ start. Other design goals: * Allow for customizations, if the user needs it. -[include 7_1_http_message.qbk] -[include 7_2_http_comparison.qbk] -[include 7_3_websocket_zaphoyd.qbk] -[include 7_4_review.qbk] +[include 6_1_http_message.qbk] +[include 6_2_http_comparison.qbk] +[include 6_3_websocket_zaphoyd.qbk] +[include 6_4_review.qbk] [endsect] diff --git a/doc/7_1_http_message.qbk b/doc/6_1_http_message.qbk similarity index 100% rename from doc/7_1_http_message.qbk rename to doc/6_1_http_message.qbk diff --git a/doc/7_2_http_comparison.qbk b/doc/6_2_http_comparison.qbk similarity index 100% rename from doc/7_2_http_comparison.qbk rename to doc/6_2_http_comparison.qbk diff --git a/doc/7_3_websocket_zaphoyd.qbk b/doc/6_3_websocket_zaphoyd.qbk similarity index 100% rename from doc/7_3_websocket_zaphoyd.qbk rename to doc/6_3_websocket_zaphoyd.qbk diff --git a/doc/7_4_review.qbk b/doc/6_4_review.qbk similarity index 100% rename from doc/7_4_review.qbk rename to doc/6_4_review.qbk diff --git a/doc/concept/BodyReader.qbk b/doc/concept/BodyReader.qbk index 001eed45..3d0a20e6 100644 ---
a/doc/concept/BodyReader.qbk +++ b/doc/concept/BodyReader.qbk @@ -50,6 +50,17 @@ In this table: [`X::is_deferred`] [] [ + The type `std::true_type` if the serialization implementation + should only attempt to retrieve buffers from the reader after + the header has been serialized. Otherwise, if this type is + `std::false_type` the implementation will activate an + optimization: the first buffer produced during serialization + will contain both the header and some or all of this body. + + Implementations of [*BodyReader] for which initialization is + expensive should use `std::true_type` here, to reduce the + latency experienced by the remote host when expecting to read + the HTTP header. ] ] [ @@ -124,19 +135,17 @@ public: /** Controls when the implementation requests buffers. If false, the implementation will request the first buffer - immediately and try to send both the header and the body - buffer in a single call to the stream's `write_some` - function. + immediately and try to serialize both the header and some + or all of the body in a single buffer. */ using is_deferred = std::false_type; - /** The type of buffer returned by `get`. - */ + /// The type of buffer returned by `get`. using const_buffers_type = boost::asio::const_buffers_1; /** Construct the reader. - @param msg The message whose body is to be written. + @param msg The message whose body is to be retrieved. */ template explicit diff --git a/doc/concept/BodyWriter.qbk b/doc/concept/BodyWriter.qbk index ab7be12f..62b245f4 100644 --- a/doc/concept/BodyWriter.qbk +++ b/doc/concept/BodyWriter.qbk @@ -7,206 +7,76 @@ [section:BodyWriter BodyWriter requirements] -When HTTP messages are parsed, the implementation constructs a -[*BodyWriter] object to provide the means for transferring parsed body -octets into the message container.
These body writers come in two flavors, -direct and indirect: +A [*BodyWriter] provides an online algorithm to transfer a series of zero +or more buffers containing parsed body octets into a message container. The +__message_parser__ creates an instance of this type when needed, and calls into +it zero or more times to transfer buffers. The interface of [*BodyWriter] +is intended to support the following scenarios for representing +the body: -Direct writers provide a buffer to callers, into which body data is placed. -This type of writer is used when the bytes corresponding to the body data -are stored without transformation. The parse algorithm performs stream or -socket reads directly into the buffer provided by the writer, hence the name -"direct." This model avoids an unnecessary buffer copy. An example of -a __Body__ type with a direct writer is -[link beast.ref.http__string_body `string_body`]. +* Storing a body in a dynamic buffer +* Storing a body in a user defined container with a custom allocator +* Transformation of incoming body data before storage, for example + to compress it first. +* Saving body data to a file -Indirect writers are passed body data in a buffer managed by the parsing -algorithm. This writer is appropriate when the body data is transformed -or not otherwised stored verbatim. Some examples of when an indirect -writer is appropriate: +In the table below: -* When bytes corresponding to the body are written to a file - as they are parsed. - -* The content of the message is JSON, which is parsed as it is - being read in, and stored in a structured, hierarchical format. - -In the tables below: - -* `X` denotes a type meeting the requirements of [*Writer]. +* `X` denotes a type meeting the requirements of [*BodyWriter]. * `B` denotes a __Body__ where `std::is_same::value == true`. * `a` denotes a value of type `X`. -* `n` is a value convertible to `std::size_t` without loss of precision.
+* `length` is a value of type `boost::optional`. -* `v` is a value convertible to `std::uint64_t` without loss of precision. - -* `s` is a value of type [link beast.ref.string_view `string_view`]. - -* `ec` is a value of type [link beast.ref.error_code `error_code&`]. +* `b` is an object whose type meets the requirements of __ConstBufferSequence__ * `m` denotes a value of type `message&` where `std::is_same::value == true`. -[table Direct Writer requirements +* `ec` is a value of type [link beast.ref.error_code `error_code&`]. + +[table Writer requirements [[expression] [type] [semantics, pre/post-conditions]] [ - [`X::is_direct`] - [`bool`] - [ - This static constant must be set to `true` to indicate this - is a direct writer. - ] -] -[ - [`X::mutable_buffers_type`] + [`X(m);`] [] [ - A type which meets the requirements of __MutableBufferSequence__. + Constructible from `m`. The lifetime of `m` is guaranteed + to end no earlier than after the `X` is destroyed. The constructor + will be called after the complete header is stored in `m`, and before + parsing body octets for messages indicating that a body is present. ] ] [ - [`X a{m};`] + [`a.init(length,ec)`] [] [ - `a` is constructible from `m`. The lifetime of `m` is guaranteed - to end no earlier than after `a` is destroyed. The constructor - will be called after all headers have been stored in `m`, and - just before parsing bytes corresponding to the body for messages - whose semantics indicate that a body is present with non-zero - length. + This function is called after construction and before any body + octets are presented to the writer. The value of `length` will + be set to the content length of the body if known, otherwise + `length` will be equal to `boost::none`. Implementations of + [*BodyWriter] may use this information to optimize allocation. + If `ec` is set, the error will be propagated to the caller.
] ] [ - [`a.init()`] + [`a.put(b,ec)`] [] [ - This function is called once before any bytes corresponding - to the body are presented to the writer, for messages whose - body is determined by the end-of-file marker on a stream, - or for messages where the chunked Transfer-Encoding is - specified. - ] -] -[ - [`a.init(v)`] - [] - [ - This function is called once before any bytes corresponding - to the body are presented to the writer, for messages where - the Content-Length is specified. The value of `v` will be - set to the number of bytes indicated by the content length. - ] -] -[ - [`a.prepare(n)`] - [`X::mutable_buffers_type`] - [ - The implementation calls this function to obtain a mutable - buffer sequence of up to `n` bytes in size in which to place - data corresponding to the body. The buffer returned must - be at least one byte in size, and may be smaller than `n`. - ] -] -[ - [`a.commit(n)`] - [] - [ - The implementation calls this function to indicate to the - writer that `n` bytes of data have been successfully placed - into the buffer obtained through a prior call to `prepare`. - The value of `n` will be less than or equal to the size of - the buffer returned in the previous call to `prepare`. - ] -] -[ - [`a.finish()`] - [] - [ - This function is called after all the bytes corresponding - to the body have been written to the buffers and committed. - ] -] -[ - [`is_body_writer`] - [`std::true_type`] - [ - An alias for `std::true_type` for `B`, otherwise an alias - for `std::false_type`. - ] -] -] - -[table Indirect Writer requirements -[[expression] [type] [semantics, pre/post-conditions]] -[ - [`X::is_direct`] - [`bool`] - [ - This static constant must be set to `false` to indicate this - is an indirect writer. - ] -] -[ - [`X a{m};`] - [] - [ - `a` is constructible from `m`. The lifetime of `m` is guaranteed - to end no earlier than after `a` is destroyed. 
The constructor - will be called after all headers have been stored in `m`, and - just before parsing bytes corresponding to the body for messages - whose semantics indicate that a body is present with non-zero - length. - ] -] -[ - [`a.init(ec)`] - [] - [ - This function is called once before any bytes corresponding - to the body are presented to the writer, for messages whose - body is determined by the end-of-file market on a stream, - or for messages where the chunked Transfer-Encoding is - specified. - If `ec` is set before returning, parsing will stop - and the error will be returned to the caller. - - ] -] -[ - [`a.init(v,ec)`] - [] - [ - This function is called once before any bytes corresponding - to the body are presented to the writer, for messages where - the Content-Length is specified. The value of `v` will be - set to the number of bytes indicated by the content length. - If `ec` is set before returning, parsing will stop - and the error will be returned to the caller. - ] -] -[ - [`a.write(s,ec)`] - [] - [ - The implementation calls this function with `s` containing - bytes corresponding to the body, after removing any present - chunked encoding transformation. - If `ec` is set before returning, parsing will stop - and the error will be returned to the caller. + This function is called to append the buffers specified by `b` + into the body representation. + If `ec` is set, the error will be propagated to the caller. ] ] [ [`a.finish(ec)`] [] [ - This function is called after all the bytes corresponding - to the body have been written to the buffers and committed. - If `ec` is set before returning, parsing will stop - and the error will be returned to the caller. + This function is called when no more body octets are remaining. + If `ec` is set, the error will be propagated to the caller. 
] ] [ @@ -219,8 +89,52 @@ In the tables below: ] ] [note - Definitions for required [*Writer] member functions should be declared + Definitions for required [*BodyWriter] member functions should be declared inline so the generated code can become part of the implementation. ] +Exemplar: +``` +struct writer +{ + /** Construct the writer. + + @param msg The message whose body is to be stored. + */ + template + explicit + writer(message& msg); + + /** Initialization. + + Called once immediately before storing any buffers. + + @param content_length The content length if known, else `boost::none`. + + @param ec Set to the error, if any occurred. + */ + void + init(boost::optional content_length, error_code& ec); + + /** Store buffers. + + This is called zero or more times with parsed body octets. + + @param buffers The constant buffer sequence to store. + + @param ec Set to the error, if any occurred. + */ + template + void + put(ConstBufferSequence const& buffers, error_code& ec); + + /** Called when the body is complete. + + @param ec Set to the error, if any occurred. + */ + void + finish(error_code& ec); +}; +``` + [endsect] diff --git a/doc/quickref.xml b/doc/quickref.xml index 4a74892e..f5a78b3d 100644 --- a/doc/quickref.xml +++ b/doc/quickref.xml @@ -42,18 +42,12 @@ message message_parser request + request_parser response + response_parser serializer string_body - rfc7230 - - - ext_list - opt_token_list - param_list - token_list - Functions @@ -82,6 +76,13 @@ + rfc7230 + + ext_list + opt_token_list + param_list + token_list + Constants connection