forked from boostorg/mqtt5
Add a chapter on auto-reconnect and retry mechanism
Summary: related to T12804 Reviewers: ivica Reviewed By: ivica Subscribers: miljen, iljazovic Differential Revision: https://repo.mireo.local/D28198
This commit is contained in:
@ -111,7 +111,8 @@
|
||||
|
||||
[include 01_intro.qbk]
|
||||
[include 02_configuring_the_client.qbk]
|
||||
[include 03_examples.qbk]
|
||||
[include 03_auto_reconnect.qbk]
|
||||
[include 04_examples.qbk]
|
||||
|
||||
[include examples/Examples.qbk]
|
||||
|
||||
|
@ -44,7 +44,8 @@ In the event of a connection failure with one Broker, the Client switches to the
|
||||
* [*Offline buffering]: While offline, it automatically buffers all the packets to send when the connection is re-established.
|
||||
|
||||
[heading Example]
|
||||
The following example illustrates a simple scenario of configuring the Client and publishing an Application Message.
|
||||
The following example illustrates a simple scenario of configuring a Client and publishing a
|
||||
"Hello World!" Application Message with `QoS` 0.
|
||||
|
||||
[!c++]
|
||||
#include <iostream>
|
||||
|
97
doc/qbk/03_auto_reconnect.qbk
Normal file
97
doc/qbk/03_auto_reconnect.qbk
Normal file
@ -0,0 +1,97 @@
|
||||
[section:auto_reconnect Built-in auto-reconnect and retry mechanism]
|
||||
[nochunk]
|
||||
|
||||
The auto-reconnect and retry mechanism is a key feature of the __Self__ library.
|
||||
It is designed to internally manage the complexities of disconnects, backoffs, reconnections, and message retransmissions.
|
||||
These tasks, if handled manually, could lead to extended development times, difficult testing, and compromised reliability.
|
||||
By automating these processes, the __Self__ library enables users of the __Client__ to focus primarily
|
||||
on the application's functionality without worrying about the repercussions of lost network connectivity.
|
||||
|
||||
You can call any asynchronous function within the __Client__ regardless of its current connection status.
|
||||
If the __Client__ is offline, it will queue all outgoing requests and transmit them as soon as the connection is restored.
|
||||
In situations where the connection is unexpectedly lost mid-protocol flow,
|
||||
the __Client__ complies with the MQTT protocol's specified message delivery retry mechanisms.
|
||||
|
||||
The following example will showcase how the __Client__ internally manages various scenarios, including successful transmission, offline queuing,
|
||||
and connection loss retransmissions, when executing a request to publish a message with QoS 1.
|
||||
Note that the same principle applies to all other asynchronous functions within the __Client__
|
||||
(see /Completion condition/ under [refmem mqtt_client async_publish], [refmem mqtt_client async_subscribe], [refmem mqtt_client async_unsubscribe],
|
||||
and [refmem mqtt_client async_disconnect]).
|
||||
|
||||
// Publishing with QoS 1 involves a two-step process: sending a PUBLISH message to the Broker and awaiting a PUBACK (acknowledgement) response.
|
||||
// The scenarios that might unfold include:
|
||||
// a) The Client sends the PUBLISH message immediately.
|
||||
// b) If the Client is offline when attempting to publish, it queues the PUBLISH message and sends it
|
||||
// as soon as the connection is re-established.
|
||||
// c) Should the Client lose connection after sending the PUBLISH message but before receiving a PUBACK,
|
||||
// it will automatically retransmit the PUBLISH message once connectivity is restored.
|
||||
|
||||
client.async_publish<async_mqtt5::qos_e::at_least_once>(
|
||||
"my-topic", "Hello world!",
|
||||
async_mqtt5::retain_e::no, async_mqtt5::publish_props {},
|
||||
[](async_mqtt5::error_code ec, async_mqtt5::reason_code rc, async_mqtt5::puback_props props) {
|
||||
// This callback is invoked under any of the following circumstances:
|
||||
// a) The Client successfully sends the PUBLISH packet and receives a PUBACK from the Broker.
|
||||
// b) The Client encounters a non-recoverable error, such as a cancellation or providing invalid parameters
|
||||
// to async_publish, which prevents the message from being sent.
|
||||
}
|
||||
);
|
||||
|
||||
[section:sentry ]
|
||||
|
||||
|
||||
[endsect] [/sentry]
|
||||
|
||||
[section:cons Considerations and limitations]
|
||||
|
||||
The integrated auto-reconnect and retry mechanism greatly improves the user experience
|
||||
by simplifying complex processes and ensuring continuous connections.
|
||||
However, it is important to be mindful of certain limitations and considerations associated with this feature.
|
||||
|
||||
[heading Delayed handler invocation]
|
||||
|
||||
During extended periods of __Client__ downtime, the completion handlers for asynchronous functions,
|
||||
such as those used in [refmem mqtt_client async_publish], may face considerable delays before invocation.
|
||||
This can result in users being left in the dark regarding the status of their requests due to the absence of prompt feedback on the initiated actions.
|
||||
|
||||
[heading Concealing configuration-related issues]
|
||||
|
||||
The __Client__ will always try to reconnect to the Broker(s) regardless of the reason why the connection was previously closed.
|
||||
This is desirable behaviour when the connection gets dropped due to underlying stream transport issues,
|
||||
such as when a device connected to the network loses its GSM connectivity.
|
||||
|
||||
However, the connection may be closed (or never established) for other reasons,
|
||||
which are typically related to misconfiguration of broker IPs, ports, expired or incorrect TLS, or MQTT-related errors,
|
||||
such as trying to communicate with a Broker that does not support MQTT 5.
|
||||
In these cases, the __Client__ will still endlessly try to connect to Broker(s), but the connection will never succeed.
|
||||
|
||||
The most challenging problem here is that users of the __Client__ do not get informed in any way that the connection cannot be established.
|
||||
So, if you make a typo in the Broker's IP, run the __Client__, and publish some message, the [refmem mqtt_client async_publish] callback will never be invoked,
|
||||
and you will not "catch" the error or detect the root cause of the issue.
|
||||
|
||||
The possible alternative approach, where the __Client__ would return something like "unrecoverable error"
|
||||
when you try to publish a message to a misconfigured Broker, would have a terrible consequence if the Broker itself is misconfigured.
|
||||
For example, suppose someone forgets to renew the TLS certificate on the Broker.
|
||||
The connection will be broken in that case, and the __Client__ would report an "unrecoverable error" through the [refmem mqtt_client async_publish] method.
|
||||
Now, the expired TLS certificate on the Broker is most probably a temporary issue,
|
||||
so it is natural that the __Client__ would try to reconnect until the certificate gets renewed.
|
||||
But, if the __Client__ stops retrying when it detects such an "unrecoverable error," then the decision of when to reconnect would be left to the user.
|
||||
|
||||
By design, one of the main functional requirements of the __Client__ was to handle reconnection steps automatically and correctly.
|
||||
If the decision for reconnection were left to the user, then the user would need to handle all those error states manually,
|
||||
which would dramatically increase the complexity of the user's code, not to mention how difficult it would be to cover all possible error states.
|
||||
|
||||
The proposed approach for detecting configuration errors in the __Client__ is to use some simple logging facility during development.
|
||||
Log lines should be injected directly into the __Client__ code (typically in the connect_op.hpp file), and logs would uncover misconfigurations (if any).
|
||||
|
||||
[heading Increased resource consumption]
|
||||
|
||||
The __Client__ is designed to automatically buffer requests that are initiated while it is offline.
|
||||
During extended downtime or when a high volume of requests accumulates, this can lead to an increase in memory usage.
|
||||
This aspect is significant for devices with limited resources, as the growing memory consumption can impact their performance and functionality.
|
||||
|
||||
[/ TODO: link to the debugging the client chapter ]
|
||||
|
||||
[endsect] [/cons]
|
||||
|
||||
[endsect] [/auto_reconnect]
|
Reference in New Issue
Block a user