MQTT continues to grow in popularity and adoption within the industrial automation community. For those who are still new to the protocol, MQTT is a lightweight messaging protocol built around a publish/subscribe architecture. In simple terms, MQTT can be compared to a mail envelope: it provides structure, a common appearance, and a supported delivery system, while the actual content of the envelope can vary widely.
Because MQTT does not define the message content, it can be applied to a wide range of needs across our industry. That flexibility presents its own issues, which are outside the scope of this article. There are multiple solutions for providing structure and standards around message content that we will discuss in future posts.
This blog will primarily focus on the reliability of message delivery based on the concept of Quality of Service (QoS). We will be utilizing the Cogent DataHub to showcase these options, as it can act as both the MQTT Broker & Client. It is important to note that this information can be applied to various MQTT implementations, including other Software Toolbox product offerings with MQTT, such as OPC Router or the TOP Server MQTT client driver.
What is Quality of Service (QoS)?
Commonly referred to as QoS, this mechanism is built into the protocol and determines how much overhead is spent on delivering each message, and therefore how reliable that delivery is. The available QoS levels correspond to increasing levels of reliability, giving users finer control to meet a wide range of implementation requirements and environments.
QoS 0 - At most once:
- QoS 0 provides the lowest level of message assurance. Once the message is sent, there is no requirement for confirmation, acknowledgement, or retry for dropped messages. It's the fastest mode but offers no guarantee of delivery, which is why it is known as ‘fire and forget’.
QoS 1 - At least once:
- QoS 1 ensures the message is delivered at least once to the receiver by expecting an acknowledgment (ACK). If this acknowledgment is not received within a specified timeout period, the message is retransmitted. Retransmission continues until the message is acknowledged or the connection timeout is reached. This level offers more assurance than QoS 0 but may result in duplicate messages. It can also result in higher CPU utilization due to handling acknowledgments and retransmissions.
QoS 2 - Exactly once:
- QoS 2 guarantees a message is delivered exactly once to the receiver by implementing a four-step handshake. The sender and receiver complete this handshake for every message to ensure messages are neither lost nor duplicated. This level provides the highest reliability but adds overhead for every message. In practice, that overhead can reduce total message throughput by 50% or more when compared to QoS 0.
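To make the overhead difference between the three levels concrete, the short Python sketch below lists the control packets the MQTT specification defines for one message at each QoS level and counts them. It is only an illustration of the handshakes described above; the names `HANDSHAKE` and `packets_per_message` are made up for this example and are not part of any product API.

```python
# Control packets exchanged for ONE application message on a single
# sender -> receiver leg, as defined by the MQTT specification.
HANDSHAKE = {
    0: ["PUBLISH"],                                  # fire and forget
    1: ["PUBLISH", "PUBACK"],                        # acknowledged
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],   # four-step handshake
}


def packets_per_message(qos: int) -> int:
    """Return how many control packets one successful delivery needs."""
    return len(HANDSHAKE[qos])


for qos, flow in HANDSHAKE.items():
    print(f"QoS {qos}: {' -> '.join(flow)} ({packets_per_message(qos)} packets)")
```

Packet count is only one part of the cost. QoS 1 and QoS 2 also require the sender to hold each message until the handshake completes, which is where much of the extra CPU and memory load comes from.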
In MQTT's publish/subscribe scheme, QoS is applied separately on each leg of the journey. The publisher chooses the QoS for each message it sends to the broker, and each subscriber requests a maximum QoS when it subscribes to a topic. The broker then delivers each message to a subscriber at the lower of those two levels. It is therefore important to consider the QoS of the other MQTT clients subscribed to the broker where data is being published. For example, if a temperature sensor publishes to a broker with QoS 2 and an MQTT software client subscribes to those topics with QoS 0, the messages reach the broker with QoS 2 but are forwarded to that subscriber with QoS 0, because it is the lower of the two levels.
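The "lowest level wins" behavior described above can be sketched as a one-line rule. This is a minimal illustration rather than DataHub code; the function name `delivery_qos` is hypothetical, and it models only the broker-to-subscriber leg of the delivery.

```python
def delivery_qos(publish_qos: int, subscription_qos: int) -> int:
    """QoS the broker uses when forwarding a message to one subscriber.

    publish_qos:      QoS the publisher used when sending to the broker.
    subscription_qos: maximum QoS the subscriber requested for the topic.
    """
    for level in (publish_qos, subscription_qos):
        if level not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {level}")
    # The broker never upgrades a message; it uses the lower of the two.
    return min(publish_qos, subscription_qos)


# The temperature sensor example: published at QoS 2, subscribed at QoS 0.
print(delivery_qos(publish_qos=2, subscription_qos=0))
```

A second client subscribed to the same topic with QoS 2 would still receive those messages at QoS 2, since the rule is evaluated per subscriber.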
Why does QoS matter to me?
When integrating an MQTT solution, it's important to consider what the connected devices need to do and what the existing networking infrastructure can support. Sending control commands to devices that operate critical functions requires higher reliability than collecting data from sensors that rapidly publish updates. Higher QoS levels help ensure message delivery, which can minimize downtime and reduce interruptions in production. However, the added overhead could strain lower-powered devices and constrained networks if the additional protocol “chatter” is introduced unnecessarily.
QoS requires a delicate balance with important consideration for each implementation on both the OT and IT side. Collaboration and testing with tools that provide performance monitoring will aid in long-term project success.
What QoS should I use?
Now that we’ve covered each level of QoS and why it’s important for your MQTT integrations, which service level should you choose?
QoS 0 will provide the highest level of efficiency and would be suggested for use cases where occasional missed messages are acceptable. For example, non-critical sensor data for humidity in a plant that would not require any real-time response to a change would be a good candidate for publishing with level 0.
For cases where delivery is required, QoS 1 ensures it without introducing as much overhead as QoS 2. Commands that control specific devices in an assembly, such as enabling or disabling them, should use level 1 at a minimum. Plan for the possibility of duplicate messages being received; segmenting tags to control the on/off state may help in this situation.
When a command must arrive exactly once, with no losses and no duplicates, QoS 2 is the appropriate choice. Advanced and critical control operations that demand this precision justify the highest level of assurance and the elevated network overhead that comes with it.
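One way to read the suggestion about duplicates is to make commands state their desired end result ("open", "closed") rather than a relative action ("toggle"), so that receiving the same message twice is harmless. The Python sketch below simulates a duplicated QoS 1 delivery against both styles. The `Valve` class, its handlers, and the payload strings are hypothetical and exist only for this illustration.

```python
class Valve:
    """Hypothetical device that starts in the closed state."""

    def __init__(self) -> None:
        self.is_open = False

    def handle_toggle(self, _payload: str) -> None:
        # Not idempotent: every delivery flips the state.
        self.is_open = not self.is_open

    def handle_set(self, payload: str) -> None:
        # Idempotent: the payload carries the desired end state.
        self.is_open = payload == "open"


def deliver(handler, payload: str, copies: int) -> None:
    """Simulate QoS 1 delivering the same message `copies` times."""
    for _ in range(copies):
        handler(payload)


toggled = Valve()
deliver(toggled.handle_toggle, "toggle", copies=2)  # duplicate delivery

explicit = Valve()
deliver(explicit.handle_set, "open", copies=2)  # same duplicate delivery

print("toggle command, delivered twice:", "open" if toggled.is_open else "closed")
print("set-state command, delivered twice:", "open" if explicit.is_open else "closed")
```

With the toggle style, the duplicate undoes the command and the valve ends up closed. With the explicit state, the duplicate has no effect and the valve stays open.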
Ideally, a mixed implementation that assigns QoS levels to topics based on the considerations above allows strategic control over the trade-off between efficiency and reliability. Using each QoS level where it fits best helps optimize resource utilization, meet diverse reliability needs, and accommodate varying use cases. However, it's crucial to weigh these advantages against the added complexity and to ensure proper management and maintenance of the resulting configuration.
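A mixed approach like this is often easiest to manage when the topic-to-QoS assignment lives in one place. The sketch below shows one possible way to express such a policy in Python. The topic prefixes, the default level, and the function name `qos_for_topic` are all assumptions chosen for the example; they do not reflect how DataHub or any other product stores its configuration.

```python
# Hypothetical policy: topic prefix -> QoS level used when publishing.
QOS_POLICY = {
    "plant/env/": 0,        # non-critical readings such as humidity
    "plant/line/": 1,       # enable/disable commands, duplicates tolerated
    "plant/line/safety/": 2,  # critical commands, exactly once
}
DEFAULT_QOS = 1  # assumed fallback for topics the policy does not mention


def qos_for_topic(topic: str) -> int:
    """Return the QoS for a topic, using the longest matching prefix."""
    matches = [prefix for prefix in QOS_POLICY if topic.startswith(prefix)]
    if not matches:
        return DEFAULT_QOS
    return QOS_POLICY[max(matches, key=len)]


for topic in ("plant/env/humidity", "plant/line/conveyor/enable",
              "plant/line/safety/estop", "plant/other/value"):
    print(f"{topic}: QoS {qos_for_topic(topic)}")
```

Keeping the policy in a single table makes it easier to review with both OT and IT teams and to adjust after performance testing.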
How does QoS Impact Throughput?
To further explore the usage of QoS, we have configured a test involving two instances of the Cogent DataHub. DataHub 1 is running the DataHub Smart MQTT Broker feature on a virtual machine, while DataHub 2, running on my local machine, is acting as the MQTT Client responsible for publishing data. For this scenario, we have configured three different groups, each consisting of 1,000 tags collected from an OPC UA Server. These tags update every 100 ms, simulating a rapid stream of sensor data.
Using the built-in Connection Viewer in DataHub, we can access valuable statistics and status attributes of the various plug-ins, including MQTT. At the bottom, there are options for Total, Checkpoint, and Performance selections. We will be using the Performance view to monitor the rate of Sent/Received packets on the Broker and Client sides.
In the first run below, I am publishing data with a QoS level of 0 to a designated topic in the Broker. Reviewing the Received statistic, we can see that the Broker is receiving roughly 9,000 messages per second from the publishing client.
On the Client side, we can see additional details on the sent packets, and the send rate is nearly identical to the receive rate on the Broker side. We do have a few dropped messages, which is not uncommon for QoS 0 given its lack of delivery assurance, but newer packets updating the same tags are being sent successfully.
Because the tags in this test change so rapidly, and the Sent and Received statistics can only be captured as close to the same moment as possible, there may be a small difference between the two counts. This is why we could encounter more packets Received than Sent in a particular second.
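As a rough back-of-the-envelope check on the load this setup can generate, the Python sketch below computes the nominal update rate for one group of tags. It assumes that every tag changes on every update cycle and that each change becomes one MQTT message; neither assumption is stated in the test setup, and the function name is made up for this example.

```python
def nominal_updates_per_second(tag_count: int, update_interval_ms: int) -> float:
    """Upper bound on value changes per second for a group of tags."""
    updates_per_tag_per_second = 1000 / update_interval_ms
    return tag_count * updates_per_tag_per_second


# One group from the test: 1,000 tags updating every 100 ms.
per_group = nominal_updates_per_second(tag_count=1000, update_interval_ms=100)
print(f"Nominal rate per group: {per_group:,.0f} updates/second")
```

Under those assumptions a single group tops out at 10,000 updates per second, which gives a sense of scale for the rates reported by the Connection Viewer.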
Continuing the test, we’ve adjusted the publishing QoS to level 1 and level 2 respectively to compare the outcomes. Below is a table summary of the results showing the captured statistics of both Sent & Received packets for each level.
The rate of received data dropped drastically during this test as QoS was increased. QoS 2 shows the lowest receive rate of the three levels, as we anticipated because of the additional overhead. Given the nature of this simulation data, which is constantly updating, using QoS 2 significantly slows the availability of new values.
This test purposefully exaggerates the performance difference between the QoS levels by combining under-resourced machines with a high data volume and rate of change. It reinforces the need to understand the purpose of each publishing client and its networking environment when integrating MQTT, especially in large-scale applications. If you are considering using one of our solutions, our technical team can assist your team in discussions around scale and performance to get the best results with the technology and our tools.
We hope this testing has provided meaningful insight into the concept of QoS while showcasing the valuable statistics tracking available in Cogent DataHub that made it possible. Whether you’re looking to configure an MQTT Broker, Client, or both, you can follow our Cogent DataHub On Demand Learning Resources to quickly get started with a free trial download.
Have Questions? Contact us with your questions and we'll be glad to help.