3M msg/sec throughput with a single TBMQ node
For MQTT brokers, sustained performance under heavy workloads is critical. This article presents a performance test of TBMQ demonstrating its ability to handle 3 million messages per second with single-digit millisecond latency, using a single node.
Test methodology
A single TBMQ node was deployed within an EKS cluster alongside 3 Kafka nodes and an RDS instance.
Publishers: 100 clients, each publishing 10 msg/sec to its own topic following the pattern CountryCode/City/ID. Message size: ~66 bytes.
Subscribers: 3,000 clients, each subscribing to CountryCode/City/# and therefore receiving every message from every publisher. The fan-out multiplies to 100 publishers × 10 msg/sec × 3,000 subscribers = 3M messages per second of total throughput.
The test ran for 30 minutes to verify TBMQ can sustain this load without performance degradation or resource exhaustion.
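The article doesn't include client code, but the topology is easy to picture. Below is a minimal sketch of one publisher/subscriber pair using the Eclipse Paho Java client; the broker URL, client IDs, and the concrete US/NewYork topic values are illustrative assumptions, not taken from the actual test scripts.

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttMessage;
import org.eclipse.paho.client.mqttv3.persist.MemoryPersistence;

public class FanOutClients {

    // Hypothetical broker address; substitute the TBMQ service endpoint.
    private static final String BROKER_URL = "tcp://tbmq-host:1883";

    public static void main(String[] args) throws Exception {
        // Subscriber: one wildcard subscription receives the traffic of
        // every publisher under the same CountryCode/City prefix.
        MqttClient subscriber = new MqttClient(BROKER_URL, "sub-1", new MemoryPersistence());
        subscriber.connect();
        subscriber.subscribe("US/NewYork/#", 0, (topic, msg) ->
                System.out.printf("received %d bytes on %s%n", msg.getPayload().length, topic));

        // Publisher: 10 msg/sec to its own topic, matching the
        // CountryCode/City/ID pattern from the test.
        MqttClient publisher = new MqttClient(BROKER_URL, "pub-1", new MemoryPersistence());
        publisher.connect();
        byte[] payload = new byte[66]; // ~66-byte message, as in the test
        while (true) {
            MqttMessage message = new MqttMessage(payload);
            message.setQos(0); // fire-and-forget, matching the QoS 0 setup
            publisher.publish("US/NewYork/pub-1", message);
            Thread.sleep(100); // 100 ms interval = 10 messages per second
        }
    }
}
```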
Hardware used
| Service | TBMQ | AWS RDS (PostgreSQL) | Kafka |
|---|---|---|---|
| Instance type | m7a.8xlarge | db.m6i.large | m7a.large |
| vCPU | 32 | 2 | 2 |
| Memory (GiB) | 128 | 8 | 8 |
| Storage (GiB) | 10 | 30 | 50 |
| Network (Gbps) | 12.5 | 12.5 | 12.5 |
Test summary
| Publishers | Subscribers | Msg/sec/publisher | Total throughput | QoS | Payload | TBMQ CPU | TBMQ memory |
|---|---|---|---|---|---|---|---|
| 100 | 3,000 | 10 | 3M msg/s | 0 | 66 bytes | 54% | 75 GiB |
Latency:
| Avg message latency | 95th percentile latency |
|---|---|
| 7.4 ms | 11 ms |
At 54% CPU utilization, TBMQ retains substantial processing headroom, suggesting it can absorb even higher workloads and message delivery peaks.
Alternative instance: tests on an m7a.4xlarge (16 vCPU, 64 GiB) achieved 14.2 ms average latency at 90% CPU, highlighting TBMQ's flexibility across different instance types.
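The article doesn't spell out how message latency is measured. A common approach, assumed here purely as a sketch, is to stamp each payload with the publish time and compute the difference on arrival; this requires the publisher and subscriber hosts to have synchronized clocks. The class and method names below are hypothetical.

```java
import java.nio.ByteBuffer;

import org.eclipse.paho.client.mqttv3.IMqttMessageListener;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public final class LatencyProbe {

    // Publisher side: the first 8 payload bytes carry the send time;
    // the remainder pads the message to the target size (~66 bytes here).
    static MqttMessage timestampedMessage(int payloadSize) {
        ByteBuffer buf = ByteBuffer.allocate(payloadSize);
        buf.putLong(System.currentTimeMillis());
        MqttMessage msg = new MqttMessage(buf.array());
        msg.setQos(0);
        return msg;
    }

    // Subscriber side: latency = arrival time minus the embedded send time.
    // Accumulating each sample in a histogram yields the average and the
    // 95th percentile figures of the kind reported above.
    static final IMqttMessageListener LATENCY_LISTENER = (topic, msg) -> {
        long sentAt = ByteBuffer.wrap(msg.getPayload()).getLong();
        long latencyMs = System.currentTimeMillis() - sentAt;
        recordSample(latencyMs);
    };

    static void recordSample(long latencyMs) {
        // placeholder: feed into a latency histogram (e.g. HdrHistogram)
    }
}
```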
Running the test
The test agent is a cluster of performance test nodes (runners) supervised by an orchestrator. For this test, the agent consisted of 1 publisher pod, 6 subscriber pods, and 1 orchestrator pod, each on a separate AWS EC2 instance.
Subscriber clients set up their subscriptions; publishers ran a warm-up phase. Once all runners reported ready to the orchestrator, message publishing began. Monitoring was performed using JMX, htop, Kafka UI, and AWS CloudWatch.
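The runner–orchestrator protocol isn't described in more detail, so purely as an in-process illustration of the readiness barrier: each runner counts down a "ready" latch after its warm-up, and the orchestrator releases a "start" latch once all of them have reported.

```java
import java.util.concurrent.CountDownLatch;

public class StartBarrier {
    public static void main(String[] args) throws InterruptedException {
        int runners = 7; // 1 publisher pod + 6 subscriber pods in this test
        CountDownLatch ready = new CountDownLatch(runners);
        CountDownLatch start = new CountDownLatch(1);

        for (int i = 0; i < runners; i++) {
            final int id = i;
            new Thread(() -> {
                // connect, subscribe, warm up ... then report ready
                ready.countDown();
                try {
                    start.await(); // block until the orchestrator gives the go-ahead
                    System.out.println("runner " + id + " started publishing");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }

        ready.await();     // orchestrator: wait until every runner is ready
        start.countDown(); // then release them all at once
    }
}
```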
How to repeat the test
Refer to the AWS cluster installation guide for deployment instructions.
- TBMQ branch with test scripts
- Performance testing tool
- Publisher configuration
- Subscriber configuration
Conclusion
TBMQ successfully processed 3M messages per second with an average latency of 7.4 ms on a single node, confirming its position as a scalable MQTT broker ready for demanding fan-out workloads. Follow the project on GitHub for further performance results.