
Microservices Architecture

In a monolithic deployment, all ThingsBoard services run in a single process — you cannot scale MQTT transport independently from the Rule Engine, and upgrading any component requires restarting the entire instance. Microservices architecture (available since version 2.2) splits services into independent containers so each can be scaled, upgraded, and restarted without affecting the others.

Every service is stateless — all persistent state lives in the database and cache. Scaling any service means adding more replicas.

| Service | Container | Default port | Scales by | Notes |
|---|---|---|---|---|
| MQTT Transport | tb-pe-mqtt-transport | 1883 | Adding replicas | |
| HTTP Transport | tb-pe-http-transport | 8080 | Adding replicas | |
| CoAP Transport | tb-pe-coap-transport | 5683 | Adding replicas | |
| LwM2M Transport | tb-pe-lwm2m-transport | 5685 | Adding replicas | |
| SNMP Transport | tb-pe-snmp-transport | — | Adding replicas | |
| ThingsBoard Node | tb-pe-node | 8080 | Adding replicas + Zookeeper partition rebalance | Core, Rule Engine, REST API |
| JS Executor | tb-pe-js-executor | — | Adding instances | 20+ recommended for production |
| Web UI | tb-pe-web-ui | 8080 | Adding replicas | Node.js / Express.js, proxies REST/WS to TB Node |

Each transport protocol runs as a separate container. This allows you to scale MQTT independently from HTTP or CoAP based on your device fleet composition. Transport containers authenticate devices via the Core service and push messages into Kafka.

Key Kafka topics used by transports:

| Topic | Direction | Purpose |
|---|---|---|
| tb_transport.api.requests | Transport → Core | Credential validation, attribute fetch |
| tb_transport.api.responses | Core → Transport | Authentication results |
| tb_rule_engine | Transport → Rule Engine | Telemetry, attributes, RPC, lifecycle events |
| tb_core | Rule Engine → Core | Entity lifecycle events, connectivity updates |

See Message Queue for the full topic topology and tuning.
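The request/response exchange between a transport and Core can be sketched as follows. This is a simplified Python illustration: the topic names come from the table above, but the message shapes, field names, and in-memory credential store are assumptions for the sketch — the real services exchange protobuf messages over Kafka.

```python
from dataclasses import dataclass
from typing import Optional

# Topic names from the table above; used here only as constants.
TRANSPORT_API_REQUESTS = "tb_transport.api.requests"
TRANSPORT_API_RESPONSES = "tb_transport.api.responses"

@dataclass
class ValidateDeviceRequest:
    request_id: str    # correlates the asynchronous response
    access_token: str  # credential presented by the device

@dataclass
class ValidateDeviceResponse:
    request_id: str
    device_id: Optional[str]  # None means authentication failed

# Hypothetical credential store standing in for the Core service's database.
CREDENTIALS = {"A1_TEST_TOKEN": "device-001"}

def handle_validation(req: ValidateDeviceRequest) -> ValidateDeviceResponse:
    """What Core does when it consumes tb_transport.api.requests:
    look up the credential and publish the result to the responses topic."""
    return ValidateDeviceResponse(req.request_id, CREDENTIALS.get(req.access_token))

resp = handle_validation(ValidateDeviceRequest("r-1", "A1_TEST_TOKEN"))
```

The request ID is what lets a transport match an asynchronous Kafka response back to the device connection that triggered it.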

The ThingsBoard Node is the central service: it runs the REST API, WebSocket server, Rule Engine, and Actor System. Instances hold no durable state — all persistent state lives in the database and cache. The Actor System creates ephemeral per-entity actors (device actors, rule chain actors, tenant actors), but these are rebuilt from stored state when a node restarts. Zookeeper assigns entity partitions across nodes automatically.
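The partition-assignment idea can be sketched as a stable hash of the entity ID — a minimal illustration assuming a fixed partition count; the platform's actual hash function and partition configuration differ.

```python
import hashlib

PARTITIONS = 12  # illustrative; the real partition count is configurable

def partition_for(entity_id: str, partitions: int = PARTITIONS) -> int:
    """Stable hash: every message for one entity lands in one partition,
    so whichever node owns that partition sees the entity's messages in order."""
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % partitions

# Same device, same partition -- regardless of which transport produced it.
p1 = partition_for("device-42")
p2 = partition_for("device-42")
```

Because the mapping depends only on the entity ID, any node (or transport) computes the same partition without coordination; Zookeeper only decides which node owns each partition.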

Rule engine script nodes (filters, transformations) execute user-defined JavaScript in isolated Node.js sandboxes. Each JS Executor processes scripts sequentially and communicates with ThingsBoard Node via Kafka:

| Topic | Direction | Purpose |
|---|---|---|
| js.eval.requests | TB Node → JS Executor | Script evaluation request |
| js.eval.responses | JS Executor → TB Node | Evaluation result |

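The eval round trip can be sketched with correlation IDs. The envelope below is a hypothetical, simplified message shape (the field names are assumptions), and Python's eval stands in for the sandboxed Node.js evaluation the real JS Executor performs.

```python
import uuid

def make_eval_request(script: str, msg: dict) -> dict:
    """Envelope a TB Node might publish to js.eval.requests
    (field names are assumptions for this sketch)."""
    return {"requestId": str(uuid.uuid4()), "script": script, "msg": msg}

def evaluate(request: dict) -> dict:
    """Sequential evaluation on the executor side; the result goes back on
    js.eval.responses, matched to the caller by requestId. Python's eval
    stands in here for the sandboxed JavaScript engine."""
    result = eval(request["script"], {}, {"msg": request["msg"]})
    return {"requestId": request["requestId"], "result": result}

# A filter-node style script: pass the message only if temperature > 20.
req = make_eval_request("msg['temperature'] > 20", {"temperature": 25})
resp = evaluate(req)
```

Since many rule chains share a pool of executors, the requestId is what routes each asynchronous result back to the rule node that asked for it.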
The Integration Executor is a separate microservice that runs platform integrations — pulling data from external systems (OPC-UA, SigFox, TheThingsNetwork, etc.) and pushing it into ThingsBoard via the rule engine. Each instance connects to Kafka and can run multiple integrations concurrently. Scaling is independent from the core ThingsBoard Node.

When a ThingsBoard Node pod goes down:

  1. Zookeeper detects the node is unresponsive (heartbeat timeout).
  2. Partitions reassign — the entity partitions owned by the failed node are redistributed across surviving nodes.
  3. Actors rebuild — surviving nodes create new actors for the reassigned entities, loading state from the database.
  4. Message processing resumes — surviving TB Nodes pick up Kafka partitions from the failed node. Transports are unaffected — they continue producing to the same Kafka topics regardless of which TB Nodes are consuming.

Message ordering is preserved throughout — since all messages for a given entity go to the same Kafka partition, they are processed in order regardless of which TB Node consumes them.
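The failover behavior above can be illustrated with a toy assignment function. Round-robin is an assumption for the sketch (the real rebalancing is Zookeeper-coordinated); the point it demonstrates is that after a node failure every partition still has exactly one owner, which is why per-entity ordering survives the rebalance.

```python
def assign(partitions, nodes):
    """Toy round-robin partition-to-node assignment (an assumption;
    the real redistribution is coordinated through Zookeeper)."""
    return {p: nodes[i % len(nodes)] for i, p in enumerate(partitions)}

partitions = list(range(6))
before = assign(partitions, ["node-a", "node-b", "node-c"])

# node-b fails; survivors take over its partitions. Each partition still
# has exactly one owner, so per-partition (hence per-entity) ordering holds.
after = assign(partitions, ["node-a", "node-c"])
orphaned = [p for p, n in before.items() if n == "node-b"]
```

Transports never participate in this reassignment: they keep producing to the same Kafka partitions, and only the consumer side changes hands.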

A microservices deployment requires several supporting services:

| Service | Purpose | HA configuration |
|---|---|---|
| Apache Kafka | Message queue between all services | 3-node cluster minimum |
| Redis / Valkey | Cache for sessions, entity data, rate limits | Sentinel or Cluster mode |
| Zookeeper | Service discovery and partition assignment | 3-node ensemble |
| HAProxy / nginx | Load balancer for HTTP, MQTT, CoAP | Active-passive or active-active |
| PostgreSQL | Entity storage, system data | Streaming replication or managed service |
| Cassandra | Time-series storage (Hybrid mode) | 3+ node cluster with RF=3 |

For full deployment instructions, see Installation.