Architecture

This page explains Armada's system architecture, including its components, event-sourcing design, and how jobs flow through the system.

System Overview

Armada is a multi-Kubernetes cluster batch job scheduler designed to handle massive-scale workloads. An Armada deployment consists of:

  • Worker clusters: One or more Kubernetes clusters containing nodes for running jobs
  • Control plane: Components responsible for job submission and scheduling
  • Executors: One executor per worker cluster, relaying work and status between the control plane and that cluster's Kubernetes API

The control plane components can run anywhere the executors can reach, whether in one of the worker clusters, in a separate cluster, or on non-Kubernetes nodes. Each executor typically runs within the worker cluster it manages.

Deployment Architecture

(Deployment architecture diagram)

Event-Sourcing Architecture

Armada uses an event-sourcing (data stream) architecture for high scalability and resilience. The system has two main components:

  1. Log-based message broker: Stores state transitions durably and in order (the source of truth)
  2. Materialized views: Independent databases that derive their state from the log

Key Properties

  • Durability: Messages are stored durably and survive node failures
  • Ordering: All subscribers see messages in the same order
  • Eventual consistency: Databases are updated independently and may briefly lag, but all converge to the same state
  • Replayability: Any database can be reconstructed by replaying messages from the log
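
To make the replay property concrete, here is a toy materialized view in Go that applies log events in order and idempotently. The Event and View types are illustrative stand-ins, not Armada's actual schema.

```go
package main

import "fmt"

// Event is a state transition as stored in the log (illustrative fields).
type Event struct {
	Sequence int64  // position in the log
	JobID    string
	State    string // e.g. "queued", "running", "succeeded"
}

// View is a materialized view: local state derived purely from the log.
type View struct {
	lastApplied int64             // checkpoint: highest sequence applied
	jobStates   map[string]string // jobID -> latest state
}

func NewView() *View {
	return &View{lastApplied: -1, jobStates: make(map[string]string)}
}

// Apply is idempotent: events at or below the checkpoint are no-ops,
// so replaying the log from any earlier point is always safe.
func (v *View) Apply(e Event) {
	if e.Sequence <= v.lastApplied {
		return // already applied
	}
	v.jobStates[e.JobID] = e.State
	v.lastApplied = e.Sequence
}

func main() {
	log := []Event{
		{0, "job-a", "queued"}, {1, "job-a", "running"}, {2, "job-a", "succeeded"},
	}
	v := NewView()
	for _, e := range log {
		v.Apply(e)
	}
	// Replaying the entire log changes nothing: the view is deterministic.
	for _, e := range log {
		v.Apply(e)
	}
	fmt.Println(v.jobStates["job-a"]) // succeeded
}
```

Because Apply ignores already-seen sequence numbers, replaying the whole log reconstructs an identical view, which is what makes the materialized views disposable.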

Implementation

Armada uses Apache Pulsar as the log implementation. This approach provides:

  • Resilience to load bursts: The log buffers state transitions, isolating downstream components
  • Simplicity and extensibility: New views can be added by subscribing to the log
  • Consistency guarantees: Databases eventually converge to the same state
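
As a rough illustration of what a log subscriber looks like, here is a minimal consumer built with the Go Pulsar client (github.com/apache/pulsar-client-go). The broker URL, topic, and subscription names are placeholders, not Armada's actual configuration.

```go
package main

import (
	"context"
	"log"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{
		URL: "pulsar://localhost:6650", // placeholder broker address
	})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// A Failover subscription has a single active consumer at a time,
	// so messages are processed in the order the log stores them.
	consumer, err := client.Subscribe(pulsar.ConsumerOptions{
		Topic:            "events", // placeholder, not Armada's real topic
		SubscriptionName: "my-materialized-view",
		Type:             pulsar.Failover,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	for {
		msg, err := consumer.Receive(context.Background())
		if err != nil {
			log.Fatal(err)
		}
		// Apply the state transition to the local view, then acknowledge
		// so the subscription cursor advances past this message.
		log.Printf("state transition: %s", string(msg.Payload()))
		consumer.Ack(msg)
	}
}
```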

Control Plane Components

The Armada control plane consists of several subsystems:

Submit API

  • Accepts job submissions from clients
  • Handles authentication and authorization
  • Validates job specifications
  • Publishes job submission events to the log
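
A condensed sketch of this flow in Go: validate the request, then publish an event rather than acting on it directly. All types here are illustrative; Armada's real API is defined in gRPC/protobuf.

```go
package submit

import "errors"

// JobSubmitRequest and LogPublisher are illustrative stand-ins for
// Armada's real protobuf types and log producer.
type JobSubmitRequest struct {
	Queue  string
	JobSet string
	Spec   []byte // serialized pod spec
}

type LogPublisher interface {
	Publish(event map[string]string) error
}

// SubmitJob checks authorization, validates, and records the submission
// in the log. Nothing is scheduled here: the log is the source of truth,
// and the scheduler reacts to the published event.
func SubmitJob(req JobSubmitRequest, authorized bool, pub LogPublisher) error {
	if !authorized {
		return errors.New("not authorized to submit to queue " + req.Queue)
	}
	if req.Queue == "" || req.JobSet == "" || len(req.Spec) == 0 {
		return errors.New("invalid job specification")
	}
	return pub.Publish(map[string]string{
		"type":   "JobSubmitted",
		"queue":  req.Queue,
		"jobSet": req.JobSet,
	})
}
```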

Streams API

  • Enables clients to subscribe to job state updates via event streams
  • Provides a single connection per job set for scalable event delivery
  • Isolates clients from internal system messages
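
From a client's perspective, this boils down to holding one long-lived stream per job set and reacting to updates as they arrive. The interface below is an illustrative stand-in for the real gRPC stream.

```go
package watch

import "fmt"

// JobSetEvent and EventStream are illustrative; Armada's actual stream
// carries protobuf event messages over gRPC.
type JobSetEvent struct {
	JobID string
	State string // e.g. "pending", "running", "succeeded"
}

type EventStream interface {
	Recv() (JobSetEvent, error) // blocks until the next event arrives
}

// WatchJobSet consumes a single stream for one job set; the server
// filters out internal messages, so only job state updates arrive.
func WatchJobSet(stream EventStream) error {
	for {
		ev, err := stream.Recv()
		if err != nil {
			return err // e.g. io.EOF once the stream is closed
		}
		fmt.Printf("job %s is now %s\n", ev.JobID, ev.State)
	}
}
```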

Scheduler

  • Maintains a global view of system state
  • Makes preemption and scheduling decisions
  • Balances fairness, throughput, and timeliness
  • Publishes scheduling decisions to the log

The scheduler runs periodically; each round considers the following (a simplified round is sketched after this list):

  • Available resources across all worker clusters
  • Queue priorities and fair share
  • Job priorities and urgency
  • Gang scheduling constraints
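
The sketch below compresses one scheduling round into a few lines: repeatedly take the head job of the least-served queue (a crude stand-in for fair share), place it on any cluster with room, and publish the decision to the log. All types are illustrative; real fairness, preemption, and gang constraints are far more involved.

```go
package scheduler

// Illustrative types; Armada's real scheduler state is much richer.
type Job struct {
	ID  string
	CPU int
}

type Snapshot struct {
	FreeCPU map[string]int   // cluster -> free CPU cores
	Queues  map[string][]Job // queue -> pending jobs, in priority order
	Usage   map[string]int   // queue -> CPU cores currently in use
}

type Decision struct {
	JobID, Cluster string
}

func runRound(s Snapshot, publish func(Decision)) {
	for {
		queue := leastServedQueue(s)
		if queue == "" {
			return // no pending jobs remain
		}
		job := s.Queues[queue][0]
		cluster := firstFit(s.FreeCPU, job.CPU)
		if cluster == "" {
			return // simplification: stop when the head job does not fit
		}
		s.Queues[queue] = s.Queues[queue][1:]
		s.FreeCPU[cluster] -= job.CPU
		s.Usage[queue] += job.CPU
		// A decision is only an intent until the log stores it durably;
		// executors act on it once it reaches their database.
		publish(Decision{JobID: job.ID, Cluster: cluster})
	}
}

// leastServedQueue picks the non-empty queue with the lowest usage.
func leastServedQueue(s Snapshot) string {
	best, bestUsage := "", int(^uint(0)>>1)
	for q, jobs := range s.Queues {
		if len(jobs) > 0 && s.Usage[q] < bestUsage {
			best, bestUsage = q, s.Usage[q]
		}
	}
	return best
}

// firstFit returns any cluster with enough free CPU, or "".
func firstFit(free map[string]int, cpu int) string {
	for cluster, f := range free {
		if f >= cpu {
			return cluster
		}
	}
	return ""
}
```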

Executors

Each executor is responsible for one Kubernetes worker cluster:

  • Queries the scheduler for jobs to run
  • Creates Kubernetes resources (pods, services, etc.) via the Kubernetes API
  • Monitors job execution
  • Reports job completion and resource availability back to the scheduler
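
In outline, an executor's loop looks like the following. The scheduler and Kubernetes interfaces are illustrative stand-ins; the real executor talks to the scheduler over gRPC and uses the Kubernetes client to create resources.

```go
package executor

import (
	"fmt"
	"time"
)

// Lease is a job the scheduler has assigned to this executor's cluster.
type Lease struct {
	JobID   string
	PodSpec []byte
}

// SchedulerClient is an illustrative stand-in for the executor's API
// to the control plane.
type SchedulerClient interface {
	ReportClusterState(freeCPU int) ([]Lease, error) // returns new leases
	ReportDone(jobID string) error
}

// KubeAPI stands in for the Kubernetes API calls the executor makes
// (creating pods, services, etc. and watching their status).
type KubeAPI interface {
	CreatePod(spec []byte) error
	CompletedJobs() []string
}

func executorLoop(sched SchedulerClient, kube KubeAPI, freeCPU func() int) {
	for {
		// 1. Report available resources and ask for work.
		leases, err := sched.ReportClusterState(freeCPU())
		if err != nil {
			fmt.Println("scheduler unreachable, retrying:", err)
			time.Sleep(10 * time.Second)
			continue
		}
		// 2. Materialize each lease as Kubernetes resources.
		for _, l := range leases {
			if err := kube.CreatePod(l.PodSpec); err != nil {
				fmt.Println("failed to create pod for", l.JobID, err)
			}
		}
		// 3. Report completions back to the control plane.
		for _, id := range kube.CompletedJobs() {
			_ = sched.ReportDone(id)
		}
		time.Sleep(10 * time.Second) // poll interval, illustrative
	}
}
```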

Lookout (Web UI)

  • Provides a web interface for viewing job state
  • Maintains its own materialized view by reading from the log
  • Displays jobs, queues, and system metrics

Job Lifecycle Flow

Here's how a job flows through the system:

  1. Submission: Client submits job to Submit API → API validates and publishes to log
  2. Queued: Scheduler receives job from log and stores it in queue
  3. Scheduled: Scheduler assigns job to a cluster/node and publishes scheduling decision to log
  4. Leased: Log processor writes scheduling decision to executor database
  5. Pending: Executor reads from database, creates Kubernetes resources
  6. Running: Kubernetes starts containers
  7. Completed: Executor reports completion → Scheduler publishes completion event to log
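
Read as a state machine, this happy path is linear. The sketch below encodes it; the state names mirror the list above rather than Armada's internal enum, and failure and cancellation paths are omitted.

```go
package lifecycle

import "fmt"

type JobState string

// States follow the lifecycle list above; Armada's internal states differ.
const (
	Submitted JobState = "submitted"
	Queued    JobState = "queued"
	Scheduled JobState = "scheduled"
	Leased    JobState = "leased"
	Pending   JobState = "pending"
	Running   JobState = "running"
	Completed JobState = "completed"
)

// next encodes the happy path only.
var next = map[JobState]JobState{
	Submitted: Queued,
	Queued:    Scheduled,
	Scheduled: Leased,
	Leased:    Pending,
	Pending:   Running,
	Running:   Completed,
}

// Advance returns the next lifecycle state, or an error from a terminal
// or unknown state. Each real transition corresponds to a log event.
func Advance(s JobState) (JobState, error) {
	n, ok := next[s]
	if !ok {
		return s, fmt.Errorf("no transition from %q", s)
	}
	return n, nil
}
```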

Consistency and Reliability

Armada ensures consistency across components using ordered, idempotent updates:

  • State transitions are published to the log in order
  • Each component processes messages independently and idempotently
  • Components track the most recent message processed
  • After failures, components can recover by replaying from their last checkpoint

This approach provides:

  • Eventual consistency: All databases eventually reflect the same state
  • Fault tolerance: Components can recover independently
  • Scalability: No distributed transactions needed
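
Crash recovery follows directly from these properties. The sketch below shows the pattern under illustrative interfaces: load the checkpoint, replay from the next position, and rely on idempotent application instead of distributed transactions.

```go
package recovery

// Log and Store are illustrative. In Armada, the log is a Pulsar topic
// and the stores are the databases described in the next section.
type Message struct {
	Seq     int64
	Payload []byte
}

type Log interface {
	ReadFrom(seq int64) []Message // messages at or after seq, in order
}

type Store interface {
	Checkpoint() int64 // highest sequence durably applied
	Apply(m Message)   // must be idempotent: reapplying is harmless
}

// RecoverAndReplay brings a component back up to date after a failure.
// Because messages are ordered and Apply is idempotent, this is safe
// even if the previous run crashed between applying a message and
// saving the checkpoint: the message is simply applied again.
func RecoverAndReplay(log Log, store Store) {
	for _, m := range log.ReadFrom(store.Checkpoint() + 1) {
		store.Apply(m)
	}
}
```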

Storage and State

Armada uses multiple databases for different purposes:

  • Scheduler database: Stores jobs to be scheduled (PostgreSQL)
  • Lookout database: Stores jobs for UI display (PostgreSQL)
  • Executor databases: Interface between scheduler and executors

All databases are materialized views derived from the log, ensuring they can be reconstructed if needed.
