The State-Space Notebook Vol. I · Linear Systems · an interactive field guide

On Controllability, Observability,
and the Art of Estimating What You Cannot See.

Five short chapters on the structural properties of linear dynamical systems — from the algebraic tests that determine whether a state can be reached or inferred, through to the stochastic machinery that Rudolf Kálmán gave us in 1960. Each chapter is a working simulator. Turn the knobs. Watch the trajectories bend.

Author Majid Mazouchi
Form 5 interactive chapters
Math Live-rendered LaTeX
Notation $\dot{x}=Ax+Bu,\ y=Cx$
Prerequisite Linear algebra, some ODEs
Chapter One

Controllability — can you get there from here?

A system is controllable when any initial state can be steered to any desired final state in finite time using an admissible input. If not, some directions in state-space are forever beyond the reach of your actuators.

Consider the linear time-invariant system $\dot{x} = A x + B u$, with $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$. The question "where can we drive $x$?" turns out to have a purely algebraic answer. The controllability matrix collects the directions we can excite, directly and through repeated application of the dynamics:

$$ \mathcal{C} = \begin{bmatrix} B & AB & A^2B & \cdots & A^{n-1}B \end{bmatrix} \in \mathbb{R}^{n \times nm}. $$

The pair $(A, B)$ is controllable if and only if $\mathrm{rank}(\mathcal{C}) = n$. The Cayley–Hamilton theorem guarantees we need no more than $n-1$ powers of $A$ — anything further is linearly dependent on what we already have.
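The rank test translates almost line for line into NumPy. The following is a minimal sketch, not a library routine: the helper names `controllability_matrix` and `is_controllable` are illustrative, and the double-integrator example is an assumed plant chosen only because its answer is easy to check by hand.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])   # next block is A times the previous one
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank test: (A, B) is controllable iff rank(C) == n."""
    n = A.shape[0]
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == n

# Assumed example: a double integrator driven through its second state.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
ctrb = controllability_matrix(A, B)     # columns are B and AB
print(ctrb, is_controllable(A, B))
```

Numerical rank depends on a tolerance, so for badly scaled systems the result of `matrix_rank` should be read as an indication rather than a proof.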

Controllability is a property of the pair $(A, B)$ alone. It does not depend on any particular input, the cost of control, or how fast we need to get there — only whether, in principle, the reachable subspace spans the entire state-space.

An equivalent view — the PBH test

The Popov–Belevitch–Hautus test reformulates controllability mode-by-mode: $(A,B)$ is controllable iff $\mathrm{rank}\!\begin{bmatrix}\lambda I - A & B\end{bmatrix} = n$ for every eigenvalue $\lambda$ of $A$. A mode is uncontrollable precisely when the left eigenvector $w^\top$ satisfies $w^\top B = 0$ — the input channel cannot "see" that direction.
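A hedged sketch of the PBH test in NumPy follows. The function name `pbh_uncontrollable_modes` and the tolerance are assumptions made for illustration; because eigenvalues are computed numerically, the rank decision is only as reliable as the chosen tolerance.

```python
import numpy as np

def pbh_uncontrollable_modes(A, B, tol=1e-9):
    """Return the eigenvalues of A at which [lambda*I - A, B] loses rank."""
    n = A.shape[0]
    bad = []
    for lam in np.linalg.eigvals(A):
        M = np.hstack([lam * np.eye(n) - A, B])
        if np.linalg.matrix_rank(M, tol=tol) < n:
            bad.append(lam)
    return bad

# Assumed example: decoupled modes at -1 and -2, input reaches only the first.
A = np.diag([-1.0, -2.0])
B = np.array([[1.0],
              [0.0]])
bad_modes = pbh_uncontrollable_modes(A, B)
print(bad_modes)   # the mode at -2 is the one the input cannot reach
```

Unlike the rank of $\mathcal{C}$, this version reports which modes are the problem, which is what makes the PBH form useful in practice.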

Planar system · $n=2$

━━ span of $B$    ━━ span of $AB$    shaded reachable set

Readouts: $A$, $B$, controllability matrix $\mathcal{C}$, rank $\mathcal{C}$, det $\mathcal{C}$, status

Try the "Uncontrollable" preset. The matrix $A$ is diagonal, meaning the two state variables evolve independently, and $B$ only pokes at the first one. The second mode is structurally deaf to every input you could apply — the controllability matrix drops rank, and the reachable set collapses to a one-dimensional line.
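To see the rank drop on paper, take a generic diagonal $A$ and an input that enters only the first state. The symbols $a_1$, $a_2$, $b_1$ are placeholders, since the preset's actual numbers are not listed here; assume $b_1 \neq 0$:

$$ A = \begin{bmatrix} a_1 & 0 \\ 0 & a_2 \end{bmatrix},\quad B = \begin{bmatrix} b_1 \\ 0 \end{bmatrix} \quad\Longrightarrow\quad \mathcal{C} = \begin{bmatrix} B & AB \end{bmatrix} = \begin{bmatrix} b_1 & a_1 b_1 \\ 0 & 0 \end{bmatrix},\quad \det \mathcal{C} = 0,\quad \mathrm{rank}(\mathcal{C}) = 1. $$

Both columns point along the $x_1$ axis, so the reachable set is that axis and nothing else.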

Stabilizability — a weaker cousin

If every unstable mode is controllable, we can still stabilize the system even if some stable modes are uncontrollable. That looser condition is called stabilizability. In practice this is often all we need: we don't mind that some benign, already-stable dynamics are out of our reach.
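Stabilizability can be checked by applying the PBH rank condition only to the eigenvalues that are not strictly stable. The sketch below assumes a continuous-time system (unstable means $\mathrm{Re}\,\lambda \ge 0$); the function name and tolerance are illustrative choices.

```python
import numpy as np

def is_stabilizable(A, B, tol=1e-9):
    """PBH rank test restricted to eigenvalues with Re(lambda) >= 0."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= 0:                       # only unstable or marginal modes matter
            M = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(M, tol=tol) < n:
                return False
    return True

B = np.array([[1.0],
              [0.0]])                           # input reaches the first state only
ok = is_stabilizable(np.diag([1.0, -2.0]), B)   # unstable mode is reachable
bad = is_stabilizable(np.diag([-2.0, 1.0]), B)  # unstable mode is out of reach
print(ok, bad)
```

Neither example is controllable, yet the first can still be stabilized because the only mode the input misses already decays on its own.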

REF · Kalman, R.E. (1960). "On the general theory of control systems." IFAC World Congress. — the paper that introduced controllability as a fundamental structural property.
Chapter Two

Observability — what can you infer from what you measure?

A system is observable when the entire internal state can be reconstructed from a finite record of inputs and outputs. Otherwise, distinct initial conditions produce identical measurements, and no amount of cleverness will tell them apart.

Add an output equation: $y = C x$, with $y \in \mathbb{R}^p$. Given the input $u(t)$ and the measurement $y(t)$ over a window, can we recover the state $x(t)$? The question is, remarkably, the precise mathematical dual of controllability, a correspondence first noticed by Kalman and one of the most elegant in engineering.

$$ \mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix} \in \mathbb{R}^{np \times n}. $$

The pair $(A, C)$ is observable iff $\mathrm{rank}(\mathcal{O}) = n$. The null space of $\mathcal{O}$, the unobservable subspace, is the set of initial conditions that produce zero output for all time when the input is zero. Two initial states that differ only by a vector in the unobservable subspace are indistinguishable from any record of $u$ and $y$, however long.

Duality: $(A, C)$ is observable iff $(A^\top, C^\top)$ is controllable. Every theorem about one concept has an exact mirror image for the other. Design a controller, you've implicitly designed an observer. It is a deep, useful symmetry.
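Duality is easy to confirm numerically: the observability matrix of $(A, C)$ is the transpose of the controllability matrix of $(A^\top, C^\top)$. The sketch below uses illustrative helper names and borrows the second-order oscillator from Chapter Four as the test plant.

```python
import numpy as np

def controllability_matrix(A, B):
    """[B, AB, ..., A^(n-1) B] stacked column-wise."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def observability_matrix(A, C):
    """[C; CA; ...; CA^(n-1)] stacked row-wise."""
    rows = [C]
    for _ in range(A.shape[0] - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

A = np.array([[0.0, 1.0],
              [-2.0, -0.4]])
C = np.array([[1.0, 0.0]])

O = observability_matrix(A, C)
O_via_duality = controllability_matrix(A.T, C.T).T
print(np.linalg.matrix_rank(O), np.allclose(O, O_via_duality))
```

This is also why pole-placement routines written for state feedback can be reused for observer design by passing in the transposed pair.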

Planar system · two trajectories, one measurement

━━ trajectory from $x_0^{(1)}$    ━━ trajectory from $x_0^{(2)} = x_0^{(1)} + v_\mathrm{unobs}$    ┅┅ $y(t)$

Readouts: $A$, $C$, observability matrix $\mathcal{O}$, rank $\mathcal{O}$, det $\mathcal{O}$, status

Try "Decoupled · unobservable." $A$ is diagonal; $C$ only sees the first state. Two initial conditions that differ only in $x_2$ produce identical outputs forever: the red and teal trajectories follow different paths through state-space yet map to the same yellow measurement curve. No algorithm you can invent will separate them.
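The same experiment can be reproduced offline. This is a sketch under assumed numbers: the diagonal entries of $A$ and the two initial conditions are hypothetical, not the preset's values, and the zero-input response is evaluated with the matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

A = np.diag([-1.0, -0.5])            # assumed decoupled dynamics
C = np.array([[1.0, 0.0]])           # the sensor sees x1 only

x0_a = np.array([1.0, 0.5])
x0_b = x0_a + np.array([0.0, -2.0])  # differs only along the unobservable x2 axis

ts = np.linspace(0.0, 5.0, 51)

def output(x0):
    """Zero-input response y(t) = C exp(A t) x0 sampled on ts."""
    return np.array([(C @ expm(A * t) @ x0)[0] for t in ts])

y_a, y_b = output(x0_a), output(x0_b)
print(np.max(np.abs(y_a - y_b)))     # the two measurement records coincide
```

The states themselves stay apart for all time; only their image under $C$ agrees.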

The observable canonical decomposition

If the system is not fully observable, a similarity transform splits the state into observable and unobservable parts. The unobservable part has no effect on $y$; it lives in a subspace that the measurement map is structurally blind to. The observable part is what a state estimator can actually reconstruct. We will visualize this decomposition next.

Chapter Three

Partial observability & the Kalman decomposition

Real systems are rarely cleanly controllable or observable end-to-end. Kalman's structure theorem splits the state into four parts: a map of exactly what you can see, what you can touch, and what lives outside both.

For any linear system, there exists a similarity transformation $T$ such that the state decomposes into four blocks, each with a distinct relationship to the input and output. (The blocks need not be orthogonal or fully decoupled; the structure lies in which couplings are forced to zero.) This is the Kalman canonical decomposition, and it is the clearest way to see why the transfer function alone, input to output, cannot tell you everything about a system.

The four subspaces

Only the controllable & observable part appears in the transfer function $G(s) = C(sI-A)^{-1}B$. Everything else is hidden from the input–output map.

Legend & meaning

C & O — controllable and observable. Visible in transfer function. This is the minimal realization.
C & Ō — controllable but unobservable. You can drive it; you cannot see where it goes. Internal dynamics hidden from $y$.
C̄ & O — uncontrollable but observable. You see it; you cannot steer it. Typically a disturbance mode.
C̄ & Ō — neither. Completely disconnected from the input–output map. Ghost dynamics.

Canonical form

After the similarity transform:

$$\tilde{A} = \begin{bmatrix} A_{co} & 0 & * & 0 \\ * & A_{c\bar{o}} & * & * \\ 0 & 0 & A_{\bar{c}o} & 0 \\ 0 & 0 & * & A_{\bar{c}\bar{o}} \end{bmatrix},\ \tilde{B} = \begin{bmatrix} B_{co} \\ B_{c\bar{o}} \\ 0 \\ 0 \end{bmatrix},\ \tilde{C} = \begin{bmatrix} C_{co} & 0 & C_{\bar{c}o} & 0 \end{bmatrix}$$

Stabilizability & detectability

For practical control design we ask less: no uncontrollable mode may be unstable (stabilizability), and no unobservable mode may be unstable (detectability). Under these two conditions a state-feedback gain and a state estimator can be designed separately and then combined, which is the setting of the Separation Principle. Stabilizability together with detectability is the weakest pair of conditions under which a linear system admits a stabilizing output-feedback controller: the pair is both necessary and sufficient. Nothing is free, but this is as close as it gets.

A transfer function only reveals the controllable–and–observable part of a state-space model. Two very different plants — one with hidden instabilities, one without — can have the identical $G(s)$. This is why input-output models can be dangerously misleading for safety-critical systems.
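A small numerical illustration of that warning, under assumed numbers: two hypothetical plants share $G(s) = 1/(s+1)$, but the second carries an unstable mode at $+2$ that the input cannot reach. Comparing Markov parameters $C A^k B$ (the impulse-response coefficients) shows that the input-output behaviour is identical.

```python
import numpy as np

def markov_parameters(A, B, C, count=6):
    """Return [CB, CAB, CA^2 B, ...] for a single-input, single-output model."""
    out, Ak_B = [], B.copy()
    for _ in range(count):
        out.append((C @ Ak_B).item())
        Ak_B = A @ Ak_B
    return np.array(out)

# Plant 1: first order, minimal.
A1, B1, C1 = np.array([[-1.0]]), np.array([[1.0]]), np.array([[1.0]])

# Plant 2: same reachable mode plus an unstable mode at +2 with no input path.
A2 = np.diag([-1.0, 2.0])
B2 = np.array([[1.0], [0.0]])
C2 = np.array([[1.0, 1.0]])

h1 = markov_parameters(A1, B1, C1)
h2 = markov_parameters(A2, B2, C2)
print(h1, h2, np.linalg.eigvals(A2))
```

Any nonzero initial condition on the second state of plant 2 grows like $e^{2t}$, and no input can counteract it, even though an identification experiment driven from rest would never reveal that state.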
Chapter Four

The Luenberger observer — or, how to guess the state correctly

Given a model of the plant and a measurement of its output, a Luenberger observer synthesizes a clean estimate of the full state. The trick is to correct the model in real time by the measurement residual, weighted by a carefully chosen gain.

Suppose we want $\hat{x} \to x$. The most natural candidate is to integrate a copy of the plant: $\dot{\hat{x}} = A\hat{x} + Bu$. This is called the open-loop observer, and it works exactly when the model is perfect and the initial condition is known, which is to say, never. Any initial mismatch $x - \hat{x}$ evolves under the plant's own dynamics $A$: it decays no faster than the plant itself does, and for unstable systems the estimation error diverges.

Luenberger's 1964 insight was to feed back the output residual $y - C\hat{x}$ through an observer gain $L$:

$$ \dot{\hat{x}} = A\hat{x} + Bu + L\,(y - C\hat{x}). $$

Defining the estimation error $e = x - \hat{x}$, a quick subtraction gives the error dynamics:

$$ \dot{e} = (A - LC)\,e. $$
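For completeness, the subtraction spelled out, using $y = Cx$ and assuming the observer runs on the exact plant model:

$$ \begin{aligned} \dot{e} &= \dot{x} - \dot{\hat{x}} \\ &= (Ax + Bu) - \big(A\hat{x} + Bu + L\,(Cx - C\hat{x})\big) \\ &= A\,(x - \hat{x}) - LC\,(x - \hat{x}) \\ &= (A - LC)\,e. \end{aligned} $$

The input term $Bu$ cancels exactly, which is why the error dynamics do not depend on what the controller is doing.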

The error is an autonomous linear system with dynamics $A - LC$. If $(A, C)$ is observable, we can place the eigenvalues of $A - LC$ anywhere we like — typically a few times faster than the closed-loop poles of the controller, so estimation converges before feedback "notices." Pushing the observer poles deeper into the left half-plane gives faster convergence at the cost of sensitivity to measurement noise. This tradeoff is the beating heart of every observer design.

The Separation Principle: for a linear system, you can design a stabilizing state-feedback law $u = -K\hat{x}$ as if you had access to $x$, and you can design the observer gain $L$ independently. The closed-loop poles are the union of the controller poles (eigenvalues of $A-BK$) and the observer poles (eigenvalues of $A-LC$). They do not interact.
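The pole-union claim can be checked numerically. The sketch below uses the oscillator plant from this chapter's simulator and the observer poles $-3$ and $-4$; the controller poles $-1.5$ and $-2$ are an arbitrary assumption, and `scipy.signal.place_poles` is used for both gains.

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0],
              [-2.0, -0.4]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

controller_poles = [-1.5, -2.0]      # assumed, for illustration only
observer_poles = [-3.0, -4.0]

K = place_poles(A, B, controller_poles).gain_matrix        # shape (1, 2)
L = place_poles(A.T, C.T, observer_poles).gain_matrix.T    # shape (2, 1), via duality

# Closed loop in (x, x_hat) coordinates with u = -K x_hat:
#   x_dot     = A x - B K x_hat
#   x_hat_dot = L C x + (A - B K - L C) x_hat
A_cl = np.block([[A, -B @ K],
                 [L @ C, A - B @ K - L @ C]])

closed_loop = np.sort(np.linalg.eigvals(A_cl).real)
expected = np.sort(np.array(controller_poles + observer_poles))
print(closed_loop, expected)
```

Changing either pole set moves only its own two eigenvalues of `A_cl`, which is the practical content of "they do not interact."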

Plant state vs. observer estimate · time response

━━ $x_1$ (true)    ━━ $\hat{x}_1$ (estimate)    ┅┅ $x_2$ (true)    ┅┅ $\hat{x}_2$ (estimate)

━━ $\|e(t)\|$ — estimation error norm

Plant (2nd-order oscillator)

A = [[0, 1], [-2, -0.4]]
B = [[0], [1]],   C = [1, 0]

Observer pole locations

pole 1, $\lambda_1$: -3.0
pole 2, $\lambda_2$: -4.0
initial estimate error: 2.0
Readouts: observer gain $L$, 2% settling time, ratio $\omega_\mathrm{obs}/\omega_\mathrm{plant}$

Push the poles farther left (more negative) to make the error decay faster, and notice how the norm curve plunges to zero. Now imagine the output contains noise. Very fast observer poles mean large entries in $L$, which amplify any measurement noise into the estimate. There is no free lunch; fast estimators are loud estimators. This is exactly the tension that the Kalman filter resolves optimally.

Ackermann's formula, in one line

If the desired observer characteristic polynomial is $\alpha(s) = \prod_i (s - \lambda_i^{\mathrm{des}})$, then for a single-output system $L = \alpha(A)\, \mathcal{O}^{-1} e_n$, where $e_n$ is the last standard basis vector of $\mathbb{R}^n$. It is the dual of the better-known state-feedback Ackermann formula. In practice we use place() in MATLAB or scipy.signal.place_poles applied to the dual pair $(A^\top, C^\top)$, which handles numerical conditioning better.
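As a sketch, here is Ackermann's formula applied to the oscillator plant and the default observer poles $-3$ and $-4$ from the simulator above, cross-checked against `scipy.signal.place_poles` on the dual pair. The function name `ackermann_observer_gain` is illustrative, and the formula as written assumes a single output.

```python
import numpy as np
from scipy.signal import place_poles

def ackermann_observer_gain(A, C, poles):
    """L = alpha(A) O^{-1} e_n for a single-output pair (A, C)."""
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    coeffs = np.poly(poles)                  # alpha(s) coefficients, highest power first
    alpha_A = np.zeros_like(A)
    for c in coeffs:                         # Horner evaluation of alpha at the matrix A
        alpha_A = alpha_A @ A + c * np.eye(n)
    e_n = np.zeros((n, 1))
    e_n[-1, 0] = 1.0
    return alpha_A @ np.linalg.solve(O, e_n)

A = np.array([[0.0, 1.0],
              [-2.0, -0.4]])
C = np.array([[1.0, 0.0]])
poles = [-3.0, -4.0]

L = ackermann_observer_gain(A, C, poles)                 # L ≈ [[6.6], [7.36]]
L_place = place_poles(A.T, C.T, poles).gain_matrix.T     # same gain via duality
print(L.ravel(), np.linalg.eigvals(A - L @ C))
```

For this plant $\mathcal{O}$ is the identity, so the gain is simply the last column of $\alpha(A) = A^2 + 7A + 12I$. For higher-order or poorly conditioned systems the explicit inverse of $\mathcal{O}$ is what makes Ackermann's formula fragile.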

Chapter Five

The Kalman filter — optimality under Gaussian noise

When the process and measurement noises have known statistics (zero-mean, white, Gaussian, with known covariances), there is a unique observer gain $L$ that minimizes the steady-state error covariance. It is not a heuristic. It is the answer.

Luenberger's observer leaves open how to choose $L$. Make it fast and noise appears on the estimate. Make it slow and the estimate lags. The Kalman filter is the observer design that trades these two evils off in a provably optimal way, given statistical models of the disturbances.

Consider the discrete-time system with process noise $w_k \sim \mathcal{N}(0, Q)$ and measurement noise $v_k \sim \mathcal{N}(0, R)$:

$$ x_{k+1} = F x_k + G u_k + w_k, \quad y_k = H x_k + v_k. $$

The filter alternates a prediction (push the estimate through the model) and an update (correct using the measurement). The covariance $P_k$ tracks our uncertainty about $\hat{x}_k$:

Prediction step

$$\hat{x}_{k+1|k} = F \hat{x}_{k|k} + G u_k, \qquad P_{k+1|k} = F P_{k|k} F^\top + Q.$$

Update step

$$K_{k+1} = P_{k+1|k} H^\top \big(H P_{k+1|k} H^\top + R\big)^{-1},$$

$$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}\big(y_{k+1} - H\hat{x}_{k+1|k}\big),$$

$$P_{k+1|k+1} = (I - K_{k+1} H)\, P_{k+1|k}.$$
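The two steps map directly onto a pair of short functions. What follows is a minimal sketch, not a production filter: it runs them on the constant-velocity model used by this chapter's simulator, and it assumes a white-acceleration form for $Q$ (the page does not specify how $\sigma_\mathrm{process}$ enters $Q$), a loose initial covariance, and a fixed random seed.

```python
import numpy as np

def kf_predict(x, P, F, Q, G=None, u=None):
    """Prediction: push the estimate and its covariance through the model."""
    x_pred = F @ x if G is None else F @ x + G @ u
    return x_pred, F @ P @ F.T + Q

def kf_update(x_pred, P_pred, y, H, R):
    """Update: correct the prediction with the measurement residual."""
    S = H @ P_pred @ H.T + R                     # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new, K

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
sigma_process, sigma_meas = 0.20, 1.00
g = np.array([0.5 * dt**2, dt])                  # assumed: noise enters as acceleration
Q = sigma_process**2 * np.outer(g, g)
R = np.array([[sigma_meas**2]])

rng = np.random.default_rng(0)
x_true = np.array([0.0, 1.0])                    # assumed true initial state
x_hat, P = np.zeros(2), 10.0 * np.eye(2)         # assumed vague prior
est_err, meas_err = [], []
for _ in range(300):
    x_true = F @ x_true + g * (sigma_process * rng.standard_normal())
    y = H @ x_true + sigma_meas * rng.standard_normal(1)
    x_hat, P = kf_predict(x_hat, P, F, Q)
    x_hat, P, K = kf_update(x_hat, P, y, H, R)
    est_err.append(x_hat[0] - x_true[0])
    meas_err.append(y[0] - x_true[0])

rms_est = np.sqrt(np.mean(np.square(est_err)))
rms_meas = np.sqrt(np.mean(np.square(meas_err)))
print(rms_est, rms_meas, K.ravel())
```

The covariance update uses the short form $(I - KH)P$ to match the equations above; implementations that care about round-off often prefer the Joseph form, which keeps $P$ symmetric and positive semidefinite.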

The Kalman gain $K_{k+1}$ is the optimal weighting between trusting the model and trusting the measurement. When $R \ll H P H^\top$ (measurement is precise), $K \to P H^\top (H P H^\top)^{-1}$, so $HK \to I$ and the estimated output snaps to the measurement (for a square, invertible $H$ this is $K \to H^{-1}$). When $R \gg H P H^\top$ (measurement is noisy), $K \to 0$ and the filter trusts the model. The filter performs this tradeoff automatically, per sample, per state.
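The scalar case makes the two limits concrete. Assume a single state measured directly ($H = 1$), with predicted variance $P$ and measurement variance $R$:

$$ K = \frac{P}{P + R}, \qquad \hat{x}^{+} = (1 - K)\,\hat{x}^{-} + K\,y, \qquad P^{+} = (1 - K)\,P = \frac{P R}{P + R}. $$

As $R \to 0$ the gain tends to $1$ and the estimate becomes the measurement; as $R \to \infty$ the gain tends to $0$ and the prediction is left untouched. In between, the update is a variance-weighted average, and the posterior variance $P^{+}$ is smaller than both $P$ and $R$.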

If the system is linear, the noises are Gaussian and independent, and the models $(F,G,H,Q,R)$ are known, the Kalman filter is the minimum mean-square error estimator — over all estimators, not just linear ones. Its elegance is that this sweeping optimality result is implemented by a handful of matrix operations per sample.

Tracking a noisy 1-D object · position vs. time

━━ $x_1$ (true position)    • $y_k$ (noisy measurement)    ━━ $\hat{x}_1$ (KF estimate)    ░░ $\pm 2\sigma$ confidence

━━ $K_{k,1}$ (position-gain)    ━━ $K_{k,2}$ (velocity-gain) — converging toward steady-state values.

Model · constant velocity

x = [pos, vel]ᵀ
F = [[1, Δt], [0, 1]]
H = [1, 0],   Δt = 0.1 s

Process noise · $q$

$\sigma_\mathrm{process}$ (how much the model drifts): 0.20

Measurement noise · $r$

$\sigma_\mathrm{meas}$ (sensor stdev): 1.00

Readouts: Kalman gain $K$, RMS estimation error, RMS measurement error

Notice: The filter's RMS error is typically a fraction of the raw measurement RMS — this is the filter's "free" information gain from fusing measurements over time through the model.

The steady-state / Riccati view

As $k \to \infty$, the predicted covariance $P_{k+1|k}$ converges (for detectable $(F,H)$ and stabilizable $(F, Q^{1/2})$) to $P_\infty$, the unique positive-semidefinite solution of the discrete algebraic Riccati equation:

$$ P_\infty = F P_\infty F^\top + Q - F P_\infty H^\top (H P_\infty H^\top + R)^{-1} H P_\infty F^\top. $$

The Kalman gain converges to $K_\infty = P_\infty H^\top (H P_\infty H^\top + R)^{-1}$, and the filter becomes — structurally — a Luenberger observer with an optimally chosen $L = F K_\infty$. In practice, many production filters run at steady-state gains computed offline: lower CPU, identical asymptotic performance, much simpler ASIL-relevant code.
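An offline steady-state gain can be sketched with `scipy.linalg.solve_discrete_are`, which solves the filtering Riccati equation when it is handed the dual pair $(F^\top, H^\top)$. The constant-velocity model and noise levels are the simulator defaults; the white-acceleration structure of $Q$ is an assumption, as the page does not state how $q$ enters.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
g = np.array([[0.5 * dt**2], [dt]])          # assumed noise input direction
Q = 0.20**2 * (g @ g.T)
R = np.array([[1.00**2]])

# Filtering DARE: pass the transposed pair to the control-form solver.
P_inf = solve_discrete_are(F.T, H.T, Q, R)

S = H @ P_inf @ H.T + R
K_inf = P_inf @ H.T @ np.linalg.inv(S)       # steady-state Kalman gain
L = F @ K_inf                                # equivalent predictor-form observer gain

# Sanity checks: P_inf satisfies the DARE, and F - L H is Schur stable.
residual = F @ P_inf @ F.T + Q - F @ K_inf @ H @ P_inf @ F.T - P_inf
spectral_radius = max(abs(np.linalg.eigvals(F - L @ H)))
print(K_inf.ravel(), spectral_radius)
```

With `K_inf` stored as a constant, the run-time filter reduces to the state prediction and a single gain multiplication per sample, with no covariance propagation at all.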

What the filter is and is not

The Kalman filter assumes a linear model driven by white, Gaussian noise with known covariances. Real systems usually break at least one of these assumptions: model nonlinearity (handled by the EKF or UKF), non-Gaussian noise (handled by particle filters or robust estimators), and coloured or correlated noise (handled by state augmentation). But start here: the linear, Gaussian Kalman filter remains the beginning of every serious estimation architecture, and an astonishing number of production systems never need more.

REF · Kalman, R.E. (1960). "A new approach to linear filtering and prediction problems." Journal of Basic Engineering. · Anderson & Moore, Optimal Filtering (1979) · Simon, Optimal State Estimation (2006).