Topic · 06
Reinforcement Learning
Agents, rewards, policies. Where control theory and ML meet.
-
Apr 20, 2026 · RL · The most pragmatic post in the field guide: the hyperparameters that almost always work, the order to debug a policy that won't learn, and the silent failures that cost days be...
-
Apr 20, 2026 · RL · Algorithms are maybe 30% of a successful RL robotics project. The other 70% is engineering: reward design, observation and action spaces, sim-to-real, learning from logged data,...
-
Apr 20, 2026 · RL · If you're a control engineer, this is the RL section written for you. Model-based RL learns dynamics and plans against them, 100× more sample-efficient than model-free on real ...
-
Apr 20, 2026 · RL · Every RL algorithm is secretly two algorithms sharing a body: one that exploits what it knows, one that explores for what it doesn't. This post covers the exploration methods t...
-
Apr 20, 2026 · RL · From REINFORCE to PPO in one post: the policy gradient theorem, why its variance is ruinous by default, and the three tricks (baselines, critics, GAE) that make it work in prac...
-
Apr 20, 2026 · RL · The Bellman equations say what a value function satisfies; temporal-difference learning says how to estimate it from samples. One update rule, two algorithms (SARSA and Q-learni...
-
Apr 20, 2026 · RL · Every RL algorithm starts from the same place: write down the MDP. This post covers the formalism a control engineer actually needs, including state, action, reward, the discount factor,...
-
Apr 20, 2026 · RL · A practitioner-oriented field guide to reinforcement learning for control and robotics. Start here: the RL family tree, a five-question algorithm selector, and the rules of thum...