Imagine a small robot standing in the corner of a room. It wants to reach a charging dock at the opposite corner. Between them lies a jumble of chairs, table legs, a sleeping cat, and a rug it must not slide onto. The robot needs to choose a sequence of motions (a path) that is safe, feasible, and (ideally) short.
Formally, path planning asks: given a map of the world, a starting configuration qstart, and a goal qgoal, find a continuous sequence of configurations connecting them that avoids obstacles and respects the vehicle's motion constraints.
That word (configuration) is the first subtlety. For a point robot, configuration is just (x, y). For a car, it is (x, y, θ) because orientation matters: a car cannot slide sideways. For a robot arm, it is the vector of every joint angle. The space of all configurations is called the configuration space (or C-space), and the planning problem really lives in there, not in the physical room.
A recurring distinction: global planners look at the whole map and produce a complete path; local planners react to what is immediately around the robot. A practical autonomous system almost always uses both: a global planner chooses the overall route, a local planner adapts it in real time as the world changes or new obstacles appear.
The algorithms we study fall naturally into three great families, distinguished by how they represent the free space. Everything that follows in this guide is a variation on one of these ideas.
Discretize the world into cells or a graph, then search.
Best for: structured, known, bounded environments.
Sprinkle random samples through configuration space & connect them.
Best for: high-dimensional, complex spaces (manipulators, cars).
Compute smooth curves or time-parametrized trajectories directly.
Best for: producing executable motions respecting dynamics.
These families are not rivals; they are layers. A self-driving car, for instance, may use a graph search over a road network for routing (family ①), sample trajectories in a Frenet frame along that route (family ③), then use a reactive local planner like DWA to avoid a dog that just ran into the street (family ③ again, but local). Let us meet each family in turn.
Slice the world into a grid of square cells. Mark each cell as free or blocked. Now the robot lives on a graph: each free cell is a node, neighbors are connected by edges. Suddenly, path planning becomes a problem from a textbook: the shortest path in a graph. This is the oldest and still most widely used approach.
Dijkstra's algorithm expands outward from the start like a wavefront, always choosing next the unvisited cell with the smallest cumulative cost. It is guaranteed to find the shortest path and is beautifully simple. Its weakness is that it is uninformed: it will happily expand cells in the wrong direction because it does not know where the goal is.
A⋆ adds a single trick: a heuristic h(n) estimating the remaining distance from cell n to the goal. Instead of expanding by accumulated cost g(n) alone, it expands by f(n) = g(n) + h(n). When the heuristic is admissible (never overestimates the true remaining cost), A⋆ still finds the optimal path, but typically explores dramatically fewer cells.
Setting h(n)=0 recovers Dijkstra exactly. Setting h(n)=Euclidean distance gives classic A⋆ on a grid. Setting h(n)= something that over-estimates breaks the optimality guarantee but often speeds things up further.
N) in the number of cells, a 1 cm resolution map of a house is fine; a 1 cm map of a city is not. For long horizons, use hierarchical A⋆ or JPS (Jump Point Search). Always inflate obstacles by the robot's radius before searching, or the path will clip corners.
Classic A⋆ assumes the map is fixed. But real robots constantly discover new obstacles as they drive. Replanning from scratch with A⋆ every time a new obstacle appears is wasteful, most of the map did not change. The D⋆ family of algorithms (Stentz, 1994; Koenig & Likhachev, 2002) solves this by doing incremental search: they re-use as much of the previous search tree as possible and only recompute the parts affected by the change.
global_planner package, the classic NASA Mars rover autonomy stack, and most modern ground robots use variants of D⋆ Lite for this reason.
Suppose we replace algorithmic search with physics. Imagine the goal as a valley that pulls the robot toward itself, and every obstacle as a hill that pushes it away. The robot simply rolls downhill, following the gradient of the total potential. This idea, introduced by Oussama Khatib in 1986, is called the artificial potential field method.
The attractive term is typically a quadratic well centered on the goal. The repulsive term is zero beyond some influence radius and grows rapidly as the robot nears an obstacle. At any position, the robot computes the local gradient and takes a small step against it. There is no search: the computation per step is essentially constant, which is why this method is attractive for real-time reactive control.
What if the goal is not to reach a point, but to visit every cell? Lawn mowers, robot vacuums, pool cleaners, and minesweeping drones all need coverage paths, a path that sweeps the entire free space. The simplest approach is boustrophedon (ox-plow) decomposition: divide free space into rectangular strips and trace a back-and-forth pattern through each. More sophisticated methods use spanning trees over the grid or solve the Chinese-Postman problem on the adjacency graph.
PSO is not specifically a path planner, it is a general optimization metaheuristic, but one that is often used to search for good paths. Inspired by bird flocking, it maintains a population of candidate solutions ("particles") that move through a search space, attracted toward both their own best-known position and the swarm's best-known position.
For path planning, each particle represents a candidate path (a sequence of waypoints). The "fitness" combines path length, smoothness, and clearance from obstacles. The swarm collectively descends toward low-cost paths. PSO is stochastic, handles non-convex cost landscapes gracefully, and parallelizes well, but provides no optimality guarantees and tuning its hyperparameters is more art than science.
A grid over a 2D room has a few thousand cells. A grid over the 7-dimensional configuration space of a robotic arm would have trillions. This is the curse of dimensionality, and it is why grid search breaks down for manipulators, humanoids, and cars with orientation. In 1995, Lydia Kavraki introduced the Probabilistic Roadmap (PRM), and in 1998 Steven LaValle introduced Rapidly-Exploring Random Trees (RRT). These methods sidestep the dimensionality problem by sampling configurations randomly, checking only those for collision, and connecting them into a graph.
The magic of sampling-based planners is that they are probabilistically complete: if a path exists, the probability of finding one approaches 1 as the number of samples grows. They trade exactness for tractability, and the trade is almost always worth it.
PRM is a multi-query planner: you build a reusable map once, then query it with many different start/goal pairs. The construction is two phases:
N random configurations. Discard any that lie in obstacles. What remains are nodes of a graph.k nearest neighbors. If the straight line is collision-free, add the edge.To answer a query, connect the start and goal to the roadmap, then run A⋆ on the graph. Because the roadmap is reused, the per-query cost is small, ideal for environments where many paths must be planned in the same world (e.g. warehouse robotics, robot-arm pick-and-place).
RRT answers a single query. Starting from the start configuration, it grows a tree by repeatedly (1) sampling a random point qrand, (2) finding the nearest tree node qnear, and (3) extending from qnear toward qrand by a small step, provided the step is collision-free. The beauty of this simple rule is that the tree preferentially explores unexplored regions: hence "rapidly-exploring."
RRT is almost embarrassingly well-suited to kinodynamic planning (where the robot has velocity/curvature constraints): instead of a straight line, the "extend" step can use any control input that respects the dynamics. This is why RRT is the workhorse of autonomous driving research, drone planning, and humanoid motion.
Vanilla RRT finds a path, but rarely the shortest one. RRT⋆ (Karaman & Frazzoli, 2011) adds a rewiring step: whenever a new node is added, the algorithm checks whether existing nearby nodes would have shorter paths if they were re-parented through the new node. This local improvement, done thousands of times, yields a tree that converges to the globally optimal path as samples grow. RRT⋆ is asymptotically optimal.
RRT⋆ with Reeds–Shepp paths. The straight-line "extend" step of vanilla RRT⋆ is illegal for a car, a car cannot turn on the spot. Replace the straight-line local planner with a Reeds–Shepp curve (the shortest path for a car that can go forward or backward with bounded turning radius) and the tree grows only executable motions. This is how most parking-lot planners work.
LQR-RRT⋆. For systems with complicated dynamics (drones, self-balancing robots, spacecraft), even Reeds–Shepp is not enough. LQR-RRT⋆ linearizes the system around each tree node and computes an LQR controller as the local planner. The "distance" between nodes becomes the LQR cost-to-go, which respects the actual dynamics. Result: a tree where every edge is an executable closed-loop trajectory.
State lattices. Instead of sampling continuously, a state lattice pre-computes a small library of dynamically-feasible motion primitives (e.g. for a car: "go straight 5 m", "swerve 30° left and continue 5 m", ...). Plan by choosing a sequence of primitives. The lattice turns continuous motion planning back into graph search (so you can run A⋆ on it) while respecting dynamics.
Biased polar sampling & lane sampling. Pure uniform sampling is wasteful in structured environments. On a road, we know the vehicle should stay near the lane center: sample laterally offset candidate trajectories (lane sampling) or, in polar coordinates from the current pose (biased polar sampling), with a distribution tilted toward the expected direction of travel. Both are staples of the DARPA Urban Challenge vehicles and their descendants.
All the methods so far plan a path as a sequence of positions. The Dynamic Window Approach (Fox, Burgard & Thrun, 1997) turns the problem on its head: plan a sequence of velocities instead. At each control cycle (typically 10–20 Hz), DWA asks a remarkably concrete question:
The dynamic window is the subset of velocity space the robot can actually reach in one control cycle, constrained by its current velocity and acceleration limits. DWA discretizes this window into a grid, simulates a short-horizon trajectory for each candidate (v, ω), rejects any that would collide, and scores the survivors by a weighted combination of:
The highest-scoring pair is commanded, and the whole procedure repeats next tick. This makes DWA a local reactive planner (myopic by design), but extremely fast and naturally respectful of the robot's actual dynamics. It pairs beautifully with a global planner that provides waypoints for the heading term.
The paths produced by A⋆ on a grid are jagged, sequences of 45° segments. A real vehicle cannot execute them without slamming on the brakes at each corner. Between the planner's output and the actuators, there is almost always a smoothing or trajectory generation stage that turns the geometric path into a time-parametrized trajectory respecting velocity, acceleration, and jerk limits.
To connect two states smoothly, we need continuity of position, velocity, and acceleration at both endpoints, six constraints total. A fifth-degree polynomial has exactly six coefficients, so it is the lowest-order polynomial that can satisfy all these boundary conditions. One per axis gives a smooth 2D (or 3D) trajectory:
A car with minimum turning radius R cannot move sideways. What is the shortest path between two arbitrary poses (position + heading) under this constraint? Dubins (1957) solved it for forward-only motion: the answer is always a concatenation of straight segments and arcs of radius R, with exactly six possible patterns (LSL, RSR, LSR, RSL, LRL, RLR).
Reeds and Shepp (1990) extended this to allow reverse motion, which is essential for parking. They showed there are exactly 48 word patterns (later reduced to 46). Given any two poses, you enumerate all 46, pick the shortest feasible one, and you have an optimal path.
The Linear-Quadratic Regulator is the optimal controller for a linear dynamical system with quadratic cost. Given dynamics ẋ = Ax + Bu and cost ∫(xTQx + uTRu)dt, the optimal control law is the famous linear feedback u = −Kx, where K is the solution of an algebraic Riccati equation.
For path planning, LQR provides a natural distance metric: the cost-to-go xTPx (where P is the Riccati solution) tells us how costly it is, in control effort, to drive the system from any state to a reference state. Coupling this with RRT⋆ (yielding LQR-RRT⋆) gives a planner whose notion of "near" is based on actual dynamics, not Euclidean geometry, essential for systems where the two diverge wildly (e.g. a flying drone cannot easily get "straight down" quickly).
A car on a road is not really free to go anywhere. It should stay near the lane center, respect a target speed, and overtake only when safe. Reasoning about these rules in raw (x, y) coordinates is awkward, a tight curve on a road means an acceptable path in (x, y) looks nothing like the lane. The solution, introduced to autonomous driving by Werling et al. (2010), is to plan in a Frenet frame attached to the road itself.
Along the road's center-line, we define two coordinates:
In this frame, a trajectory becomes two independent 1D problems: a longitudinal one s(t) governing speed, and a lateral one d(s) governing lane offset. We generate many candidate trajectories by sampling end-states (final offset, final speed, final time) and joining them with quintic polynomials. Each candidate is evaluated by a cost that penalises deviation from the lane center, harsh jerk, speed error, and proximity to other vehicles. The cheapest collision-free candidate wins, and it is re-planned a few times per second as the scene evolves.
No single planner dominates. The right choice depends on the structure of your problem: dimensionality, static vs. dynamic, known vs. unknown, and what you mean by "optimal." The table below is a condensed decision aid.
| Method | Family | Complete? | Optimal? | Dynamics? | Typical domain |
|---|---|---|---|---|---|
| Dijkstra | Grid search | ✓ on discretized grid | ✓ shortest in graph | ignored | road-network routing |
| A⋆ | Grid search | ✓ | ✓ if heuristic admissible | ignored | games, indoor robots, GPS nav |
| D⋆ Lite | Grid search (incremental) | ✓ | ✓ | ignored | field robots in unknown terrain |
| Potential Field | Reactive / local | ✗ (local minima) | ✗ | partial (smooth) | real-time reactive control, swarms |
| Coverage (boustrophedon) | Grid decomposition | ✓ covers free space | ✓ under length metric | ignored | floor cleaning, mowing, surveying |
| PSO | Stochastic optimization | ✗ (no guarantee) | ✗ (heuristic) | can include | messy cost landscapes |
| PRM | Sampling (multi-query) | probabilistic | probabilistic (PRM⋆) | extension needed | robot arms, static worlds w/ many queries |
| RRT | Sampling (single-query) | probabilistic | ✗ | ✓ (kinodynamic variants) | manipulators, first-solution problems |
| RRT⋆ | Sampling (single-query) | probabilistic | asymptotically optimal | ✓ with correct local planner | when any-time improvement matters |
| RRT⋆ + Reeds–Shepp | Sampling + curves | probabilistic | asymptotically optimal | ✓ car kinematics | parking, low-speed driving |
| LQR-RRT⋆ | Sampling + control | probabilistic | asymptotically optimal | ✓ arbitrary linearizable dynamics | drones, spacecraft, balancing robots |
| State Lattice | Discretized trajectories | resolution-complete | ✓ on lattice | ✓ by construction | on-road driving at medium speed |
| DWA | Reactive / local | ✗ (local) | local-optimal only | ✓ | indoor diff-drive, mobile bases |
| Quintic polynomial | Trajectory gen. | , | smooth by design | ✓ up to jerk | connecting waypoints, smoothing |
| Reeds–Shepp | Closed-form curves | ✓ analytical | ✓ shortest for car | ✓ car kinematics | parking, short-range driving |
| LQR planning | Optimal control | ✓ (locally) | ✓ quadratic cost | ✓ linearized | trajectory smoothing, reference tracking |
| Frenet trajectory opt. | Sampling on curves | ✓ within lane structure | within sampling set | ✓ dynamics-aware | highway & urban driving |
Frenet + quintic for trajectory, a behavior layer picking lane-change decisions, A⋆ or lane graphs for global routing. The Frenet formulation matches road structure perfectly.
Hybrid A⋆ on a grid enriched with car orientation, using Reeds–Shepp as the heuristic and final connector. Shipped by almost every OEM.
A⋆ for global route on a known map, DWA or TEB for local avoidance of humans & pallets. The ROS nav stack default.
D⋆ Lite for efficient replanning as new terrain is discovered. Pioneered on NASA's Mars Exploration Rovers and refined for Curiosity & Perseverance.
PRM or RRT-Connect in the arm's joint space. High dimensionality rules out grids; sampling dominates.
LQR-RRT⋆ or trajectory optimization (e.g. minimum-snap polynomials) for dynamic feasibility; MPC to track.
Boustrophedon coverage with online re-decomposition as the map is explored. Add a local reactive controller for pet avoidance.
A⋆ on a navigation mesh (nav-mesh). Hierarchical A⋆ for open-world maps. String-pulling for smooth post-processing.