Building a wildfire simulator with neural cellular automata

May 29, 2026

NCA Fireline is a small browser sim where fire spreads across a grid and cells try to draw their own fireline. The constraint is the point: no cell can see the whole map.

Press Step in the simulator. Fire starts as heat in a few squares. Nearby cells read that heat, mix it with fuel, moisture, wind, and hidden memory, and write new numbers back into the grid. After enough ticks, yellow risk appears ahead of the flame. When the neural cell rule is on, cyan fireline appears where the risk is high.

That is a neural cellular automaton. It is not one network looking at the whole map. It is one small network copied into every cell.

The linked simulator uses hand-initialized weights so the code stays readable. The training section later shows how those weights would be learned.

The Smallest Version

A cellular automaton is a grid plus a rule.

Each square has a state. At every tick, the rule looks at nearby squares and decides what this square should become next. Then every square updates at once.

Conway’s Game of Life is the usual example. A square is either alive or dead. The rule counts the eight surrounding squares. A live cell with fewer than two live neighbors dies. A live cell with more than three dies. A live cell with two or three survives, and a dead cell with exactly three live neighbors is born.

The code shape is almost boring:

for each cell:
  read nearby cells
  compute this cell's next state

replace old grid with next grid
repeat

There is no object called a glider in the rule. A glider appears because many small neighbor-counting updates keep handing the pattern forward.
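Here is a minimal Python sketch of that loop for the Game of Life. It is an illustration, not code from the simulator. The wrap-around edges and the helper name `place` are my assumptions. The rule only counts neighbors, yet the glider pattern ends up one cell down and one cell right after four ticks.

```python
def life_step(grid):
    """Return the next Game of Life grid; every cell updates at once."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight neighbors, wrapping at the edges (an assumption).
            n = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            alive = grid[r][c] == 1
            # Birth on exactly three neighbors, survival on two or three.
            nxt[r][c] = 1 if (n == 3 or (alive and n == 2)) else 0
    return nxt


def place(cells, rows=8, cols=8):
    """Hypothetical helper: build an empty grid and switch on the given cells."""
    grid = [[0] * cols for _ in range(rows)]
    for r, c in cells:
        grid[r][c] = 1
    return grid


glider = [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
grid = place(glider)
for _ in range(4):
    grid = life_step(grid)

# The same shape, shifted one row down and one column right.
shifted = place([(r + 1, c + 1) for r, c in glider])
print(grid == shifted)  # True
```

Nothing in `life_step` mentions movement. The shift falls out of reading the old grid and writing a fresh one.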

Animated cellular automaton grid showing one tick becoming the next by applying the same local rule.
A cellular automaton does not draw the final pattern directly. It repeats a local rule, and the pattern emerges from those repeated local updates.

For the wildfire version, the state is not alive or dead. It is heat, fuel, moisture, risk, and fireline. But the loop is the same: read nearby cells, compute the next local state, repeat.

The Local Window

The simulator uses a 3x3 neighborhood. For one cell, that means:

  • the cell itself,
  • the three cells above,
  • the two cells beside it,
  • the three cells below.

That is all the cell can observe.

It cannot look across the grid and say, “The fire front is shaped like a hook.” It cannot know that the right side of the forest is still safe. It cannot plan a perfect firebreak. The cell gets a local patch and has to make a local update.
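A small Python sketch of that window, for one channel. The function name `neighborhood` and the choice to read cells outside the grid as 0.0 are my assumptions; the simulator may handle edges differently.

```python
def neighborhood(grid, r, c, outside=0.0):
    """Return the 3x3 patch around (r, c) as a flat list of nine values."""
    rows, cols = len(grid), len(grid[0])
    patch = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            # Cells past the edge read as `outside` (an assumption).
            patch.append(grid[rr][cc] if inside else outside)
    return patch


heat = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.2, 0.0],
    [0.0, 0.4, 0.0, 0.0],
]

print(neighborhood(heat, 1, 1))  # the burning cell and its eight neighbors
print(neighborhood(heat, 0, 3))  # a corner cell: most of its window is off-grid
```

Whatever the grid size, the rule receives nine values per channel and nothing else.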

Animated wildfire grid where a cyan 3 by 3 window moves over local fuel, heat, water, and hidden state.
The cyan box is the cell's world. Different locations produce different local inputs, but the rule that reads those inputs is shared everywhere.

This restriction is why cellular automata are interesting. If a fireline forms ahead of the flame, it cannot be because one master cell saw the whole board. It has to come from local agreement: many cells read similar danger signals and write similar protection signals.

A Cell Carries Channels

A normal pixel has color. A simple cellular automaton might have one state: alive or dead, tree or fire, empty or full.

A neural cellular automaton cell carries a vector. That just means the cell stores several numbers at the same grid location.

In NCA Fireline, each cell has these channels:

Channel | Meaning
fuel | how much burnable material remains
heat | how strongly this cell is burning
moisture | how hard this cell is to ignite
burned | how much has already burned
retardant | water or fireline protection
ember | short heat memory
prediction | local estimate of near-future risk
hiddenA, hiddenB | scratchpad memory the renderer does not draw directly

The browser draws a composite view because I cannot show all those channels as one square without choosing colors. Green is fuel. Orange and red are heat. Yellow is risk. Cyan is retardant. Gray is burned ground.

But the rule is not reading “green” or “red.” It is reading numbers.

Animated cell expanding into a state vector with fuel, heat, moisture, risk, line, and hidden channels.
A rendered cell is one square, but the update rule sees a short vector of channel values. The drawing is only a view of those numbers.

This one change makes the system much richer. A cell can be hot and wet. It can be burned but still remember recent heat. It can have fuel and high risk before it is actually on fire.
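One way to picture the state vector is a small record with one field per channel. This is a Python sketch, not the simulator's data layout. The class name `Cell`, the snake_case field names, and the default values are my assumptions.

```python
from dataclasses import dataclass, astuple


@dataclass
class Cell:
    # One field per channel from the table; defaults are illustrative.
    fuel: float = 1.0
    heat: float = 0.0
    moisture: float = 0.0
    burned: float = 0.0
    retardant: float = 0.0
    ember: float = 0.0
    prediction: float = 0.0
    hidden_a: float = 0.0
    hidden_b: float = 0.0

    def as_vector(self):
        """Flatten the channels into the list of numbers the rule reads."""
        return list(astuple(self))


hot_and_wet = Cell(fuel=0.8, heat=0.7, moisture=0.6)
at_risk = Cell(fuel=1.0, heat=0.0, prediction=0.9)

print(len(hot_and_wet.as_vector()))  # 9
print(at_risk.as_vector())
```

The renderer would pick colors from these fields. The rule only ever sees the list.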

The next question is how those numbers change.

The Rule Is A Tiny Network

The word “neural” can sound huge, but an NCA rule can be very small.

For one cell, the rule receives the 3x3 neighborhood. In this simulator, that means nine cells, each with several channels. The rule mixes those local numbers and writes small changes back into this cell’s channels.

A neural network layer is just a set of weighted sums followed by a simple nonlinearity. The weights are knobs:

nearby heat  * positive weight  -> raises danger
wind heat    * positive weight  -> raises danger
fuel         * positive weight  -> raises danger
moisture     * negative weight  -> lowers danger
fireline     * negative weight  -> lowers danger
hidden trace * learned weight   -> depends on what training found useful

The nonlinearity keeps the result in a useful range. sigmoid squeezes a number between 0 and 1, which works well for a risk channel. tanh squeezes a number between -1 and 1, which works well for hidden signals that can push in either direction.

Animated diagram where local heat, wind, fuel, moisture, and memory flow through a tiny shared network to heat, risk, and line outputs.
The tiny network is copied into every cell. Different cells behave differently because their local inputs are different, not because they have different rules.

In a trained NCA, those weights are learned. In the browser version, the tiny network is present in the code, but the weights are chosen by hand so the mechanism can be read directly. The shape is still the same: local inputs in, mixed signals in the middle, channel changes out.
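That shape fits in a few lines of Python. This is a sketch under assumptions: the five inputs, the two hidden signals, and every weight below are made up for illustration and are not the simulator's values.

```python
import math


def sigmoid(x):
    # Numerically safe form: never exponentiates a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dense(inputs, weights, biases, activation):
    """One layer: each output is activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical local inputs: nearby heat, wind heat, fuel, moisture, fireline.
local_inputs = [0.8, 0.5, 0.9, 0.3, 0.0]

# Hand-picked illustrative weights, not the simulator's values.
w_in = [[1.5, 1.0, 0.5, -1.0, -1.5],
        [-0.5, 0.0, 0.0, 1.0, 1.0]]
b_in = [0.0, 0.0]
w_out = [[2.0, -2.0]]
b_out = [-0.5]

hidden = dense(local_inputs, w_in, b_in, math.tanh)  # mixed signals in (-1, 1)
risk = dense(hidden, w_out, b_out, sigmoid)          # one output in (0, 1)
print(hidden, risk)
```

Training would change the numbers in `w_in`, `b_in`, `w_out`, and `b_out`. The two `dense` calls would stay exactly as they are.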

One output head can express a pattern like this:

risk goes up when:
  heat is nearby
  wind pushes heat toward this cell
  fuel remains
  hidden memory says this area has been active

risk goes down when:
  the cell is wet
  the cell has fireline
  the cell is already burned

That is not symbolic reasoning. The network is not thinking in sentences. It is turning local measurements into new numbers.
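To see the signs at work, here is a single risk head in Python with one weight per feature. The feature names, the weights, and the bias are hypothetical values I picked for the example. Only the signs follow the list above.

```python
import math

# Hypothetical hand-picked weights: positive raises risk, negative lowers it.
RISK_WEIGHTS = {
    "nearby_heat": 2.5,
    "wind_heat": 1.5,
    "fuel": 1.0,
    "hidden_trace": 0.8,
    "moisture": -2.0,
    "fireline": -3.0,
    "burned": -2.5,
}
RISK_BIAS = -2.0


def risk_head(features):
    """Weighted sum of local features squeezed into (0, 1) by a sigmoid."""
    total = RISK_BIAS + sum(
        weight * features.get(name, 0.0) for name, weight in RISK_WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-total))


dry = {"nearby_heat": 0.8, "wind_heat": 0.5, "fuel": 1.0, "hidden_trace": 0.4}
wet = dict(dry, moisture=0.9)        # same cell, but damp
protected = dict(dry, fireline=1.0)  # same cell, but already lined

for name, cell in [("dry", dry), ("wet", wet), ("protected", protected)]:
    print(name, round(risk_head(cell), 2))
```

The dry cell scores highest, the wet cell lower, and the protected cell lowest. No branch in the code says so; the ordering comes from the weights.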

Repetition Moves Information

One update moves information only one cell outward, because each cell reads just its 3x3 neighborhood.

If cell A is burning, the cells beside it can notice heat. On the next tick, those cells may heat up. Then their neighbors can notice. The fire front moves because the same local rule keeps running.

Animated sequence of wildfire grids showing heat moving across the grid by repeated local updates.
No signal jumps across the board. Each tick hands information to nearby cells, and many ticks make that handoff look like motion.

This is also why an NCA can be hard to reason about from one line of code. The rule may be small, but it is applied many times. A tiny change in one channel can become visible after twenty or fifty updates.
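The speed limit is easy to check in Python. The sketch below uses the fastest possible 3x3 rule, a stand-in I made up rather than the wildfire rule: a cell turns on if anything in its neighborhood is on. Even then, the signal reaches only one more cell per tick.

```python
def spread_step(grid):
    """A cell becomes 1 if any cell in its 3x3 neighborhood is 1."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc]:
                        nxt[r][c] = 1
    return nxt


size, center = 11, 5
grid = [[0] * size for _ in range(size)]
grid[center][center] = 1  # one active cell in the middle

reach = []
for tick in range(3):
    grid = spread_step(grid)
    # Farthest active cell, measured in whole-cell steps from the center.
    reach.append(max(
        max(abs(r - center), abs(c - center))
        for r in range(size) for c in range(size) if grid[r][c]
    ))

print(reach)  # [1, 2, 3]
```

A real fire rule is slower than this because fuel, moisture, and fireline all resist the spread. It cannot be faster.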

Hidden Memory

Some channels are drawn. Some channels are not.

Hidden channels are scratchpad memory. They are numbers that live in the cell state and get copied forward, but the renderer does not need to show them as fuel or fire.

That matters because a purely instant rule can be jittery. It can only react to what is visible right now. A hidden channel lets the cell carry a trace: “heat was near me recently,” or “risk has been rising here,” or “this strip has started to become protected.”

Animated side-by-side grids showing visible heat without hidden trace and a version with purple hidden memory behind the heat front.
The purple trace is not fire. It is hidden state: memory that the next local update can read.

This is where the cell begins to feel less like a pixel and more like a tiny state machine. It still only sees locally, but it carries a small private history.
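A hand-written stand-in for such a trace is an exponential moving average. In a trained NCA the hidden update would come out of the network itself, so treat this Python sketch, the name `update_trace`, and the decay of 0.8 as assumptions for illustration.

```python
def update_trace(trace, heat, decay=0.8):
    """Blend the old trace with the current heat reading."""
    return decay * trace + (1.0 - decay) * heat


heat_seen = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]  # flickering local heat
trace = 0.0
history = []
for heat in heat_seen:
    trace = update_trace(trace, heat)
    history.append(round(trace, 3))

print(history)
```

The heat reading drops back to zero between spikes. The trace does not, so the next update can still tell that heat was nearby recently.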

Prediction And Control

Prediction in this simulator is not a separate model. It is one channel in the cell vector.

Each tick, the rule writes a risk value into the prediction channel. A high risk value means: based on my local neighborhood, this cell is likely to become dangerous soon.

Once risk is a channel, control is just another local update:

if risk is high
and fuel remains
and this cell is not already protected:
  add some retardant

That writes into the retardant channel. When many cells do it, a cyan fireline appears ahead of the flame.
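Written as Python, the control step is a short conditional on one cell's own channels. The thresholds and the dose below are hypothetical numbers, and the dictionary layout is just a convenient stand-in for the cell vector.

```python
def apply_control(cell, risk_threshold=0.6, min_fuel=0.1,
                  max_retardant=0.5, dose=0.2):
    """Return a copy of the cell, with retardant added if the rule fires."""
    updated = dict(cell)
    if (cell["prediction"] > risk_threshold     # risk is high
            and cell["fuel"] > min_fuel         # fuel remains
            and cell["retardant"] < max_retardant):  # not already protected
        updated["retardant"] = min(1.0, cell["retardant"] + dose)
    return updated


at_risk = {"prediction": 0.8, "fuel": 0.9, "retardant": 0.0}
burned_out = {"prediction": 0.8, "fuel": 0.0, "retardant": 0.0}
already_lined = {"prediction": 0.8, "fuel": 0.9, "retardant": 0.7}

print(apply_control(at_risk)["retardant"])        # 0.2
print(apply_control(burned_out)["retardant"])     # 0.0
print(apply_control(already_lined)["retardant"])  # 0.7
```

The function never looks at another cell. The line on screen is what many independent calls look like side by side.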

Animated comparison where a risk channel appears ahead of a flame and then high-risk cells add cyan fireline.
The yellow band is local prediction. The cyan band is local action. The action is not planned globally; it is written by cells whose own risk is high.

This is the game-like part of the project. You can paint heat, water, fuel, and line by hand. You can also turn on the neural cell rule and let local risk produce local protection.

Training Is The Important Part

The browser sim uses a tiny network, but the weights are hand-initialized. That makes the code easier to read, but it skips the most important question:

How would the network learn the rule instead of me writing it?

Training an NCA does not change the deployment story. A trained cell still only sees its local neighborhood. It still runs the same shared rule. It still writes channels back into the grid.

Training changes how we find the weights inside the rule.

Instead of me choosing constants like “nearby heat matters a lot” and “moisture should lower risk,” training starts with bad weights, runs the simulation, scores the result, and nudges the weights in the direction that would have produced a better result.

The trainable pieces are just the weight tables inside the local network:

3x3 local channels
  -> input weights
  -> hidden signals
  -> output weights
  -> next channels

At first those numbers are almost random. The rollout looks bad because the cell rule has not learned what heat, fuel, wind, memory, and fireline should mean together. Training is the loop that turns those anonymous numbers into a useful local habit.

The diagrams in this training section are conceptual. The playground linked above does not train in the browser; it uses fixed, hand-initialized weights so the local rule stays small enough to read.

That loop has five pieces.

1. Choose A Goal

Training needs a target. The target does not have to say what every cell should do at every tick. It can describe the final behavior we want.

For this wildfire simulator, a simple goal could be:

  • keep heat from crossing a boundary,
  • leave as much fuel alive as possible,
  • use fireline only where it helps,
  • avoid flickering or unstable channel values.

The target is not a secret map handed to each cell while the sim runs. It is only used by the training process to score the rollout.

Animated training target showing a start fire on the left and a desired result with a cyan boundary and protected fuel on the right.
A training goal says what a good rollout looks like. It does not give each cell global vision during the rollout.

The trained rule can learn from global feedback during training. When it runs, each cell still gets only local observations.

2. Unroll Time

To train the rule, we run it for many ticks.

That is called unrolling. If the rule runs for 64 ticks, the training system keeps the chain:

state 0 -> state 1 -> state 2 -> ... -> state 64

Each arrow uses the same shared rule. The weights are reused at every tick and at every cell.

Animated filmstrip of NCA states saved across several ticks while the same shared rule runs repeatedly.
Unrolling records the sequence of states. Training needs that sequence so it can trace how early local changes affected the final score.

This is why NCA training can feel different from training a normal image classifier. The rule is not used once. It is used again and again, and the final behavior depends on the whole chain.
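Unrolling is mostly a loop that remembers. The Python sketch below keeps every state from a rollout. The rule `diffuse` is a toy 1D stand-in I made up, not the wildfire rule; any function from state to state would fit.

```python
def rollout(state, rule, ticks):
    """Apply the same rule `ticks` times and keep every intermediate state."""
    states = [state]
    for _ in range(ticks):
        state = rule(state)
        states.append(state)
    return states


def diffuse(row):
    """Toy local rule: each cell averages itself and its two neighbors."""
    n = len(row)
    return [
        (row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


states = rollout([0.0, 0.0, 1.0, 0.0, 0.0], diffuse, ticks=64)
print(len(states))  # 65: state 0 through state 64
print(states[1])
```

Keeping all 65 states costs memory, which is one practical reason rollouts in training are kept to a bounded number of ticks.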

3. Calculate Loss

Loss is the number training tries to reduce.

If the final state is bad, the loss is high. If the final state is good, the loss is low.

For this wildfire version, a loss could be a weighted sum:

loss =
  heat that crossed the boundary
  + fuel that burned unnecessarily
  + fireline placed far away from risk
  + unstable flicker in hidden channels

The exact loss function is a design choice. If I punish fireline too strongly, the model may become timid and let fire through. If I punish burned fuel too strongly, it may paint too much line everywhere. Training does not remove the need to choose what “good” means.
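Here is one possible version of that weighted sum in Python. Every term is a design choice, so read the specifics as assumptions: the boundary is a column index, burned fuel is measured as start fuel minus end fuel, wasted line is line placed where risk is low, and the four weights are arbitrary.

```python
def total(grid):
    return sum(sum(row) for row in grid)


def rollout_loss(heat, fuel_start, fuel_end, line, risk, hidden_steps,
                 boundary_col, w_cross=10.0, w_fuel=1.0, w_line=0.5,
                 w_flicker=0.1):
    """Score one rollout; lower is better."""
    # Heat found at or past the boundary column.
    crossed = sum(sum(row[boundary_col:]) for row in heat)
    # Fuel lost between the first and last state.
    burned = total(fuel_start) - total(fuel_end)
    # Fireline counts as wasted in proportion to how low the local risk is.
    wasted_line = sum(
        l * (1.0 - r)
        for line_row, risk_row in zip(line, risk)
        for l, r in zip(line_row, risk_row)
    )
    # Tick-to-tick change in a hidden channel.
    flicker = sum(
        abs(b - a)
        for prev, cur in zip(hidden_steps, hidden_steps[1:])
        for prev_row, cur_row in zip(prev, cur)
        for a, b in zip(prev_row, cur_row)
    )
    return (w_cross * crossed + w_fuel * burned
            + w_line * wasted_line + w_flicker * flicker)


fuel_start = [[1.0, 1.0, 1.0, 1.0]]
hidden_steps = [[[0.0, 0.0, 0.0, 0.0]], [[0.1, 0.1, 0.0, 0.0]]]

contained = rollout_loss(
    heat=[[0.2, 0.0, 0.0, 0.0]], fuel_start=fuel_start,
    fuel_end=[[0.2, 1.0, 1.0, 1.0]], line=[[0.0, 1.0, 0.0, 0.0]],
    risk=[[0.1, 0.9, 0.0, 0.0]], hidden_steps=hidden_steps, boundary_col=2)
escaped = rollout_loss(
    heat=[[0.0, 0.3, 0.6, 0.4]], fuel_start=fuel_start,
    fuel_end=[[0.0, 0.1, 0.5, 0.8]], line=[[0.0, 0.0, 0.0, 1.0]],
    risk=[[0.0, 0.0, 0.9, 0.2]], hidden_steps=hidden_steps, boundary_col=2)

print(contained < escaped)  # True
```

Raising `w_line` makes the score harsher on fireline, and raising `w_fuel` makes it harsher on burning. Those two knobs are the trade-off described above.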

Animated final wildfire grid compared to a cyan boundary with loss bars for crossed fire, wasted line, and burned fuel.
The loss turns a messy rollout into a score. The score is how the training process knows which direction is better.

This framing helped neural network training feel less abstract to me. The network does not start by knowing the rule. We define the score, then let many small updates search for weights that score better.

4. Backpropagate Through The Rollout

Backpropagation is bookkeeping for blame.

After the loss is computed at the end, backpropagation walks backward through the saved rollout and asks:

  • which weights made the loss higher,
  • which weights made the loss lower,
  • how should each weight move a tiny amount?

A gradient says how the loss changes when a weight moves. Gradient descent moves the weight the other way, toward lower loss. Because the same rule is used at every cell and every tick, the gradients from many places add together. One weight might affect ignition near the left edge, prediction near the center, and fireline near the right edge. Training collects all of that pressure into one update for the shared rule.

Animated timeline where loss sends backward arrows through saved ticks to nudge shared NCA weights.
Backprop follows the chain backward. The cell rule stays local, but the training algorithm can use the whole saved rollout to adjust the shared weights.

This is the main reason training an NCA is powerful. I do not have to decide by hand exactly how much wind heat should matter, or exactly how hidden memory should affect risk. If the loss is well designed and the model is trainable, gradient descent can search those weights.
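The adding-up is easiest to see on a toy far smaller than an NCA. In this Python sketch the whole "rule" is one shared weight `w` that scales a single heat value each tick, and the loss compares the final heat to a target. The toy and all its numbers are my assumptions. The backward loop is ordinary backpropagation written out by hand, and a finite-difference estimate checks it.

```python
TICKS = 5


def run(w, heat):
    """Unroll the toy rule heat <- w * heat and keep every state."""
    states = [heat]
    for _ in range(TICKS):
        states.append(w * states[-1])
    return states


def loss_of(w, start_heat, target):
    return (run(w, start_heat)[-1] - target) ** 2


def grad_of(w, start_heat, target):
    """Walk backward through the saved states, summing blame for w."""
    states = run(w, start_heat)
    grad_state = 2.0 * (states[-1] - target)  # d loss / d final state
    grad_w = 0.0
    for t in reversed(range(TICKS)):
        grad_w += grad_state * states[t]  # this tick's use of the shared w
        grad_state *= w                   # pass blame to the earlier state
    return grad_w


w, start_heat, target = 0.9, 1.0, 0.2
analytic = grad_of(w, start_heat, target)

# Independent check: nudge w slightly and measure how the loss changes.
eps = 1e-6
numeric = (loss_of(w + eps, start_heat, target)
           - loss_of(w - eps, start_heat, target)) / (2 * eps)

print(round(analytic, 4), round(numeric, 4))
```

The gradient is positive, so the loss falls when `w` gets smaller. `grad_w` receives one contribution per tick because the same weight was used at every tick. In a grid it would also receive one per cell.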

5. Update The Shared Rule

After backprop computes gradients, an optimizer changes the weights a little. An optimizer is the update recipe. It decides how large each nudge should be after the gradients point in a direction.

Then the whole process repeats:

start with random forest and wind
run the shared rule for many ticks
score the result
backpropagate through the ticks
update the weights
try again

The first rollouts are usually bad. Fire crosses the boundary. Prediction is late. Fireline appears in useless places. After many updates, the rule can start to discover useful local habits:

  • risk should rise before heat arrives,
  • fireline should appear near future danger,
  • wet cells should resist ignition,
  • hidden memory should smooth out local decisions,
  • the rule should work across different wind and fuel patterns.

Animated before and after training comparison where early weights let fire cross the line and trained weights contain the fire with cyan fireline.
Training does not store one answer. It improves the local rule, then the improved rule is copied into every cell.

The trained weights are still just numbers in the tiny network. The cell still does not know the map. The global behavior comes from running the improved local rule many times.
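The full loop can be run end to end on a toy. In this Python sketch the shared rule is a single weight that scales one heat value per tick, and the goal is to halve the starting heat in four ticks. The toy, the learning rate, and the step count are all my assumptions. Plain gradient descent stands in for whatever optimizer a real training setup would use.

```python
import random

TICKS = 4


def run(w, heat):
    """Unroll the toy rule heat <- w * heat and keep every state."""
    states = [heat]
    for _ in range(TICKS):
        states.append(w * states[-1])
    return states


def loss_and_grad(w, start_heat):
    states = run(w, start_heat)
    target = 0.5 * start_heat  # toy goal: halve the heat in TICKS ticks
    error = states[-1] - target
    grad_state, grad_w = 2.0 * error, 0.0
    for t in reversed(range(TICKS)):  # backpropagate through the ticks
        grad_w += grad_state * states[t]
        grad_state *= w
    return error ** 2, grad_w


random.seed(0)
w, learning_rate = 0.3, 0.03  # start from a bad weight
losses = []
for step in range(300):
    start_heat = random.uniform(0.5, 1.5)  # a fresh random starting condition
    loss, grad = loss_and_grad(w, start_heat)
    w -= learning_rate * grad              # nudge the shared weight
    losses.append(loss)

print(round(w, 3), round(0.5 ** (1 / TICKS), 3))
print(losses[0] > losses[-1])  # True
```

The learned `w` lands near the fourth root of 0.5, which is the one value that halves any starting heat in four ticks. The loop never saw that number. It only saw scores.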

Why Training Does Not Break Locality

This was the part that took me longest to understand.

During training, the loss can look at the whole grid. It can say, “fire crossed the boundary over there,” or “too much fuel burned overall.” That sounds global.

But the rule being trained is still local. The model does not learn a lookup table for one map. It learns weights for a function that each cell can run from its own neighborhood.

So there are two different views:

Phase | What can see the whole grid? | What the cell sees
Training | the loss function and backpropagation | still a local 3x3 neighborhood during each tick
Running the sim | nothing central has to act | the same local 3x3 neighborhood

That is the trick. Global feedback trains a local rule. Then the local rule creates behavior that looks coordinated.

The Playground Version

The playground keeps the mechanics visible.

The main field draws the composite state. The tabs let me inspect individual views: fuel, heat, risk, and line. The small channel previews show that the screen is not one image. It is several fields layered together.

Animated browser playground where a wildfire spreads through fuel while yellow risk and cyan fireline channels appear.
The app is a way to poke the state vector directly. Paint heat, cool cells, lay line, change weather, then watch the local rule respond.

I like the playground most when it is paused.

Switch to risk and press Step a few times. Yellow appears before the orange front reaches it. Turn on the neural cell rule and step again. Cyan starts to appear in the same dangerous strip. Hover one cell there and the picture breaks back down into numbers: fuel, heat, risk, line, and memory.

That paused moment is the part I wanted. The picture turns back into one cell, one neighborhood, and a few numbers changing in order. The rest of the forest is still there, but I no longer have to explain it from above.