From Examples to Experiments

Reproducible workflows in GeoClaw

Kyle Mandli

2026-03-20

How we actually use GeoClaw

Where this gets painful

  • Hard to reproduce results later
  • Hard to compare runs systematically
  • Easy to break something silently

What we actually want

  • Re-run the same setup reliably
  • Compare runs automatically
  • Scale from 1 → many experiments

A better workflow

A simple idea

Treat simulation runs as repeatable checks

  • Define a setup
  • Run it
  • Compare to a reference
  • Let pytest organize the loop
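
The loop above can be sketched without GeoClaw at all. This is a minimal illustration, not GeoClaw code: `run_simulation` is a hypothetical stand-in (a small damped-oscillator integrator), and the setup keys and `matches_reference` are assumed names. In practice the reference would be saved from a run you trust, not recomputed inside the test.

```python
import math


def run_simulation(setup):
    """Hypothetical stand-in for a simulation run (NOT GeoClaw).

    Integrates a damped oscillator with semi-implicit Euler steps and
    returns the final state.
    """
    x, v = setup["x0"], setup["v0"]
    dt = setup["t_final"] / setup["num_steps"]
    for _ in range(setup["num_steps"]):
        v += dt * (-setup["k"] * x - setup["damping"] * v)
        x += dt * v
    return {"x": x, "v": v}


def matches_reference(result, reference, rel_tol=1e-10):
    """Return True if every reference quantity is reproduced within tolerance."""
    return all(
        math.isclose(result[key], reference[key], rel_tol=rel_tol, abs_tol=1e-14)
        for key in reference
    )


def test_oscillator_is_repeatable():
    # 1. Define a setup
    setup = {"x0": 1.0, "v0": 0.0, "k": 1.0, "damping": 0.1,
             "t_final": 1.0, "num_steps": 100}
    # Stands in for a stored reference from a trusted run
    reference = run_simulation(setup)

    # 2. Run it, 3. compare to the reference
    assert matches_reference(run_simulation(dict(setup)), reference)

    # A changed setup should be flagged, not pass silently
    perturbed = dict(setup, k=1.1)
    assert not matches_reference(run_simulation(perturbed), reference)
```

pytest collects any function named `test_*`, so running `pytest` in the directory repeats the whole loop.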

A GeoClaw test in practice

runner = test.GeoClawTestRunner(tmp_path, test_path=Path(__file__).parent)
runner.set_data()
runner.write_data()

runner.build_executable()
runner.run_code()

  • same GeoClaw workflow
  • now repeatable and reusable

Example: bowl-slosh

# Gauge 1 at (x, y) = (0.5, 0.5), recording from t = 0 to t = 1e10
runner.rundata.gaugedata.gauges.append([1, 0.5, 0.5, 0, 1e10])
...
runner.check_gauge(save=save, gauge_id=1, indices=(2, 3))
runner.check_fgmax(save=save)

  • generated topography
  • one gauge
  • one fgmax grid
  • compares outputs you actually care about
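
A rough sketch of the save-or-compare pattern behind a check like this. It is not the implementation of `check_gauge`: the function `check_columns`, the JSON reference file, and the reading of `indices` as "columns to compare" are all assumptions made for illustration.

```python
import json
import math
from pathlib import Path


def check_columns(rows, reference_file, indices, save=False, rel_tol=1e-12):
    """Compare selected columns of `rows` with a stored reference.

    Hypothetical helper. With save=True the reference file is (re)written
    from the current data instead of being checked.
    """
    # Keep only the columns we care about
    selected = [[row[i] for i in indices] for row in rows]
    reference_file = Path(reference_file)

    if save:
        reference_file.write_text(json.dumps(selected))
        return True

    reference = json.loads(reference_file.read_text())
    if len(reference) != len(selected):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-14)
        for ref_row, new_row in zip(reference, selected)
        for a, b in zip(ref_row, new_row)
    )
```

Run once with `save=True` on output you trust, then with `save=False` on every later run. Restricting the comparison to a few columns keeps the check focused on the quantities that matter for the case.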

Parameterization: Hurricane Ike

CASES = [
    pytest.param({"num_cells": [29, 24], "amr_levels_max": 2,
                  "num_output_times": 1}, id="coarse"),
    pytest.param({"num_cells": [116, 96], "amr_levels_max": 6,
                  "num_output_times": 16}, id="fine",
                  marks=pytest.mark.slow),
]

  • same physical problem
  • different grids / AMR settings
  • fast default + slower, higher-fidelity run
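
One way a list like `CASES` might be consumed, sketched with plain pytest. The `defaults` dictionary, `apply_overrides`, and the test body are hypothetical; a real test would pass the merged settings to the runner instead of only checking them.

```python
import pytest

CASES = [
    pytest.param({"num_cells": [29, 24], "amr_levels_max": 2,
                  "num_output_times": 1}, id="coarse"),
    pytest.param({"num_cells": [116, 96], "amr_levels_max": 6,
                  "num_output_times": 16}, id="fine",
                 marks=pytest.mark.slow),
]


def apply_overrides(defaults, overrides):
    """Return a new settings dict with the case's overrides applied."""
    settings = dict(defaults)
    settings.update(overrides)
    return settings


@pytest.mark.parametrize("overrides", CASES)
def test_case_settings(overrides):
    # Hypothetical baseline settings, for illustration only
    defaults = {"num_cells": [10, 10], "amr_levels_max": 1,
                "num_output_times": 1}
    settings = apply_overrides(defaults, overrides)
    # Every override should end up in the final settings
    for key, value in overrides.items():
        assert settings[key] == value
```

pytest generates one test per case, named by its `id`. Running `pytest -m "not slow"` skips the fine case; registering `slow` under `markers` in the pytest configuration avoids an unknown-mark warning.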

This is for users too

This is not about forcing everyone into developer-style tests.

  • It helps you rerun cases cleanly
  • It helps you compare outputs you already care about
  • It gives you a path from one example to many related experiments

From tests to workflows

  • synthetic topography and forcing
  • controlled scenarios in surge-examples/synthetic
  • larger workflow automation in surge-examples/auto_workflow
  • same structure, just scaled up
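
Scaling up can start as simply as expanding a small sweep into named cases. This sketch does not reflect the contents of surge-examples; `build_cases` and the parameter names in `sweep` (including `storm_speed`) are hypothetical.

```python
from itertools import product


def build_cases(sweep):
    """Expand {name: [values]} into one (case_id, settings) pair per combination."""
    names = list(sweep)
    cases = []
    for values in product(*(sweep[name] for name in names)):
        settings = dict(zip(names, values))
        # A readable id makes it easy to match output back to its settings
        case_id = "-".join(f"{name}={value}" for name, value in settings.items())
        cases.append((case_id, settings))
    return cases


# Hypothetical sweep: two AMR depths crossed with three storm speeds
sweep = {"amr_levels_max": [2, 4], "storm_speed": [5.0, 10.0, 15.0]}
cases = build_cases(sweep)
```

Each `(case_id, settings)` pair can then feed the same define, run, compare loop used for a single test.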

Takeaways

  • Start from an example you already trust
  • Make the run / compare loop repeatable
  • Parameterize one thing first
  • Grow into a workflow only when it helps

Pointers