
Compute Backends

ApexSim separates physical modeling from numerical execution backends. This page helps you choose the right backend for your study.

Shared Core Architecture

Solver backends intentionally share one algorithmic core where possible:

  • Quasi-static speed profile:
      • shared core: src/apexsim/simulation/_profile_core.py
      • numpy adapter: src/apexsim/simulation/profile.py
      • torch adapter: src/apexsim/simulation/torch_profile.py
      • numba keeps specialized kernels in src/apexsim/simulation/numba_profile.py with parity checks against shared semantics.
  • Transient PID infrastructure:
      • shared helpers: src/apexsim/simulation/_transient_pid_core.py
      • shared control mesh/encoding: src/apexsim/simulation/_transient_controls_core.py
      • shared progress reporting: src/apexsim/simulation/_progress.py
  • Vehicle backend physics primitives:
      • shared formulas: src/apexsim/vehicle/_backend_physics_core.py
      • consumed by point-mass and single-track backend adapters.

This reduces backend-specific formula drift while preserving specialized performance paths (especially numba kernels).

Backend policy

Supported backends are intentionally restricted to:

  • numpy: CPU-only reference backend.
  • numba: CPU-only JIT-accelerated backend.
  • torch: CPU and GPU backend (device via torch_device).

Solver modes:

  • quasi_static (default)
  • transient_oc

The runtime validator enforces this policy:

  • numpy and numba require torch_device="cpu".
  • compute_backend="torch" currently requires torch_compile=False so the solver path remains AD-compatible by default.
  • solver_mode="transient_oc" with driver_model="optimal_control" requires backend-specific extras:
      • numpy and numba: scipy
      • torch: torchdiffeq
  • TransientNumericsConfig.pid_gain_scheduling_mode is supported in PID mode on all three backends (numpy, numba, torch).
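
The policy rules above can be restated as a small standalone check. This is a minimal sketch, not ApexSim's validator: the function name check_backend_policy and the returned list of messages are assumptions made for illustration, and the real validator may raise an exception instead.

```python
ALLOWED_BACKENDS = {"numpy", "numba", "torch"}
ALLOWED_SOLVER_MODES = {"quasi_static", "transient_oc"}

# Extra package each backend needs for transient optimal control,
# copied from the list above.
TRANSIENT_OC_EXTRAS = {"numpy": "scipy", "numba": "scipy", "torch": "torchdiffeq"}


def check_backend_policy(
    compute_backend,
    torch_device="cpu",
    torch_compile=False,
    solver_mode="quasi_static",
):
    """Hypothetical helper: return a list of policy violations (empty if valid)."""
    problems = []
    if compute_backend not in ALLOWED_BACKENDS:
        problems.append(f"unsupported compute_backend: {compute_backend!r}")
    if solver_mode not in ALLOWED_SOLVER_MODES:
        problems.append(f"unsupported solver_mode: {solver_mode!r}")
    # numpy and numba are CPU-only backends.
    if compute_backend in {"numpy", "numba"} and torch_device != "cpu":
        problems.append(f'{compute_backend} requires torch_device="cpu"')
    # torch currently requires torch_compile=False.
    if compute_backend == "torch" and torch_compile:
        problems.append("torch currently requires torch_compile=False")
    return problems
```

Calling check_backend_policy("numba", torch_device="cuda:0") returns one message, while check_backend_policy("torch", torch_device="cuda:0") returns an empty list.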

Current model support

Backend support is model-dependent:

  • PointMassModel: supports numpy, numba, torch.
  • SingleTrackModel: supports numpy, numba, torch.

Terminology note:

  • SingleTrackModel corresponds to the "bicycle model" terminology often used in literature.

If you request a backend that a model does not implement, the solver raises a clear ConfigurationError describing the missing model-side methods.
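
One way such a check could work is to look up the methods a backend needs and report the ones a model lacks. The sketch below is illustrative only: the method names in REQUIRED_METHODS, the helper ensure_backend_supported, and the NumpyOnlyModel class are hypothetical, and ConfigurationError is a local stand-in rather than ApexSim's own class.

```python
class ConfigurationError(Exception):
    """Local stand-in; ApexSim's own ConfigurationError is not imported here."""


# Hypothetical method names per backend, chosen for illustration only.
REQUIRED_METHODS = {
    "numpy": ("speed_limit_numpy",),
    "numba": ("speed_limit_numba",),
    "torch": ("speed_limit_torch",),
}


def ensure_backend_supported(model, compute_backend):
    """Raise ConfigurationError naming every required method the model lacks."""
    required = REQUIRED_METHODS[compute_backend]
    missing = [name for name in required if not callable(getattr(model, name, None))]
    if missing:
        raise ConfigurationError(
            f"{type(model).__name__} does not support "
            f"compute_backend={compute_backend!r}; "
            f"missing methods: {', '.join(missing)}"
        )


class NumpyOnlyModel:
    """Toy model that implements only the hypothetical numpy hook."""

    def speed_limit_numpy(self):
        return 0.0
```

With this toy model, requesting "numpy" passes silently and requesting "torch" raises an error that names speed_limit_torch.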

Decision guide

Use this practical rule-set:

  1. Use numpy when you need robust baseline behavior and easiest debugging.
  2. Use numba for large CPU parameter sweeps with PointMassModel or SingleTrackModel.
  3. Use torch when you need tensor-native workflows, GPU execution, or AD-first optimization workflows.
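
The rule-set can also be written as a small helper for scripts that pick a backend programmatically. This is a sketch under assumptions: the function suggest_backend and its flag names are hypothetical, and the precedence (torch requirements first, then numba, then numpy) is one reasonable reading of the rules rather than project policy.

```python
def suggest_backend(needs_gpu=False, needs_autodiff=False, large_cpu_sweep=False):
    """Hypothetical helper mapping study requirements to a backend name."""
    # Rule 3: GPU execution or AD-first workflows point to torch.
    if needs_gpu or needs_autodiff:
        return "torch"
    # Rule 2: large CPU parameter sweeps benefit from numba after JIT warmup.
    if large_cpu_sweep:
        return "numba"
    # Rule 1: numpy is the robust, easy-to-debug baseline.
    return "numpy"
```

The returned string can be passed as compute_backend to build_simulation_config.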

Trade-offs at a glance

Backend | Hardware | Typical strength                              | Typical cost
numpy   | CPU      | Stable baseline, easy to inspect              | Slower for huge batch studies
numba   | CPU      | Very fast steady-state loops after JIT warmup | First call includes compile overhead
torch   | CPU/GPU  | Backend portability, tensor ecosystem         | Higher overhead for single-lap CPU workloads

Transient dependency note

The transient solver dependencies are included in the default install:

  • scipy for NumPy/Numba transient optimal-control paths
  • torchdiffeq for Torch transient optimal-control ODE integration

Configuration examples

NumPy (CPU reference)

from apexsim.simulation import build_simulation_config

config = build_simulation_config(
    compute_backend="numpy",
    max_speed=115.0,
)

Numba (CPU-optimized)

from apexsim.simulation import build_simulation_config

config = build_simulation_config(
    compute_backend="numba",
    max_speed=115.0,
)

Torch (CPU or GPU)

from apexsim.simulation import build_simulation_config

# CPU
config_cpu = build_simulation_config(
    compute_backend="torch",
    torch_device="cpu",
    torch_compile=False,
)

# GPU
config_gpu = build_simulation_config(
    compute_backend="torch",
    torch_device="cuda:0",
    torch_compile=False,
)
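
When the same script has to run on machines with and without a GPU, the device string can be resolved at runtime. A minimal sketch, assuming a hypothetical helper named resolve_torch_device; it relies only on torch.cuda.is_available() and falls back to "cpu" when torch is not installed.

```python
def resolve_torch_device(preferred="cuda:0"):
    """Hypothetical helper: return `preferred` if usable, otherwise "cpu"."""
    if preferred == "cpu":
        return "cpu"
    try:
        import torch
    except ImportError:
        # Without torch there is no GPU path at all.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = resolve_torch_device()
# The result can then be passed on, for example:
# build_simulation_config(compute_backend="torch", torch_device=device,
#                         torch_compile=False)
```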

Benchmark methodology

Reference scripts:

python examples/backend_benchmarks.py --warmup-runs 5 --timed-runs 20
python scripts/benchmark_solver_matrix.py --warmup-runs 2 --timed-runs 5 --output baseline.json
python scripts/benchmark_solver_matrix.py --warmup-runs 2 --timed-runs 5 --output candidate.json
python scripts/compare_solver_benchmarks.py --baseline baseline.json --candidate candidate.json --max-slowdown-pct 5 --require-same-cases

Notes:

  • Benchmarks run full Spa point-mass laps (data/spa_francorchamps.csv).
  • "First Call" includes startup/JIT/compile effects.
  • "Steady" values are from repeated post-warmup runs.
  • Use your own machine data for final backend decisions.
  • The solver matrix script covers:
      • models: PointMassModel, SingleTrackModel
      • solver modes: quasi_static, transient_oc (PID)
      • tracks: straight, circle, plus Spa smoke (unless --skip-spa).
  • compare_solver_benchmarks.py is intended as a PR gate for the 5%-slowdown policy.
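
The slowdown gate can be pictured as a per-case percentage comparison. The sketch below is not the logic of compare_solver_benchmarks.py: the function find_slowdowns is hypothetical, and it assumes each benchmark result has already been reduced to a mapping from case name to steady-state time in milliseconds, which may differ from the JSON layout the scripts actually write.

```python
def find_slowdowns(baseline, candidate, max_slowdown_pct=5.0, require_same_cases=True):
    """Return {case: slowdown_pct} for cases slower than the allowed threshold.

    `baseline` and `candidate` map case names to steady-state times in ms
    (an assumed, simplified layout).
    """
    if require_same_cases and set(baseline) != set(candidate):
        differing = sorted(set(baseline) ^ set(candidate))
        raise ValueError(f"case sets differ: {differing}")
    failures = {}
    for case in sorted(set(baseline) & set(candidate)):
        # Positive values mean the candidate is slower than the baseline.
        slowdown_pct = (candidate[case] / baseline[case] - 1.0) * 100.0
        if slowdown_pct > max_slowdown_pct:
            failures[case] = slowdown_pct
    return failures
```

A case that goes from 20.0 ms to 22.0 ms is a 10% slowdown and would fail a 5% gate, while 10.0 ms to 10.4 ms (4%) would pass.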

Benchmark snapshot (February 17, 2026)

Environment used for this snapshot:

  • CPU: Intel Core i7-8550U (4C/8T)
  • numpy==2.3.5
  • numba==0.63.1
  • torch==2.10.0+cpu
  • CUDA unavailable in this run

Backend     | First Call [ms] | Steady Mean [ms] | Steady Median [ms] | Lap Time [s]
numpy       |           19.72 |            14.72 |              14.73 |   133.668234
numba       |         1303.90 |             0.69 |               0.68 |   133.668234
torch (cpu) |          323.99 |           342.27 |             337.44 |   133.668234

Interpretation of this snapshot:

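  • numba steady-state is about 21x faster than numpy after JIT warmup (0.69 ms vs 14.72 ms mean), at the cost of a first call of roughly 1.3 s.
  • torch on CPU is about 23x slower than numpy in steady state for single-lap workflows in this setup (342.27 ms vs 14.72 ms mean).
  • Identical lap times across backends (133.668234 s) indicate numerical consistency for this case.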
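
The snapshot numbers also give a rough break-even point for the numba JIT cost. The sketch below uses only the table values and a simplifying assumption: the first call replaces the first lap, and every later lap costs the steady mean within the same process. The helper names are hypothetical.

```python
# (first call ms, steady mean ms) taken from the snapshot table.
NUMPY_MS = (19.72, 14.72)
NUMBA_MS = (1303.90, 0.69)


def cumulative_ms(first_call_ms, steady_ms, runs):
    """Total time for `runs` laps: one first call plus steady-state laps."""
    return first_call_ms + (runs - 1) * steady_ms


def break_even_runs(slow, fast, max_runs=10_000):
    """Smallest lap count at which `fast` has the lower cumulative time."""
    for runs in range(1, max_runs + 1):
        if cumulative_ms(*fast, runs) < cumulative_ms(*slow, runs):
            return runs
    return None


steady_speedup = NUMPY_MS[1] / NUMBA_MS[1]          # about 21.3
laps_to_pay_off = break_even_runs(NUMPY_MS, NUMBA_MS)  # 93 under these assumptions
```

Under these assumptions numba overtakes numpy in total time at 93 laps in one process, so it pays off for sweeps but not for a handful of single-lap runs.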

Reproducibility tips

  • Run each benchmark at least twice and compare medians.
  • Avoid other heavy processes during timing runs.
  • For GPU evaluation, include torch_device="cuda:0" runs in addition to CPU runs.
  • Save your benchmark JSON using --output and keep it with your project notes.
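
For ad hoc checks outside the reference scripts, the first-call and steady-state split can be reproduced with the standard library. A minimal sketch, assuming a hypothetical helper time_callable; the reference scripts may count the first call and warmup runs differently.

```python
import statistics
import time


def time_callable(run_once, warmup_runs=2, timed_runs=5):
    """Time one first call, discard warmup runs, then summarize timed runs in ms."""
    # First call: includes startup, JIT, and compile effects.
    start = time.perf_counter()
    run_once()
    first_call_ms = (time.perf_counter() - start) * 1000.0

    # Warmup runs are executed but not recorded.
    for _ in range(warmup_runs):
        run_once()

    samples_ms = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "first_call_ms": first_call_ms,
        "steady_mean_ms": statistics.fmean(samples_ms),
        "steady_median_ms": statistics.median(samples_ms),
    }
```

Pass a zero-argument callable that runs one lap; comparing steady_median_ms across repeated invocations follows the first tip above.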