Compute Backends

ApexSim separates physical modeling from numerical execution backends. This page helps you choose the right backend for your study.

Shared Core Architecture

Solver backends intentionally share one algorithmic core where possible:

Quasi-static speed profile:
  shared core: src/apexsim/simulation/_profile_core.py
  numpy adapter: src/apexsim/simulation/profile.py
  torch adapter: src/apexsim/simulation/torch_profile.py
  numba keeps specialized kernels in src/apexsim/simulation/numba_profile.py with parity checks against shared semantics.
Transient PID infrastructure:
  shared helpers: src/apexsim/simulation/_transient_pid_core.py
  shared control mesh/encoding: src/apexsim/simulation/_transient_controls_core.py
  shared progress reporting: src/apexsim/simulation/_progress.py
Vehicle backend physics primitives:
  shared formulas: src/apexsim/vehicle/_backend_physics_core.py
  consumed by point-mass and single-track backend adapters.

This reduces backend-specific formula drift while preserving specialized performance paths (especially numba kernels).
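
The shared-core idea can be illustrated with a minimal sketch. The function names and the formula below are hypothetical and chosen only for illustration; they are not taken from `_profile_core.py`. The core is written once against an array namespace `xp`, and a thin adapter supplies the concrete library.

```python
import numpy as np


def lateral_speed_limit(xp, curvature, a_lat_max, v_max):
    """Shared core (hypothetical): written once against an array namespace `xp`."""
    # v = sqrt(a_lat / |kappa|), capped at v_max.
    # The small epsilon avoids division by zero on straights (kappa == 0).
    kappa = xp.abs(curvature) + 1e-12
    return xp.minimum(xp.sqrt(a_lat_max / kappa), v_max)


def lateral_speed_limit_numpy(curvature, a_lat_max, v_max):
    """Thin NumPy adapter: converts inputs, then delegates to the shared core."""
    return lateral_speed_limit(np, np.asarray(curvature, dtype=float), a_lat_max, v_max)


# Straight, gentle corner, tight corner.
limits = lateral_speed_limit_numpy([0.0, 0.01, 0.04], a_lat_max=16.0, v_max=100.0)
```

Because the formula lives in one place, an adapter for another array library would follow the same pattern after converting inputs to that library's array type, which is what keeps backend-specific drift out of the physics.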

Backend policy

Supported backends are intentionally restricted to:

numpy: CPU-only reference backend.
numba: CPU-only JIT-accelerated backend.
torch: CPU and GPU backend (device via torch_device).

Solver modes:

quasi_static (default)
transient_oc

The runtime validator enforces this policy:

numpy and numba require torch_device="cpu".
compute_backend="torch" currently requires torch_compile=False so the solver path remains AD-compatible by default.
solver_mode="transient_oc" with driver_model="optimal_control" requires backend-specific extras:
  numpy and numba: scipy
  torch: torchdiffeq
TransientNumericsConfig.pid_gain_scheduling_mode is supported in PID mode on all three backends (numpy, numba, torch).
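
The device and compile rules above can be restated as a small check. This is a hypothetical sketch of the documented policy, not ApexSim's actual validator; the function name and the use of `ValueError` are assumptions.

```python
SUPPORTED_BACKENDS = ("numpy", "numba", "torch")


def validate_backend_policy(compute_backend, torch_device="cpu", torch_compile=False):
    """Hypothetical restatement of the documented policy (not the library's code)."""
    if compute_backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"compute_backend must be one of {SUPPORTED_BACKENDS}, got {compute_backend!r}"
        )
    # numpy and numba are CPU-only, so any other device is rejected.
    if compute_backend in ("numpy", "numba") and torch_device != "cpu":
        raise ValueError(f"{compute_backend} is CPU-only; torch_device must be 'cpu'")
    # The torch path currently requires torch_compile=False to stay AD-compatible.
    if compute_backend == "torch" and torch_compile:
        raise ValueError("compute_backend='torch' currently requires torch_compile=False")


validate_backend_policy("torch", torch_device="cuda:0")  # accepted
```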

Current model support

Backend support is model-dependent:

PointMassModel: supports numpy, numba, torch.
SingleTrackModel: supports numpy, numba, torch.

Terminology note:

SingleTrackModel corresponds to the "bicycle model" terminology often used in literature.

If you request a backend that a model does not implement, the solver raises a clear ConfigurationError describing the missing model-side methods.
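
A check of this kind could look roughly like the sketch below. Everything in it is hypothetical: the `ConfigurationError` class is a stand-in defined locally, and the method names are invented for illustration rather than taken from ApexSim's model interface.

```python
class ConfigurationError(ValueError):
    """Local stand-in for the library's error type (hypothetical definition)."""


def require_backend_methods(model, backend, required_methods):
    """Raise if `model` lacks any callable named in `required_methods`."""
    missing = [
        name for name in required_methods
        if not callable(getattr(model, name, None))
    ]
    if missing:
        raise ConfigurationError(
            f"{type(model).__name__} does not support backend {backend!r}; "
            f"missing methods: {', '.join(missing)}"
        )


class DemoModel:
    """Toy model that implements only one of two invented backend hooks."""

    def lateral_limit_numpy(self):
        return 0.0


try:
    require_backend_methods(
        DemoModel(), "torch", ["lateral_limit_numpy", "lateral_limit_torch"]
    )
except ConfigurationError as exc:
    message = str(exc)
```

Listing the missing method names in the message tells a model author exactly what to implement to enable the requested backend.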

Decision guide

Use these rules of thumb:

Use numpy when you need robust baseline behavior and easiest debugging.
Use numba for large CPU parameter sweeps with PointMassModel or SingleTrackModel.
Use torch when you need tensor-native workflows, GPU execution, or AD-first optimization workflows.
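
The rules above can be condensed into a small helper. This is a hypothetical sketch; `suggest_backend` and its parameters are not part of ApexSim, and the precedence between the rules is an assumption.

```python
def suggest_backend(needs_gpu=False, needs_autodiff=False, large_cpu_sweep=False):
    """Map the decision guide onto a backend name (illustrative helper only)."""
    # GPU execution and AD-first workflows are only offered by the torch backend.
    if needs_gpu or needs_autodiff:
        return "torch"
    # Large CPU sweeps amortize the JIT warmup of numba.
    if large_cpu_sweep:
        return "numba"
    # Otherwise prefer the reference backend for easy debugging.
    return "numpy"


backend = suggest_backend(large_cpu_sweep=True)
```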

Trade-offs at a glance

| Backend | Hardware | Typical strength | Typical cost |
| --- | --- | --- | --- |
| `numpy` | CPU | Stable baseline, easy to inspect | Slower for huge batch studies |
| `numba` | CPU | Very fast steady-state loops after JIT warmup | First call includes compile overhead |
| `torch` | CPU/GPU | Backend portability, tensor ecosystem | Higher overhead for single-lap CPU workloads |

Transient dependency note

The transient solver dependencies are included in the default install:

scipy for NumPy/Numba transient optimal-control paths
torchdiffeq for Torch transient optimal-control ODE integration
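
If an environment was assembled by hand, it can be useful to confirm that the relevant package is importable before starting a transient optimal-control run. The sketch below uses only the standard library; the helper name and the mapping are illustrative and simply restate the list above.

```python
from importlib.util import find_spec

# Backend -> package needed for transient optimal control (restated from this page).
TRANSIENT_OC_DEPENDENCY = {
    "numpy": "scipy",
    "numba": "scipy",
    "torch": "torchdiffeq",
}


def missing_transient_dependency(compute_backend):
    """Return the name of the missing package, or None if it is importable."""
    package = TRANSIENT_OC_DEPENDENCY[compute_backend]
    # find_spec returns None when a top-level package cannot be found.
    return None if find_spec(package) is not None else package


status = {name: missing_transient_dependency(name) for name in TRANSIENT_OC_DEPENDENCY}
```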

Configuration examples

NumPy (CPU reference)

from apexsim.simulation import build_simulation_config

config = build_simulation_config(
    compute_backend="numpy",
    max_speed=115.0,
)

Numba (CPU-optimized)

from apexsim.simulation import build_simulation_config

config = build_simulation_config(
    compute_backend="numba",
    max_speed=115.0,
)

Torch (CPU or GPU)

from apexsim.simulation import build_simulation_config

# CPU
config_cpu = build_simulation_config(
    compute_backend="torch",
    torch_device="cpu",
    torch_compile=False,
)

# GPU
config_gpu = build_simulation_config(
    compute_backend="torch",
    torch_device="cuda:0",
    torch_compile=False,
)
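
When the same script must run on machines with and without a GPU, the device string can be chosen at runtime. The helper below is a hypothetical convenience, not part of ApexSim; it relies only on `torch.cuda.is_available()` and falls back to CPU if torch itself is not installed.

```python
def pick_torch_device(preferred="cuda:0"):
    """Return `preferred` if CUDA is usable, otherwise "cpu" (illustrative helper)."""
    try:
        import torch
    except ImportError:
        # Without torch there is no GPU path to select.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_torch_device()
```

The resulting string can then be passed as torch_device to build_simulation_config together with compute_backend="torch".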

Benchmark methodology

Reference scripts:

python examples/backend_benchmarks.py --warmup-runs 5 --timed-runs 20
python scripts/benchmark_solver_matrix.py --warmup-runs 2 --timed-runs 5 --output baseline.json
python scripts/benchmark_solver_matrix.py --warmup-runs 2 --timed-runs 5 --output candidate.json
python scripts/compare_solver_benchmarks.py --baseline baseline.json --candidate candidate.json --max-slowdown-pct 5 --require-same-cases

Notes:

Benchmarks run full Spa point-mass laps (data/spa_francorchamps.csv).
"First Call" includes startup/JIT/compile effects.
"Steady" values are from repeated post-warmup runs.
Use your own machine data for final backend decisions.
The solver matrix script covers:
  models: PointMassModel, SingleTrackModel
  solver modes: quasi_static, transient_oc (PID)
  tracks: straight, circle, plus Spa smoke (unless --skip-spa).
compare_solver_benchmarks.py is intended as a PR gate for the 5%-slowdown policy.
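
The distinction between "First Call" and "Steady" timings can be reproduced with a small harness. This is a minimal sketch and not the code of the reference scripts; the function name, the result keys, and the choice to count the first call as one of the warmup runs are assumptions.

```python
import statistics
import time


def time_callable(fn, warmup_runs=2, timed_runs=5):
    """Time `fn` in milliseconds, separating the first call from steady-state runs."""

    def one_run_ms():
        start = time.perf_counter()
        fn()
        return (time.perf_counter() - start) * 1000.0

    # The first call includes startup/JIT/compile effects.
    first_call_ms = one_run_ms()
    # Remaining warmup runs are executed but not recorded.
    for _ in range(max(warmup_runs - 1, 0)):
        fn()
    # Steady-state samples are taken only after warmup.
    steady = [one_run_ms() for _ in range(timed_runs)]
    return {
        "first_call_ms": first_call_ms,
        "steady_mean_ms": statistics.mean(steady),
        "steady_median_ms": statistics.median(steady),
    }


# Stand-in workload; in practice this would be one full solver run.
result = time_callable(lambda: sum(range(10_000)))
```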

Benchmark snapshot (February 17, 2026)

Environment used for this snapshot:

CPU: Intel Core i7-8550U (4C/8T)
numpy==2.3.5
numba==0.63.1
torch==2.10.0+cpu
CUDA unavailable in this run

| Backend | First Call [ms] | Steady Mean [ms] | Steady Median [ms] | Lap Time [s] |
| --- | --- | --- | --- | --- |
| `numpy` | 19.72 | 14.72 | 14.73 | 133.668234 |
| `numba` | 1303.90 | 0.69 | 0.68 | 133.668234 |
| `torch` (`cpu`) | 323.99 | 342.27 | 337.44 | 133.668234 |

Interpretation of this snapshot:

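numba steady-state is about 21x faster than numpy after JIT warmup (0.69 ms vs. 14.72 ms mean), at the cost of a first call of roughly 1.3 s.
torch on CPU is about 23x slower than numpy for single-lap workflows in this setup (342.27 ms vs. 14.72 ms mean).
Identical lap times across backends, to the printed precision, indicate numerical consistency for this case.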
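
The snapshot numbers also indicate how many laps a single process must run before the numba warmup pays off. The sketch below is a back-of-the-envelope estimate that assumes every lap after the first costs the steady mean and ignores any on-disk JIT caching; the helper name is illustrative.

```python
# First-call and steady-mean timings in milliseconds, copied from the snapshot table.
SNAPSHOT_MS = {
    "numpy": {"first": 19.72, "steady": 14.72},
    "numba": {"first": 1303.90, "steady": 0.69},
}


def total_ms(backend, n_laps):
    """Estimated wall time for n_laps in one process: first call plus steady laps."""
    timing = SNAPSHOT_MS[backend]
    return timing["first"] + (n_laps - 1) * timing["steady"]


# Steady-state speedup of numba over numpy.
speedup = SNAPSHOT_MS["numpy"]["steady"] / SNAPSHOT_MS["numba"]["steady"]

# Smallest lap count at which numba's total time drops below numpy's.
break_even_laps = next(
    n for n in range(1, 10_000) if total_ms("numba", n) < total_ms("numpy", n)
)
```

Under these assumptions the break-even is 93 laps on this machine, so for a handful of laps numpy remains the cheaper choice even though numba's steady state is far faster.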

Reproducibility tips

Run each benchmark at least twice and compare medians.
Avoid other heavy processes during timing runs.
For GPU evaluation, include torch_device="cuda:0" runs in addition to CPU runs.
Save your benchmark JSON using --output and keep it with your project notes.
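
When comparing medians from two runs, a relative slowdown figure is a convenient summary. The helpers below are a hypothetical sketch of that arithmetic; they do not reproduce the logic or the JSON schema of compare_solver_benchmarks.py, and the 5% default only mirrors the threshold used in the reference command.

```python
def slowdown_pct(baseline_median_ms, candidate_median_ms):
    """Relative change of the candidate versus the baseline, in percent."""
    return (candidate_median_ms / baseline_median_ms - 1.0) * 100.0


def exceeds_slowdown(baseline_median_ms, candidate_median_ms, max_slowdown_pct=5.0):
    """True if the candidate is slower than the baseline by more than the limit."""
    return slowdown_pct(baseline_median_ms, candidate_median_ms) > max_slowdown_pct


# Example medians in milliseconds (made-up values for illustration).
within_limit = not exceeds_slowdown(10.0, 10.4)
regression = exceeds_slowdown(10.0, 11.0)
```

A negative result from `slowdown_pct` means the candidate run was faster than the baseline.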