Solver Backends
Sentinel uses a pluggable linear solver system built on the AbstractLinearSolver interface. All solvers implement the same linear_solve! method, so switching between backends requires only changing the solver argument — no changes to problem setup or assembly code.
AbstractLinearSolver Interface
Every solver backend subtypes AbstractLinearSolver and implements:
linear_solve!(x::AbstractVector{ComplexF64}, K::SparseMatrixCSC{ComplexF64},
f::AbstractVector{ComplexF64}, solver::YourSolver) -> xThe system K * x = f is solved in-place, overwriting x. Here K is always a complex-valued sparse matrix (complex symmetric for viscoelastic models, general for poroelastic model 4) with Dirichlet boundary conditions already applied.
Solvers that cache internal state (e.g., factorizations) should also implement:
invalidate_cache!(solver::YourSolver)This discards cached state when K changes (e.g., after re-assembly). The default fallback is a no-op.
See the Forward Solver API page for full docstrings of AbstractLinearSolver, linear_solve!, and invalidate_cache!.
DirectSolver
The default solver. Uses Julia's built-in \ operator, which dispatches to SuiteSparse UMFPACK for complex sparse matrices. No extra dependencies required.
solver = DirectSolver()
forward_solve!(disp, K, f, model, 1; solver=solver)Best for development, testing, and small to moderate problem sizes. Every call performs a fresh LU factorization.
See the Forward Solver API page for the full DirectSolver docstring.
CachedDirectSolver
Wraps UMFPACK with LU factorization caching. On the first linear_solve! call, computes lu(K) and stores it. Subsequent calls with the same K reuse the cached factorization, solving only by forward/back substitution.
solver = CachedDirectSolver()
forward_solve!(disp, K, f, model, 1; solver=solver) # ~55ms (factor + solve)
adjoint_solve!(adjnt, K, calc, meas, bcs, 1; solver=solver) # ~0.2ms (reuse)This is the primary solver for the inverse problem loop. The forward solve stores lu(K), and the adjoint solve reuses it — since both operate on the same stiffness matrix, the expensive factorization is performed only once per assembly.
Call invalidate_cache! whenever K is re-assembled. The evaluate_objective! function calls invalidate_cache! automatically after each assembly, so in normal usage the cache lifecycle is managed for you.
See the Forward Solver API page for the full CachedDirectSolver docstring.
MUMPSSolver
MUMPS (MUltifrontal Massively Parallel Sparse direct Solver) backend, provided by the SentinelMUMPSExt package extension. Loaded automatically when using MUMPS is called alongside Sentinel.
using MUMPS
solver = MUMPSSolver(sym=2) # complex symmetric (default)
forward_solve!(disp, K, f, model, 1; solver=solver)The sym parameter controls the MUMPS symmetry mode:
sym | Mode | Factorization | Use case |
|---|---|---|---|
2 | Complex symmetric | LDL^T | Viscoelastic models 1, 2, 3, 5, 6, 7 (default) |
0 | General unsymmetric | LU | Poroelastic model 4 |
Requires MPI to be initialized before use (MPI.Init() is called automatically if needed).
See the Forward Solver API page for the full MUMPSSolver docstring.
KrylovSolver
GMRES iterative solver with ILU(0) preconditioning, using Krylov.jl and IncompleteLU.jl. The ILU preconditioner is built lazily on the first solve and cached for reuse across subsequent solves with the same sparsity pattern.
solver = KrylovSolver(rtol=1e-12, atol=1e-12, memory=20, itmax=200)
forward_solve!(disp, K, f, model, 1; solver=solver)Parameters:
| Parameter | Default | Description |
|---|---|---|
rtol | 1e-12 | Relative convergence tolerance |
atol | 1e-12 | Absolute convergence tolerance |
memory | 20 | GMRES restart parameter (Krylov subspace size) |
itmax | 200 | Maximum GMRES iterations |
verbose | 0 | Verbosity level (0 = silent) |
Validation: All 276 zones in the MGH brain MRE dataset converge in 1 GMRES iteration with ILU(0) preconditioning, with maximum error of 1.91e-13 relative to the direct solver. This indicates that ILU(0) is an excellent preconditioner for the MRE stiffness matrices.
CUDAKrylovSolver
GPU-accelerated GMRES solver using cuSPARSE for sparse matrix-vector products and ILU(0) preconditioning. Provided by the SentinelCUDAExt package extension.
using CUDA
solver = CUDAKrylovSolver(rtol=1e-12, atol=1e-12, memory=20)
forward_solve!(disp, K, f, model, 1; solver=solver)The solver transfers K to the GPU, computes ILU(0) via cuSPARSE csrilu02, and runs GMRES with GPU-resident vectors. For zone-sized systems (~1700 DOFs), ILU(0) achieves 1-iteration convergence, making the effective cost one SpMV plus two triangular solves.
Note
Despite excellent convergence, CUDAKrylovSolver is slower than CUDSSBatchedSolver for zone-sized problems due to GPU kernel launch overhead. Use CUDSSBatchedSolver for production workloads.
CUDSSDirectSolver
GPU sparse direct solver via NVIDIA cuDSS. Performs per-zone factorization and solve on a dedicated CUDA stream for thread-safe concurrent operation.
using CUDA, CUDSS
solver = CUDSSDirectSolver() # allocates its own CuStream
forward_solve!(disp, K, f, model, 1; solver=solver)Key features:
Per-solver CUDA stream — Each instance owns a
CuStream, enabling safe concurrent use from multiple Julia threads during line search.Refactorization reuse — First call performs full analysis + factorization. Subsequent calls with
_needs_refactor=trueonly refactorize (same sparsity pattern). Same-K different-RHS calls (adjoint reuse) only update the RHS.Cache invalidation — Call
invalidate_cache!(solver)after material updates to trigger refactorization on the next solve.
Used in Phase 3 (line search) of the batched zone driver, where ~290 solvers run concurrently across Julia threads. Achieves ~8.5ms per refactor+solve on DGX Spark ARM cores, vs ~55ms for CPU LU on the same hardware.
CUDSSBatchedSolver
Batched GPU factorization and solve for all zones simultaneously. This is the primary production solver for Phase 1 (forward + adjoint) of the GPU zone decomposition driver.
using CUDA, CUDSS
bsolver = CUDSSBatchedSolver(structure="G")
create_batched_solver!(bsolver, zone_Ks) # symbolic analysis + factorize
batched_solve!(bsolver, zone_xs, zone_rhs) # solve all zones
batched_update_and_refactor!(bsolver, zone_Ks) # update values + refactorizeKey functions:
create_batched_solver!(solver, Ks)— Transfers all zone K matrices to GPU, creates per-zoneCudssSolverinstances, performs symbolic analysis and initial factorization.batched_solve!(solver, xs, rhs)— Solves all zone systems using cached factorizations. RHS vectors are transferred to GPU, solved, and results read back.batched_update_and_refactor!(solver, Ks)— Updates GPU-side matrix values from CPU and re-factorizes, reusing the symbolic analysis fromcreate_batched_solver!.
Handles variable-size zones (different sparsity patterns and dimensions) by maintaining independent CudssSolver instances per zone. On the MGH brain dataset (~290 zones), factorizes all zones in ~5s and solves in ~48ms.
Adding a New Solver
To add a new solver backend:
1. Define the solver type:
struct MySolver <: AbstractLinearSolver
# solver-specific configuration fields
end2. Implement linear_solve!:
function linear_solve!(x::AbstractVector{ComplexF64},
K::SparseMatrixCSC{ComplexF64},
f::AbstractVector{ComplexF64},
solver::MySolver)
# Solve K * x = f, store result in x
return x
end3. (Optional) Implement invalidate_cache!:
function invalidate_cache!(solver::MySolver)
solver.cached_factor = nothing
return nothing
end4. Export from Sentinel:
Add the type to the export list in src/forward/solver.jl:
export MySolverThe solver can then be passed to forward_solve!, adjoint_solve!, or set as the solver field in a ForwardProblemContext.
If the solver is provided by an external package, implement it as a package extension (see ext/SentinelMUMPSExt.jl for the pattern).