Author: Richard A. Feiss IV, Ph.D.
Version: 2.0.0
License: MIT
Institution: Minnesota Center for Prion Research and Outreach (MNPRO), University of Minnesota
GitHub: https://github.com/RFeissIV/GALAHAD
GALAHAD is a gradient-based optimizer for smooth objectives over mixed-geometry parameter spaces — problems where some parameters must be positive (rates, concentrations, Hill coefficients) and others are unconstrained (location parameters, regression coefficients, log-EC50).
Version 2 replaces the hard-clamp positivity enforcement of v1 with a softplus reparameterization, ensuring that positivity constraints are handled analytically and that gradients remain well-defined at the constraint boundary. It also adds rho-based trust-region adaptation, relative function-stall detection, and a richer per-iteration diagnostic history.
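The idea behind the softplus reparameterization can be shown in a minimal, self-contained sketch (this is illustrative, not the package source): optimize over an unconstrained variable z, map it to theta > 0 via theta = log(1 + exp(z)), and correct the gradient by the chain rule d theta / d z = sigmoid(z).

```r
# Illustrative sketch of softplus reparameterization (not GALAHAD internals).
softplus     <- function(z) log1p(exp(z))
inv_softplus <- function(theta) log(expm1(theta))  # map a positive start into z-space
sigmoid      <- function(z) 1 / (1 + exp(-z))

# A toy objective over theta > 0, e.g. V(theta) = (theta - 2)^2
V     <- function(theta) (theta - 2)^2
gradV <- function(theta) 2 * (theta - 2)

# Equivalent unconstrained problem in z:
V_z    <- function(z) V(softplus(z))
grad_z <- function(z) gradV(softplus(z)) * sigmoid(z)  # sigmoid chain rule

z <- inv_softplus(0.5)                      # start from theta = 0.5
for (i in 1:200) z <- z - 0.1 * grad_z(z)   # plain gradient descent on z
softplus(z)                                 # approximately 2, and always > 0
```

Because the map is smooth and strictly positive, the gradient is well-defined everywhere in z-space; there is no constraint boundary to clamp against.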
| Feature | v1 | v2 |
|---|---|---|
| Positivity constraint | Hard clamp to eps_safe | Softplus reparameterization |
| Parameter partition API | {T, P, E} (mandatory) | {positive, euclidean} (preferred) |
| Trust-region adaptation | Binary accept/reject | rho (actual/predicted reduction) |
| Stall detection | Absolute only | Absolute + relative (tol_f_rel) |
| History columns | 8 (no rho, armijo count) | 10 (adds armijo_iters, pred_red, rho) |
| Halpern averaging | Yes (extra eval/iter) | Removed |
| Lyapunov tracking | Yes (explicit violations) | Removed (rho history equivalent) |
| Exported helpers | None | galahad_numgrad(), galahad_parts() |
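The rho-based adaptation above follows the standard trust-region recipe (Conn, Gould & Toint): compare the realized decrease with the model's predicted decrease and resize the radius accordingly. A sketch, with thresholds (0.25 / 0.75) and factors that are my assumptions rather than the package's documented values:

```r
# Standard rho-based trust-region radius update (illustrative thresholds;
# GALAHAD's exact constants may differ).
update_radius <- function(delta, rho, delta_max = 10) {
  if (rho < 0.25) {
    delta / 4                  # model predicted poorly: shrink the radius
  } else if (rho > 0.75) {
    min(2 * delta, delta_max)  # model predicted very well: expand (capped)
  } else {
    delta                      # acceptable fit: keep the radius
  }
}

# rho = actual reduction / predicted reduction
update_radius(1, 0.9 / 1.0)  # rho = 0.9 -> expand to 2
update_radius(1, 0.1)        # rho = 0.1 -> shrink to 0.25
```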
```r
# install.packages("GALAHAD")
library(GALAHAD)

# Fit an exponential decay: y = A * exp(-k * t)
# Both A > 0 and k > 0
set.seed(1)
t <- seq(0, 5, by = 0.5)
y <- 3 * exp(-0.8 * t) + rnorm(length(t), sd = 0.05)

obj <- function(theta) sum((y - theta[1] * exp(-theta[2] * t))^2)
grd <- function(theta) {
  r <- y - theta[1] * exp(-theta[2] * t)
  c(-2 * sum(r * exp(-theta[2] * t)),
    -2 * sum(r * (-t) * theta[1] * exp(-theta[2] * t)))
}

fit <- GALAHAD(
  V = obj,
  gradV = grd,
  theta0 = c(A = 2, k = 0.5),
  parts = list(positive = c(1L, 2L), euclidean = integer(0))
)

fit$theta      # ~ c(3, 0.8)
fit$converged  # TRUE
fit$reason     # "GRAD_TOL" or "FUNC_STALL_*"
```

If you do not have an analytic gradient, use the built-in finite-difference helper:

```r
grd_num <- function(theta) galahad_numgrad(obj, theta)
fit <- GALAHAD(obj, grd_num, theta0 = c(2, 0.5),
               parts = list(positive = c(1L, 2L), euclidean = integer(0)))
```

The calling convention is identical. The only changes that may require code edits:
- `theta0[positive]` must be strictly > 0. v1 silently clamped zero or negative starting values; v2 raises an informative error. Fix by ensuring starting values are positive.
- `parts` form: either update to `{positive, euclidean}` or leave as `{T, P, E}` (both still work; `T` and `P` indices both map to `positive`).
- History schema: if your code accesses `fit$history` by column name, note that `delta_V_rel` and `lyapunov_ok` are removed; `armijo_iters`, `pred_red`, and `rho` are added.
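The internals of `galahad_numgrad()` are not documented here; a central-difference sketch of what such a helper typically computes (the step size `h` and the name `numgrad` are my choices, not the package's):

```r
# Central-difference numerical gradient: a plausible stand-in for
# galahad_numgrad() when sanity-checking an analytic gradient.
numgrad <- function(f, theta, h = 1e-6) {
  vapply(seq_along(theta), function(i) {
    e <- rep(0, length(theta)); e[i] <- h
    (f(theta + e) - f(theta - e)) / (2 * h)  # O(h^2) truncation error
  }, numeric(1))
}

# Agreement check against a known analytic gradient (2 parameters):
f  <- function(th) sum(th^2) + prod(th)
gf <- function(th) 2 * th + rev(th)  # gradient of th1^2 + th2^2 + th1*th2
th <- c(1.5, -0.3)
max(abs(numgrad(f, th) - gf(th)))    # near machine precision for this quadratic
```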
| Component | Description |
|---|---|
| Softplus reparameterization | θ_i = log(1 + exp(z_i)); gradient corrected by the sigmoid chain rule |
| Diagonal preconditioning | Curvature estimate via secant EMA: L ← 0.8·L + 0.2·|y/s| |
| Step-size selection | Polyak → Barzilai-Borwein (BB2) → constant default |
| Armijo backtracking | Sufficient decrease with configurable max backtrack steps |
| Trust-region projection | Scaled M-norm; radius adapted by rho ratio |
| Convergence | Gradient + step tolerance; absolute + relative function stall |
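Of the step-size chain above, the BB2 rule has a simple closed form (Barzilai & Borwein, 1988): alpha = <s, y> / <y, y>, where s = x_k - x_{k-1} and y = g_k - g_{k-1}. A sketch with a guard and fallback of my own choosing (the package's fallback value and safeguards may differ):

```r
# BB2 step size: alpha = <s, y> / <y, y> (Barzilai & Borwein 1988).
# The Polyak -> BB2 -> constant fallback chain is summarized in the table;
# this sketch shows only the BB2 piece, with illustrative safeguards.
bb2_step <- function(s, y, fallback = 1e-2) {
  yy <- sum(y * y)
  if (yy <= .Machine$double.eps) return(fallback)   # no curvature information
  alpha <- sum(s * y) / yy
  if (!is.finite(alpha) || alpha <= 0) fallback else alpha
}

# On a quadratic V(x) = 0.5 * x' A x with A = diag(c(1, 4)), the BB2 step
# lands between the inverse extreme eigenvalues, [1/4, 1]:
A  <- diag(c(1, 4))
g  <- function(x) as.vector(A %*% x)
x0 <- c(1, 1); x1 <- c(0.9, 0.6)
bb2_step(x1 - x0, g(x1) - g(x0))   # ~0.253, inside [1/4, 1]
```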
Development followed an iterative human-machine collaboration. All algorithmic design, statistical methodology, and biological validation logic were conceptualized, tested, and iteratively refined by Richard A. Feiss through repeated cycles of running experimental data, evaluating analytical outputs, and selecting among candidate algorithms and approaches. AI systems (Anthropic Claude and OpenAI GPT) served as coding assistants and analytical sounding boards under continuous human direction. The selection of statistical methods, the evaluation of biological plausibility, and all final methodology decisions were made by the human author. The AI systems did not independently originate algorithms, statistical approaches, or scientific methodologies.
Barzilai, J., & Borwein, J. M. (1988). Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8(1), 141–148. https://doi.org/10.1093/imanum/8.1.141
Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). Trust-Region Methods. SIAM. https://doi.org/10.1137/1.9780898719857
Dugas, C., Bengio, Y., Belisle, F., Nadeau, C., & Garcia, R. (2009). Incorporating functional knowledge in neural networks. Journal of Machine Learning Research, 10(42), 1239–1262. https://www.jmlr.org/papers/v10/dugas09a.html
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer. ISBN 978-0-387-30303-1.
Polyak, B. T. (1969). The conjugate gradient method in extremal problems. USSR Computational Mathematics and Mathematical Physics, 9(4), 94–112. https://doi.org/10.1016/0041-5553(69)90035-4
Xu, X., & An, C. (2024). A trust region method with regularized Barzilai-Borwein step-size for large-scale unconstrained optimization. arXiv preprint. https://doi.org/10.48550/arXiv.2409.14383