GALAHAD 2.0.0

- Author: Richard A. Feiss IV, Ph.D.
- Version: 2.0.0
- License: MIT
- Institution: Minnesota Center for Prion Research and Outreach (MNPRO), University of Minnesota
- GitHub: https://github.com/RFeissIV/GALAHAD


Overview

GALAHAD is a gradient-based optimizer for smooth objectives over mixed-geometry parameter spaces — problems where some parameters must be positive (rates, concentrations, Hill coefficients) and others are unconstrained (location parameters, regression coefficients, log-EC50).

Version 2 replaces the hard-clamp positivity enforcement of v1 with a softplus reparameterization, ensuring that positivity constraints are handled analytically and that gradients remain well-defined at the constraint boundary. It also adds rho-based trust-region adaptation, relative function-stall detection, and a richer per-iteration diagnostic history.
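The change of variables can be sketched as follows. This is an illustrative reconstruction of the idea, not GALAHAD's internal code; the names `softplus`, `sigmoid`, and `grad_z` are hypothetical:

```r
# A positive parameter theta is optimized through an unconstrained z:
#   theta = softplus(z) = log(1 + exp(z)) > 0 for every finite z,
# so no clamping is needed and d(theta)/dz = sigmoid(z) is smooth everywhere.
softplus <- function(z) log1p(exp(z))
sigmoid  <- function(z) 1 / (1 + exp(-z))

# Chain rule: the gradient in z-space is the theta-space gradient
# scaled by sigmoid(z).
grad_z <- function(grad_theta, z) grad_theta(softplus(z)) * sigmoid(z)

# Example: V(theta) = theta^2, so dV/dtheta = 2 * theta
grad_z(function(theta) 2 * theta, 0)   # 2 * log(2) * 0.5 = log(2)
```

Because `sigmoid(z)` never reaches zero at finite `z`, the reparameterized gradient stays well-defined arbitrarily close to the positivity boundary.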


What changed from v1

| Feature | v1 | v2 |
| --- | --- | --- |
| Positivity constraint | Hard clamp to `eps_safe` | Softplus reparameterization |
| Parameter partition API | `{T, P, E}` (mandatory) | `{positive, euclidean}` (preferred) |
| Trust-region adaptation | Binary accept/reject | rho (actual/predicted reduction) |
| Stall detection | Absolute only | Absolute + relative (`tol_f_rel`) |
| History columns | 8 (no rho, Armijo count) | 10 (adds `armijo_iters`, `pred_red`, `rho`) |
| Halpern averaging | Yes (extra eval/iter) | Removed |
| Lyapunov tracking | Yes (explicit violations) | Removed (rho history is equivalent) |
| Exported helpers | None | `galahad_numgrad()`, `galahad_parts()` |

Quick start

# install.packages("GALAHAD")
library(GALAHAD)

# Fit an exponential decay: y = A * exp(-k * t)
# Both A > 0 and k > 0
set.seed(1)
t <- seq(0, 5, by = 0.5)
y <- 3 * exp(-0.8 * t) + rnorm(length(t), sd = 0.05)

obj <- function(theta) sum((y - theta[1] * exp(-theta[2] * t))^2)
grd <- function(theta) {
  r  <- y - theta[1] * exp(-theta[2] * t)
  c(-2 * sum(r * exp(-theta[2] * t)),
    -2 * sum(r * (-t) * theta[1] * exp(-theta[2] * t)))
}

fit <- GALAHAD(
  V      = obj,
  gradV  = grd,
  theta0 = c(A = 2, k = 0.5),
  parts  = list(positive = c(1L, 2L), euclidean = integer(0))
)

fit$theta      # ~ c(3, 0.8)
fit$converged  # TRUE
fit$reason     # "GRAD_TOL" or "FUNC_STALL_*"

When you don’t have an analytical gradient

Use the built-in finite-difference helper:

grd_num <- function(theta) galahad_numgrad(obj, theta)

fit <- GALAHAD(obj, grd_num, theta0 = c(2, 0.5),
               parts = list(positive = c(1L, 2L), euclidean = integer(0)))
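For reference, a minimal central-difference gradient looks like the sketch below. GALAHAD's actual differencing scheme and step size are not documented here, so treat the `h` default (and the `numgrad` name) as assumptions:

```r
numgrad <- function(f, theta, h = 1e-6) {
  # Central differences: (f(theta + h*e_i) - f(theta - h*e_i)) / (2h)
  vapply(seq_along(theta), function(i) {
    e <- replace(numeric(length(theta)), i, h)
    (f(theta + e) - f(theta - e)) / (2 * h)
  }, numeric(1))
}

numgrad(function(x) sum(x^2), c(1, 2))   # ~ c(2, 4)
```

Note that a numeric gradient costs two objective evaluations per parameter per call, so analytical gradients are preferable when available.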

Migrating from v1

The calling convention is otherwise identical. Three changes may require code edits:

  1. theta0[positive] must be strictly > 0. v1 silently clamped zero or negative starting values; v2 raises an informative error. Fix by ensuring starting values are positive.

  2. parts form — either update to {positive, euclidean} or leave as {T, P, E} (both still work; T and P indices both map to positive).

  3. History schema — if your code accesses fit$history by column name, note that delta_V_rel and lyapunov_ok are removed; armijo_iters, pred_red, and rho are added.
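The first two items can be sketched as a before/after snippet (the specific values are illustrative):

```r
# 1. Ensure strictly positive starting values; v2 raises an error
#    instead of clamping them the way v1 did.
theta0 <- c(A = 0, k = 0.5)
theta0[theta0 <= 0] <- 1e-3    # any small positive seed works

# 2. Both partition forms are accepted; T and P indices map to "positive".
parts_v1 <- list(T = 1L, P = 2L, E = integer(0))                 # legacy v1 form
parts_v2 <- list(positive = c(1L, 2L), euclidean = integer(0))   # preferred v2 form
```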


Algorithmic summary

| Component | Description |
| --- | --- |
| Softplus reparameterization | θᵢ = log(1 + exp(zᵢ)); gradient corrected by the sigmoid chain rule |
| Diagonal preconditioning | Curvature estimate via secant EMA: 0.8·L + 0.2·\|y/s\| |
| Step-size selection | Polyak → Barzilai-Borwein (BB2) → constant default |
| Armijo backtracking | Sufficient decrease with configurable max backtrack steps |
| Trust-region projection | Scaled M-norm; radius adapted by the rho ratio |
| Convergence | Gradient + step tolerances; absolute + relative function stall |
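To make the trust-region and step-size rows concrete, here is a textbook rho update in the style of Conn, Gould & Toint (2000) together with a BB2 step length (Barzilai & Borwein, 1988). The 0.25/0.75 thresholds and the function names are illustrative assumptions, not GALAHAD's actual constants:

```r
# rho compares the achieved decrease with the model's predicted decrease.
rho_ratio <- function(f_old, f_new, pred_red) (f_old - f_new) / pred_red

update_radius <- function(rho, radius, step_norm) {
  if (rho < 0.25) {
    0.25 * radius                              # poor model fit: shrink
  } else if (rho > 0.75 && step_norm >= 0.99 * radius) {
    2 * radius                                 # good fit at the boundary: grow
  } else {
    radius                                     # otherwise keep
  }
}

# BB2 step length from s = x_k - x_{k-1} and yv = g_k - g_{k-1}.
bb2_step <- function(s, yv) sum(s * yv) / sum(yv * yv)
```

A rho near 1 means the local model predicted the objective decrease well, so the radius can safely grow; a small or negative rho signals that the model is untrustworthy at the current radius.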


Development transparency

Development followed an iterative human-machine collaboration. All algorithmic design, statistical methodology, and biological validation logic were conceptualized, tested, and refined by Richard A. Feiss through repeated cycles of running experimental data, evaluating analytical outputs, and selecting among candidate algorithms and approaches. AI systems (Anthropic Claude and OpenAI GPT) served as coding assistants and analytical sounding boards under continuous human direction. The selection of statistical methods, the evaluation of biological plausibility, and all final methodology decisions were made by the human author. AI systems did not independently originate algorithms, statistical approaches, or scientific methodologies.


References

Barzilai, J., & Borwein, J. M. (1988). Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8(1), 141–148. https://doi.org/10.1093/imanum/8.1.141

Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). Trust-Region Methods. SIAM. https://doi.org/10.1137/1.9780898719857

Dugas, C., Bengio, Y., Bélisle, F., Nadeau, C., & Garcia, R. (2009). Incorporating functional knowledge in neural networks. Journal of Machine Learning Research, 10(42), 1239–1262. https://www.jmlr.org/papers/v10/dugas09a.html

Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer. ISBN 978-0-387-30303-1.

Polyak, B. T. (1969). The conjugate gradient method in extremal problems. USSR Computational Mathematics and Mathematical Physics, 9(4), 94–112. https://doi.org/10.1016/0041-5553(69)90035-4

Xu, X., & An, C. (2024). A trust region method with regularized Barzilai-Borwein step-size for large-scale unconstrained optimization. arXiv preprint. https://doi.org/10.48550/arXiv.2409.14383