--- title: "The Algorithm Collection" author: "Gilles Colling" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{The Algorithm Collection} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5 ) # Check for required packages - vignette needs visualization packages has_viz <- requireNamespace("ggraph", quietly = TRUE) && requireNamespace("tidygraph", quietly = TRUE) && requireNamespace("ggplot2", quietly = TRUE) if (!has_viz) { knitr::opts_chunk$set(eval = FALSE) message("This vignette requires ggraph, tidygraph, and ggplot2 packages for visualizations.") } library(couplr) library(tibble) if (has_viz) { library(ggplot2) library(ggraph) library(tidygraph) } # Color palette - professional, accessible col_worker <- "#E07B39" # Warm orange for workers/left nodes col_job <- "#3D8EAF" # Cool blue for jobs/right nodes col_optimal <- "#5DD65D" # Bright green for optimal/selected col_nonopt <- "#B8B8B8" # Gray for non-selected col_highlight <- "#E74C3C" # Red for highlighting/current col_text <- "#3E3F3A" # Dark text (theme-aware) # Theme for network diagrams theme_graph_clean <- function() { theme_graph(base_family = "") + theme( plot.title = element_text(hjust = 0.5, face = "bold", size = 13, margin = margin(b = 8), color = col_text), plot.subtitle = element_text(hjust = 0.5, size = 10, margin = margin(b = 20), color = col_text), plot.margin = margin(20, 10, 10, 10), legend.position = "bottom", legend.title = element_text(size = 9, color = col_text), legend.text = element_text(size = 8, color = col_text), legend.box = "horizontal", legend.spacing.x = unit(0.3, "cm") ) } # Theme for non-network diagrams (bar charts, progressions) theme_diagram <- function() { theme_minimal() + theme( plot.title = element_text(hjust = 0.5, face = "bold", size = 13, margin = margin(b = 4), color = col_text), plot.subtitle = element_text(hjust = 0.5, size = 10, margin = margin(b = 8), color = col_text), plot.margin = margin(10, 10, 10, 10), panel.grid.minor = element_blank(), panel.grid.major = element_blank(), axis.text = element_blank(), axis.title = element_blank() ) } ``` ## Overview The Hungarian algorithm was published in 1955. Sixty years later, most R packages still use nothing else. couplr implements **eighteen algorithms**, including methods that exist in no other R package: Gabow-Tarjan, Orlin-Ahuja, Network Simplex, Ramshaw-Tarjan. ### Who This Vignette Is For **Audience**: Researchers curious about optimization algorithms, developers choosing the right method for their problem, anyone wondering why modern algorithms beat Hungarian by 20x or more. **Prerequisites**: - Basic understanding of assignment problems (matching rows to columns) - Familiarity with cost matrices (you want to minimize total cost) - Comfort with big-O notation (helpful but not required) **What You'll Learn**: - Why Hungarian is slow and how later algorithms improved it - How primal-dual, auction, and network flow approaches work - Which algorithm to choose for your problem - How couplr's automatic selection works --- ## The Race But first: the same problem, five different solutions. 
```{r the-race, echo=FALSE, fig.width=9, fig.height=5, fig.alt="Five algorithms solving the same 400x400 assignment problem with dramatically different speeds"} # Pre-computed timing data for 400x400 dense matrix race_data <- data.frame( algorithm = factor(c("Hungarian", "Jonker-Volgenant", "Auction", "CSA", "Network Simplex"), levels = c("Hungarian", "Jonker-Volgenant", "Auction", "CSA", "Network Simplex")), time_ms = c(180, 12, 18, 8, 35), year = c(1955, 1987, 1988, 1995, 1997) ) ggplot(race_data, aes(x = reorder(algorithm, -time_ms), y = time_ms, fill = algorithm)) + geom_col(width = 0.7) + geom_text(aes(label = paste0(time_ms, " ms")), hjust = -0.1, size = 5, fontface = "bold") + geom_text(aes(label = paste0("(", year, ")")), hjust = -0.1, vjust = 2.5, size = 3.5) + scale_fill_manual(values = c( "Hungarian" = "#d9534f", "Jonker-Volgenant" = "#5bc0de", "Auction" = "#f0ad4e", "CSA" = "#5cb85c", "Network Simplex" = "#428bca" )) + coord_flip(clip = "off") + labs( title = "Same Problem, Same Answer, 22× Speed Difference", subtitle = "400 × 400 dense cost matrix, median of 5 runs", x = NULL, y = "Time (milliseconds)" ) + theme_minimal() + theme( legend.position = "none", plot.title = element_text(face = "bold", size = 14), plot.subtitle = element_text(size = 11), axis.text.y = element_text(size = 12, face = "bold"), axis.text.x = element_text(size = 10), panel.grid.major.y = element_blank(), panel.grid.minor = element_blank(), plot.margin = margin(10, 60, 10, 10) ) + scale_y_continuous(expand = expansion(mult = c(0, 0.3))) ``` When you run five different assignment algorithms on identical input, they all find the same optimal answer, but the fastest finishes **22 times quicker** than the slowest. The slowest happens to be Hungarian, the algorithm everyone learns in textbooks. CSA, the fastest here, came out four decades later. That gap represents years of algorithmic refinement that most production software never adopted. Why would anyone need five ways to solve the same problem? Because they don't all behave the same under different conditions. The Hungarian method that handles a 100×100 matrix without complaint becomes painfully slow at 1000×1000. The Auction algorithm that dominates large dense problems stumbles on small sparse ones. Different matrix sizes, different sparsity patterns, different cost distributions: each situation favors a different algorithm. couplr gives you all of them, and it picks the right one automatically. --- ## The Problem Before the algorithms, the problem. It's simple to state: > Given $n$ workers and $n$ jobs, where assigning worker $i$ to job $j$ costs $c_{ij}$, find the assignment that minimizes total cost. Mathematically: $$ \min_{\pi} \sum_{i=1}^{n} c_{i,\pi(i)} $$ where $\pi$ is a permutation (each worker gets exactly one job, each job gets exactly one worker). 
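In code, the objective is just an indexed sum. The sketch below is plain base R (not a couplr function) and evaluates the cost of one candidate permutation:

```{r objective-by-hand}
# Rows are workers, columns are jobs
cost <- matrix(c(10, 19, 8, 15, 10, 11, 9, 12, 14), nrow = 3, byrow = TRUE)

# One candidate permutation: worker i takes job perm[i]
perm <- c(3, 2, 1)

# Objective value: sum of c[i, perm[i]] over all workers
sum(cost[cbind(seq_along(perm), perm)])
```

The algorithms below all search over such permutations; they differ only in how cleverly they do it.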
```{r bipartite-graph, fig.width=7, fig.height=4.5, echo=FALSE, fig.alt="Bipartite graph showing workers on left, jobs on right, with weighted edges and optimal assignment highlighted"} # Build graph with tidygraph nodes <- tibble( name = c("W1", "W2", "W3", "J1", "J2", "J3"), type = c(rep("Worker", 3), rep("Job", 3)), side = c(rep("left", 3), rep("right", 3)) ) edges <- tibble( from = c(1, 1, 2, 2, 3, 3), to = c(4, 5, 5, 4, 4, 6), cost = c(2, 4, 1, 3, 3, 2), optimal = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE) ) g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE) # Manual bipartite layout layout <- data.frame( x = c(0, 0, 0, 2, 2, 2), y = c(3, 2, 1, 3, 2, 1) ) # Compute label positions along edges edge_label_data <- data.frame( from_x = c(0, 0, 0, 0, 0, 0), from_y = c(3, 3, 2, 2, 1, 1), to_x = c(2, 2, 2, 2, 2, 2), to_y = c(3, 2, 2, 3, 3, 1), cost = c(2, 4, 1, 3, 3, 2), optimal = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE), pos = c(0.5, 0.25, 0.5, 0.75, 0.25, 0.5) ) edge_label_data$x <- edge_label_data$from_x + edge_label_data$pos * (edge_label_data$to_x - edge_label_data$from_x) edge_label_data$y <- edge_label_data$from_y + edge_label_data$pos * (edge_label_data$to_y - edge_label_data$from_y) edge_label_data$angle <- atan2(edge_label_data$to_y - edge_label_data$from_y, edge_label_data$to_x - edge_label_data$from_x) * 180 / pi edge_label_data$edge_color <- ifelse(edge_label_data$optimal, col_optimal, col_nonopt) # Create edge type factor for legend edges <- edges %>% mutate(edge_type = factor(ifelse(optimal, "Optimal", "Available"), levels = c("Optimal", "Available"))) g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE) ggraph(g, layout = "manual", x = layout$x, y = layout$y) + geom_edge_link(aes(edge_colour = edge_type, edge_width = edge_type)) + geom_label(data = edge_label_data, aes(x = x, y = y, label = cost, angle = angle), fill = edge_label_data$edge_color, color = "#3E3F3A", fontface = "bold", size = 5.5, label.padding = unit(0.2, "lines"), linewidth = 0) + geom_node_point(aes(fill = type), shape = 21, size = 14, color = "white", stroke = 1.5) + geom_node_text(aes(label = name), color = "white", fontface = "bold", size = 4.5) + scale_fill_manual(values = c("Worker" = col_worker, "Job" = col_job), name = NULL, guide = guide_legend(override.aes = list(size = 5))) + scale_edge_colour_manual(values = c("Optimal" = col_optimal, "Available" = col_nonopt), name = NULL) + scale_edge_width_manual(values = c("Optimal" = 1.5, "Available" = 0.8), guide = "none") + labs(title = "Assignment as Bipartite Matching", subtitle = "Optimal: W1→J1 (2) + W2→J2 (1) + W3→J3 (2) = 5") + theme_graph_clean() + coord_fixed(clip = "off") ``` Simple to state. Not simple to solve efficiently. There are $n!$ possible assignments. For $n = 20$, that's 2.4 quintillion possibilities. Brute force is impossible. We need structure. ### The LP Formulation The assignment problem is a linear program. Let $x_{ij} \in \{0,1\}$ indicate whether worker $i$ is assigned to job $j$: $$ \begin{aligned} \min \quad & \sum_{i,j} c_{ij} x_{ij} \\ \text{s.t.} \quad & \sum_j x_{ij} = 1 \quad \forall i \quad \text{(each worker assigned once)} \\ & \sum_i x_{ij} = 1 \quad \forall j \quad \text{(each job filled once)} \\ & x_{ij} \geq 0 \end{aligned} $$ The constraint matrix is **totally unimodular**: every square submatrix has determinant 0, 1, or -1. This guarantees integer solutions even when we relax $x_{ij} \in \{0,1\}$ to $x_{ij} \geq 0$. 
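You can check integrality empirically. The chunk below is a small sanity check, assuming only `lap_solve()` and `get_total_cost()` as used throughout this vignette; it compares exhaustive search against the LP-based network simplex solver described later:

```{r bruteforce-check}
set.seed(1)
n <- 6  # small enough for exhaustive search: 6! = 720 permutations
cost <- matrix(runif(n * n, 0, 100), n, n)

exact <- lap_solve(cost, method = "bruteforce")          # tries every permutation
lp_based <- lap_solve(cost, method = "network_simplex")  # solves the LP formulation

c(bruteforce = get_total_cost(exact), network_simplex = get_total_cost(lp_based))
```

The totals agree because the relaxed LP already has an integral optimum.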
### Duality: The Key to Efficiency The dual LP introduces prices $u_i$ for workers and $v_j$ for jobs: $$ \begin{aligned} \max \quad & \sum_i u_i + \sum_j v_j \\ \text{s.t.} \quad & u_i + v_j \leq c_{ij} \quad \forall i,j \end{aligned} $$ **Complementary slackness** links primal and dual: if $x_{ij} = 1$ in an optimal assignment, then $u_i + v_j = c_{ij}$ (the constraint is tight). This means optimal assignments use only **tight edges** where the dual constraint holds with equality. The **reduced cost** of edge $(i,j)$ is $\bar{c}_{ij} = c_{ij} - u_i - v_j$. An edge is tight when $\bar{c}_{ij} = 0$. Different algorithms exploit this duality in different ways. --- ## The Classics ### Hungarian Algorithm (1955) The algorithm everyone learns. Published by Harold Kuhn, based on work by Hungarian mathematicians Kőnig and Egerváry. **The idea**: Maintain dual prices $(u_i, v_j)$ such that $u_i + v_j \leq c_{ij}$ for all pairs. Edges where equality holds are "tight": the only edges that can appear in an optimal solution. ```{r hungarian-diagram, fig.width=7, fig.height=5, echo=FALSE, fig.alt="Hungarian algorithm showing alternating path augmentation through tight edges"} # Nodes: 4 workers, 4 jobs nodes <- tibble( name = c("W1", "W2", "W3", "W4", "J1", "J2", "J3", "J4"), type = c(rep("Worker", 4), rep("Job", 4)), state = c("matched", "matched", "seeking", "matched", "matched", "matched", "free", "matched") ) # Edges: matched + tight + augmenting path edges <- tibble( from = c(1, 2, 4, 3, 3, 3, 2), to = c(5, 6, 8, 7, 6, 5, 7), edge_type = factor(c("Matched", "Matched", "Matched", "Tight", "Augmenting", "Tight", "Augmenting"), levels = c("Matched", "Augmenting", "Tight")) ) g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE) layout <- data.frame( x = c(0, 0, 0, 0, 2.5, 2.5, 2.5, 2.5), y = c(4, 3, 2, 1, 4, 3, 2, 1) ) ggraph(g, layout = "manual", x = layout$x, y = layout$y) + geom_edge_link(aes(edge_colour = edge_type, edge_width = edge_type, edge_linetype = edge_type)) + geom_node_point(aes(fill = ifelse(state == "seeking", "Seeking", ifelse(state == "free", "Free", type))), shape = 21, size = 12, color = "white", stroke = 1.5) + geom_node_text(aes(label = name), color = "white", fontface = "bold", size = 4) + scale_fill_manual(values = c("Worker" = col_worker, "Job" = col_job, "Seeking" = col_highlight, "Free" = col_optimal), name = NULL, guide = guide_legend(override.aes = list(size = 4))) + scale_edge_colour_manual(values = c("Matched" = col_job, "Augmenting" = col_highlight, "Tight" = col_nonopt), name = NULL) + scale_edge_width_manual(values = c("Matched" = 1.2, "Augmenting" = 1.2, "Tight" = 0.6), guide = "none") + scale_edge_linetype_manual(values = c("Matched" = "solid", "Augmenting" = "solid", "Tight" = "dashed"), guide = "none") + labs(title = "Hungarian: Augmenting Path Search", subtitle = "W3 (red) finds path to free job J3 (green) via tight edges") + theme_graph_clean() + coord_fixed(clip = "off") ``` **The algorithm**: 1. Initialize prices. Find tight edges. 2. Find a maximum matching using only tight edges. 3. If the matching is complete, done. 4. Otherwise, update prices to create new tight edges. Repeat. **Complexity**: $O(n^3)$ **The problem**: That $O(n^3)$ hides a large constant. The price updates and augmenting path searches are expensive. For a 1000×1000 matrix, you might wait 10+ seconds. 
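Before the solver examples, here is a quick numerical look at the duality machinery, using `assignment_duals()` (shown in more detail near the end of this vignette) on the same 3×3 matrix as the next chunk. Treat it as a sketch, assuming `assignment_duals()` returns the minimization duals $(u, v)$ described above; it checks dual feasibility and strong duality:

```{r duality-check}
cost <- matrix(c(10, 19, 8, 15, 10, 11, 9, 12, 14), nrow = 3, byrow = TRUE)

duals <- assignment_duals(cost)

# Dual feasibility: reduced costs c_ij - u_i - v_j should all be >= 0,
# and the tight edges (reduced cost 0) are where the optimal matching lives
reduced <- cost - outer(duals$u, duals$v, "+")
round(reduced, 6)

# Strong duality: the dual objective equals the optimal total cost
sum(duals$u) + sum(duals$v)
get_total_cost(lap_solve(cost, method = "hungarian"))
```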
```{r hungarian-example} cost <- matrix(c(10, 19, 8, 15, 10, 11, 9, 12, 14), nrow = 3, byrow = TRUE) result <- lap_solve(cost, method = "hungarian") print(result) ``` Hungarian works. It's clean and easy to teach. But in 1987, two Dutch researchers found something faster. --- ### Jonker-Volgenant Algorithm (1987) Roy Jonker and Anton Volgenant asked: what if we start with a good guess and fix it? **The key insight**: Column reduction. Before any sophisticated search, greedily assign each row to its cheapest available column. This often gets most of the matching right immediately. ```{r jv-diagram, fig.width=9, fig.height=4.5, echo=FALSE, fig.alt="JV algorithm showing column reduction initialization followed by shortest path augmentation"} # JV visualization: cleaner flow diagram steps <- data.frame( step = c("Column\nReduction", "Reduction\nTransfer", "Augmentation"), description = c( "Greedily assign rows\nto cheapest columns", "Handle collisions\nwith dual updates", "Shortest-path search\nfor remaining rows" ), progress = c("85%", "95%", "100%"), x = c(1, 2.5, 4), phase_color = c(col_job, col_job, col_optimal) ) ggplot(steps) + # Connecting arrows annotate("segment", x = 1.55, xend = 1.95, y = 0.5, yend = 0.5, arrow = arrow(length = unit(0.2, "cm"), type = "closed"), color = col_nonopt, linewidth = 1.5) + annotate("segment", x = 3.05, xend = 3.45, y = 0.5, yend = 0.5, arrow = arrow(length = unit(0.2, "cm"), type = "closed"), color = col_nonopt, linewidth = 1.5) + # Phase boxes geom_tile(aes(x = x, y = 0.5, fill = phase_color), width = 1.1, height = 1.4, color = "white", linewidth = 1.5) + # Step titles (dark text for readability on green/blue backgrounds) geom_text(aes(x = x, y = 0.85, label = step), color = "#3E3F3A", fontface = "bold", size = 4.5, lineheight = 0.85) + # Progress indicators geom_label(aes(x = x, y = 0.1, label = progress), fill = "white", color = col_text, size = 5, fontface = "bold", label.padding = unit(0.3, "lines"), linewidth = 0) + scale_fill_identity() + # Description annotations below geom_text(aes(x = x, y = -0.55, label = description), size = 3.2, lineheight = 0.9, color = col_text) + labs(title = "Jonker-Volgenant: Start Fast, Fix Later", subtitle = "Column reduction handles most assignments; Dijkstra-style augmentation finishes the rest") + theme_diagram() + coord_fixed(ratio = 0.7, xlim = c(0.2, 4.8), ylim = c(-1, 1.4)) ``` **The algorithm**: 1. **Column reduction**: For each column, find the two smallest costs. The difference is the "advantage" of the best row. 2. **Reduction transfer**: Assign rows to columns, handling conflicts by dual variable updates. 3. **Augmentation**: For any remaining unmatched rows, use Dijkstra-style shortest path search. **Complexity**: Still $O(n^3)$, but with a much smaller constant. Often 10-50× faster than Hungarian in practice. ```{r jv-example} set.seed(123) n <- 100 cost <- matrix(runif(n * n, 0, 100), n, n) result <- lap_solve(cost, method = "jv") cat("Total cost:", round(get_total_cost(result), 2), "\n") ``` JV became the de facto standard. For dense problems up to a few thousand rows, it's hard to beat. But JV has a limitation: it's fundamentally serial. Each augmenting path depends on the previous. For very large problems, we need a different approach. --- ## The Scaling Revolution In the late 1980s, researchers discovered a powerful trick called **ε-scaling**. The idea: relax the optimality requirement. Instead of demanding exact answers at every step, tolerate a small error ε. 
Start with a large ε, which lets you make big sloppy steps and rapid progress. Then shrink ε over multiple phases until it's essentially zero. Now you have an exact answer. This transforms how the algorithm behaves. Large ε means big steps and rapid progress; small ε means careful refinement. The total work can end up being less than doing everything exactly from the start. Four algorithms exploit this insight: Auction, CSA, Gabow-Tarjan, and Orlin-Ahuja. --- ### Auction Algorithm (1988) Dimitri Bertsekas asked: what if we thought of assignment as an economics problem? **The metaphor**: Workers are buyers. Jobs are goods. Each job has a price. Workers bid for their favorite jobs. Prices rise when there's competition. Equilibrium = optimal assignment. ```{r auction-diagram, fig.width=7, fig.height=5, echo=FALSE, fig.alt="Auction algorithm bidding process showing workers bidding for jobs with prices"} # Nodes - workers and jobs with prices in labels nodes <- tibble( name = c("W1", "W2", "W3", "J1\n$5", "J2\n$3", "J3\n$0"), type = c(rep("Worker", 3), rep("Job", 3)), state = c("bidding", "matched", "waiting", "target", "matched", "available") ) # Edges with costs edges <- tibble( from = c(1, 1, 2, 2, 3, 3), to = c(4, 5, 5, 4, 4, 6), cost = c(7, 4, 5, 6, 8, 3), edge_type = factor(c("Bid", "Considering", "Matched", "Considering", "Considering", "Considering"), levels = c("Bid", "Matched", "Considering")) ) g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE) layout <- data.frame( x = c(0, 0, 0, 2, 2, 2), y = c(3, 2, 1, 3, 2, 1) ) # Edge label positions edge_label_data <- data.frame( from_x = c(0, 0, 0, 0, 0, 0), from_y = c(3, 3, 2, 2, 1, 1), to_x = c(2, 2, 2, 2, 2, 2), to_y = c(3, 2, 2, 3, 3, 1), cost = c(7, 4, 5, 6, 8, 3), edge_type = factor(c("Bid", "Considering", "Matched", "Considering", "Considering", "Considering"), levels = c("Bid", "Matched", "Considering")), pos = c(0.5, 0.25, 0.5, 0.75, 0.25, 0.5) ) edge_label_data$x <- edge_label_data$from_x + edge_label_data$pos * (edge_label_data$to_x - edge_label_data$from_x) edge_label_data$y <- edge_label_data$from_y + edge_label_data$pos * (edge_label_data$to_y - edge_label_data$from_y) edge_label_data$angle <- atan2(edge_label_data$to_y - edge_label_data$from_y, edge_label_data$to_x - edge_label_data$from_x) * 180 / pi edge_label_data$edge_color <- ifelse(edge_label_data$edge_type == "Bid", col_highlight, ifelse(edge_label_data$edge_type == "Matched", col_optimal, col_nonopt)) ggraph(g, layout = "manual", x = layout$x, y = layout$y) + geom_edge_link(aes(edge_colour = edge_type, edge_width = edge_type, edge_linetype = edge_type)) + geom_label(data = edge_label_data, aes(x = x, y = y, label = paste0("$", cost), angle = angle), fill = edge_label_data$edge_color, color = "#3E3F3A", fontface = "bold", size = 5, label.padding = unit(0.2, "lines"), linewidth = 0) + geom_node_point(data = . %>% filter(type == "Worker"), aes(fill = ifelse(state == "bidding", "Bidding", ifelse(state == "matched", "Matched", "Worker"))), shape = 21, size = 14, color = "white", stroke = 1.5) + geom_node_point(data = . 
%>% filter(type == "Job"), aes(fill = ifelse(state == "matched", "Matched", "Job")), shape = 21, size = 14, color = "white", stroke = 1.5) + geom_node_text(aes(label = name), color = "white", fontface = "bold", size = 3.5, lineheight = 0.9) + scale_fill_manual(values = c("Worker" = col_worker, "Job" = col_job, "Bidding" = col_highlight, "Matched" = col_optimal), name = NULL, guide = guide_legend(order = 1, override.aes = list(size = 4))) + scale_edge_colour_manual(values = c("Bid" = col_highlight, "Matched" = col_optimal, "Considering" = col_nonopt), name = NULL, guide = guide_legend(order = 2)) + scale_edge_width_manual(values = c("Bid" = 1.5, "Matched" = 1.5, "Considering" = 0.8), guide = "none") + scale_edge_linetype_manual(values = c("Bid" = "solid", "Matched" = "solid", "Considering" = "dashed"), guide = "none") + labs(title = "Auction: Bidding Phase", subtitle = "W1 (red) bids on J1 at price $5; W2 already matched to J2 at $3") + theme_graph_clean() + coord_fixed(clip = "off") ``` **The algorithm**: 1. Each unassigned worker finds their best job (highest value minus price). 2. The worker bids: new price = old price + (best value - second-best value) + ε. 3. If someone else held that job, they become unassigned. 4. Repeat until everyone is assigned. **Why ε matters**: Without ε, two workers could bid infinitely against each other, each raising the price by 0. The ε ensures progress. **Complexity**: $O(n^2 \log(nC) / \epsilon)$ where $C$ is the cost range. couplr offers three Auction variants: | Variant | `method =` | Key Feature | |---------|------------|-------------| | Standard | `"auction"` | Adaptive ε, queue-based | | Scaled | `"auction_scaled"` | ε-scaling phases | | Gauss-Seidel | `"auction_gs"` | Sequential sweep | ```{r auction-example} set.seed(123) n <- 100 cost <- matrix(runif(n * n, 0, 100), n, n) result <- lap_solve(cost, method = "auction") cat("Total cost:", round(get_total_cost(result), 2), "\n") ``` Auction shines for large dense problems. But it's sensitive to ε. Get it wrong and performance degrades, or the algorithm cycles forever. The next algorithm makes ε-scaling systematic. --- ### Cost-Scaling Algorithm / CSA (1995) Andrew Goldberg and Robert Kennedy asked: what if we scale ε automatically? **The idea**: Start with $\epsilon = \max(c_{ij})$. In each phase, halve ε and refine the current solution. After $O(\log C)$ phases, ε is essentially zero: optimality. 
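How many phases is $O(\log C)$ in practice? A back-of-the-envelope sketch (plain R, not couplr's internals), assuming the usual stopping rule for integer costs of ε below $1/(n+1)$:

```{r epsilon-phases}
C <- 100000  # largest cost
n <- 1000    # problem size

eps <- C
phases <- 0
while (eps >= 1 / (n + 1)) {  # stop once eps is small enough to force optimality
  eps <- eps / 2
  phases <- phases + 1
}
phases                        # roughly log2(C * (n + 1)) halvings
```

Fewer than 30 halvings cover a cost range of 100,000 on a 1000×1000 problem.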
```{r csa-diagram, fig.width=9, fig.height=4, echo=FALSE, fig.alt="CSA algorithm showing epsilon-scaling phases converging to optimal solution"} # CSA visualization: epsilon scaling progression phases <- data.frame( phase = 1:5, epsilon = c(100, 50, 25, 12, 6), x = 1:5 ) # Gradient of colors from light to dark green phase_colors <- colorRampPalette(c("#A8D5BA", col_optimal))(5) ggplot(phases) + # Connecting line annotate("segment", x = 1, xend = 5, y = 0.5, yend = 0.5, linewidth = 3, color = col_optimal) + # Phase points with increasing intensity geom_point(aes(x = x, y = 0.5), size = c(15, 16, 17, 18, 20), color = "white") + geom_point(aes(x = x, y = 0.5), size = c(13, 14, 15, 16, 18), color = phase_colors) + # Epsilon labels inside circles (dark text, same size) geom_text(aes(x = x, y = 0.5, label = paste0("\u03b5=", epsilon)), color = "#3E3F3A", fontface = "bold", size = 3.5) + # Phase labels below geom_text(aes(x = x, y = 0.15, label = paste0("Phase ", phase)), size = 3.5, color = col_text) + # Direction arrow annotate("segment", x = 1, xend = 5, y = 0.9, yend = 0.9, arrow = arrow(length = unit(0.2, "cm"), type = "closed"), linewidth = 1, color = col_text) + annotate("text", x = 3, y = 0.98, label = "\u03b5 halves each phase \u2192 precision improves", size = 3.8, color = col_text) + labs(title = "CSA: Systematic \u03b5-Scaling", subtitle = "Each phase halves \u03b5 and refines the assignment until optimal") + theme_diagram() + coord_fixed(ratio = 1.5, xlim = c(0.3, 5.7), ylim = c(0, 1.15)) ``` **Why it's fast**: Each phase is cheap because the previous phase's solution is a good starting point. The algorithm exploits its own progress. **Complexity**: $O(n^3)$ amortized, often faster in practice. ```{r csa-example} set.seed(456) n <- 100 cost <- matrix(runif(n * n, 0, 100), n, n) result <- lap_solve(cost, method = "csa") cat("Total cost:", round(get_total_cost(result), 2), "\n") ``` CSA often wins benchmarks for medium-large dense problems. It's the workhorse. But there's an even stranger approach: what if instead of scaling costs, you scaled *bits*? --- ### Gabow-Tarjan Algorithm (1989) Harold Gabow and Robert Tarjan developed a clever algorithm based on binary representations. It's also one of the most complex to implement. **The insight**: Integer costs have a natural scale: binary digits. Process costs from most significant to least significant bit. At each scale, solve a simpler problem. Use that solution to warm-start the next scale. 
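To see what coarse-to-fine means here, the sketch below truncates an integer cost matrix to its top $k$ bits and solves each truncated problem. This is only an illustration: it re-solves each scale from scratch with JV, whereas Gabow-Tarjan reuses the previous scale's duals.

```{r bit-scaling-sketch}
set.seed(7)
n <- 5
cost <- matrix(sample(0:255, n * n, replace = TRUE), n, n)  # 8-bit integer costs

# Keeping only the top k bits gives a sequence of coarser problems;
# solving each one approximates (and warm-starts) the next, finer one.
# At k = 8 the coarse problem is the original, so the last value is exact.
for (k in c(2, 4, 6, 8)) {
  shift <- 8 - k
  coarse <- cost %/% 2^shift   # drop the low-order bits
  rescaled <- get_total_cost(lap_solve(coarse, method = "jv")) * 2^shift
  cat("top", k, "bits -> rescaled optimal cost:", rescaled, "\n")
}
```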
```{r gabow-tarjan-diagram, fig.width=9, fig.height=4.5, echo=FALSE, fig.alt="Gabow-Tarjan bit-scaling showing costs processed from high bits to low bits"} # Bit-scaling visualization bits <- data.frame( bit = 7:0, x = 1:8, label = c("128", "64", "32", "16", "8", "4", "2", "1"), stage = c(rep("done", 4), "current", rep("pending", 3)) ) # Colors for stages stage_colors <- c("done" = col_optimal, "current" = "#F39C12", "pending" = col_nonopt) ggplot(bits) + # Bit boxes with status coloring geom_tile(aes(x = x, y = 0.5, fill = stage), width = 0.85, height = 0.7, color = "white", linewidth = 2) + # Bit value labels geom_text(aes(x = x, y = 0.5, label = label), fontface = "bold", size = 5, color = "white") + # Bit position labels geom_text(aes(x = x, y = 0.02, label = paste0("bit ", 8 - bit)), size = 3, color = col_text) + scale_fill_manual(values = stage_colors, labels = c("done" = "Processed", "current" = "Current", "pending" = "Remaining"), name = "") + # Direction indicator annotate("segment", x = 1, xend = 8, y = 1.05, yend = 1.05, arrow = arrow(length = unit(0.2, "cm"), type = "closed"), linewidth = 1.2, color = col_text) + annotate("text", x = 4.5, y = 1.18, label = "Process most significant \u2192 least significant", size = 3.8, color = col_text) + labs(title = "Gabow-Tarjan: Bit-Scaling", subtitle = "Process integer costs bit-by-bit from coarse to fine. Complexity: O(n\u00b3 log C)") + theme_diagram() + theme(legend.position = "bottom", legend.text = element_text(size = 10, color = col_text)) + coord_fixed(ratio = 1.2, xlim = c(0.3, 8.7), ylim = c(-0.15, 1.35)) ``` **The algorithm** (simplified): 1. Initialize at the coarsest scale (most significant bit only). 2. Double the scale: multiply all costs by 2. This "doubles" the current solution's slack. 3. Restore **1-feasibility** (see below). 4. Use Hungarian-style search for augmenting paths. 5. Repeat until all bits are processed. **What is 1-feasibility?** Standard dual feasibility requires $u_i + v_j \leq c_{ij}$ for all edges. 1-feasibility relaxes this: we allow $u_i + v_j \leq c_{ij} + 1$. The "1" comes from the current bit position. At each scaling phase, we only need reduced costs to be within 1 of optimal. When we refine to the next bit, we tighten the bound. After processing all $\log C$ bits, the slack is less than 1, which for integers means exactly 0: true optimality. **Complexity**: $O(n^3 \log C)$ where $C$ is the maximum cost. Rarely seen outside academic papers. The bookkeeping across scaling phases is complex enough that most implementations skip it. ```{r gabow-tarjan-example} set.seed(42) n <- 50 # Use integer costs with large range - Gabow-Tarjan's strength cost <- matrix(sample(1:100000, n * n, replace = TRUE), n, n) result <- lap_solve(cost, method = "gabow_tarjan") cat("Total cost:", get_total_cost(result), "\n") ``` Gabow-Tarjan is primarily of theoretical interest. It provides the best known worst-case bounds for integer costs. But there's one more scaling algorithm, with even better theoretical complexity. --- ### Orlin-Ahuja Algorithm (1992) James Orlin and Ravindra Ahuja developed a **double-scaling** algorithm that scales both costs AND capacities simultaneously. **The insight**: Cost-scaling alone gives $O(n^3 \log C)$. But if we also scale *flow capacities*, we can exploit the structure of sparse graphs. At each scale, we only need to push $O(\sqrt{n})$ units of flow before refining. **The algorithm**: 1. Scale costs as in Gabow-Tarjan (process bits from high to low). 2. 
At each cost scale, use **capacity scaling**: - Start with large capacity increments $\Delta = 2^k$ - Find augmenting paths that can carry $\Delta$ flow - Halve $\Delta$ and repeat until $\Delta = 1$ 3. The capacity scaling limits work per phase to $O(m)$ augmentations. **Why $\sqrt{n}$ appears**: The assignment problem has $n$ units of flow total. With capacity scaling, each phase handles $O(\sqrt{n})$ flow units, and there are $O(\sqrt{n})$ phases per cost scale. This geometric structure yields the improved bound. **Complexity**: $O(\sqrt{n} \cdot m \cdot \log(nC))$ where $m$ is the number of edges. For sparse problems where $m \ll n^2$, this is dramatically better than $O(n^3)$. ```{r orlin-example} set.seed(111) n <- 50 cost <- matrix(sample(1:100000, n * n, replace = TRUE), n, n) result <- lap_solve(cost, method = "orlin") cat("Total cost:", get_total_cost(result), "\n") ``` Orlin-Ahuja provides the best theoretical bounds for sparse problems with large cost ranges. The implementation complexity is substantial: maintaining blocking flows across scaling phases requires careful data structure engineering. In practice, the overhead often makes it slower than CSA for dense problems. But for large sparse instances, it's asymptotically optimal. That's four scaling algorithms, each trading precision for speed in a different way. But there's a completely different way to think about the problem entirely. --- ## The Network View Every algorithm so far thinks in terms of assignments: matching workers to jobs. But assignment problems are secretly **flow problems**. Model the assignment as a network: - A source node connected to all workers (capacity 1 each) - Workers connected to jobs (with costs) - Jobs connected to a sink node (capacity 1 each) - Find minimum-cost flow of value $n$ This perspective gives us two more algorithms. --- ### Network Simplex The simplex method, specialized for networks. The key insight: for network flow problems, the simplex basis corresponds to a **spanning tree** of the graph. This makes basis operations (pivoting) much faster than general LP. 
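Concretely, the flow formulation for a 2×2 assignment looks like the arc list below. This is a plain data frame for illustration only (couplr's solvers take the cost matrix directly), and it is the same network drawn in the next figure:

```{r flow-network-arcs}
# Every arc has capacity 1; the source S supplies 2 units and the sink T demands 2
cost <- matrix(c(3, 5, 4, 2), nrow = 2, byrow = TRUE)

arcs <- rbind(
  data.frame(from = "S", to = c("W1", "W2"), capacity = 1, cost = 0),
  data.frame(from = rep(c("W1", "W2"), each = 2),
             to   = rep(c("J1", "J2"), times = 2),
             capacity = 1,
             cost = as.vector(t(cost))),
  data.frame(from = c("J1", "J2"), to = "T", capacity = 1, cost = 0)
)
arcs
```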
```{r network-simplex-diagram, fig.width=7, fig.height=4.5, echo=FALSE, fig.alt="Network simplex spanning tree structure for assignment problem"} # Flow network: Source -> Workers -> Jobs -> Sink col_source <- "#9B59B6" # Purple for source/sink nodes <- tibble( name = c("S", "W1", "W2", "J1", "J2", "T"), type = c("Source/Sink", "Worker", "Worker", "Job", "Job", "Source/Sink") ) # Edges: source->workers, workers->jobs (with costs), jobs->sink # Tree edges shown solid, non-tree dashed edges <- tibble( from = c(1, 1, 2, 2, 3, 3, 4, 5), to = c(2, 3, 4, 5, 4, 5, 6, 6), cost = c(0, 0, 3, 5, 4, 2, 0, 0), in_tree = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE) ) g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE) # Layout: Source | Workers | Jobs | Sink layout <- data.frame( x = c(0, 1.2, 1.2, 2.4, 2.4, 3.6), y = c(1.5, 2, 1, 2, 1, 1.5) ) # Edge labels for worker->job edges only (costs > 0) edge_label_data <- data.frame( from_x = c(1.2, 1.2, 1.2, 1.2), from_y = c(2, 2, 1, 1), to_x = c(2.4, 2.4, 2.4, 2.4), to_y = c(2, 1, 2, 1), cost = c(3, 5, 4, 2), in_tree = c(TRUE, FALSE, FALSE, TRUE), pos = c(0.5, 0.3, 0.7, 0.5) ) edge_label_data$x <- edge_label_data$from_x + edge_label_data$pos * (edge_label_data$to_x - edge_label_data$from_x) edge_label_data$y <- edge_label_data$from_y + edge_label_data$pos * (edge_label_data$to_y - edge_label_data$from_y) edge_label_data$angle <- atan2(edge_label_data$to_y - edge_label_data$from_y, edge_label_data$to_x - edge_label_data$from_x) * 180 / pi edge_label_data$edge_color <- ifelse(edge_label_data$in_tree, col_optimal, col_nonopt) ggraph(g, layout = "manual", x = layout$x, y = layout$y) + # Edges geom_edge_link(aes(edge_colour = in_tree, edge_width = in_tree, edge_linetype = in_tree)) + # Cost labels on worker->job edges geom_label(data = edge_label_data, aes(x = x, y = y, label = cost, angle = angle), fill = edge_label_data$edge_color, color = "#3E3F3A", fontface = "bold", size = 4.5, label.padding = unit(0.15, "lines"), linewidth = 0) + # Nodes with fill aesthetic for legend geom_node_point(aes(fill = type), shape = 21, size = 12, color = "white", stroke = 1.5) + # Node labels geom_node_text(aes(label = name), color = "white", fontface = "bold", size = 4) + scale_fill_manual(values = c("Source/Sink" = col_source, "Worker" = col_worker, "Job" = col_job), name = NULL, guide = guide_legend(order = 1, override.aes = list(size = 4))) + scale_edge_colour_manual(values = c("TRUE" = col_optimal, "FALSE" = col_nonopt), labels = c("Non-tree", "Tree"), name = NULL, guide = guide_legend(order = 2)) + scale_edge_width_manual(values = c("TRUE" = 1.5, "FALSE" = 0.8), guide = "none") + scale_edge_linetype_manual(values = c("TRUE" = "solid", "FALSE" = "dashed"), guide = "none") + labs(title = "Network Simplex: Flow Network", subtitle = "Assignment as min-cost flow: S sends 1 unit through each worker to sink T") + theme_graph_clean() + coord_fixed(clip = "off") ``` **The algorithm**: 1. **Initialize**: Find a spanning tree $T$ that supports a feasible flow (e.g., all flow through a single hub). 2. **Compute potentials**: For tree edges, set $\pi_i - \pi_j = c_{ij}$. This determines all node potentials uniquely (up to a constant). 3. **Price non-tree edges**: For each edge $(i,j) \notin T$, compute reduced cost $\bar{c}_{ij} = c_{ij} - \pi_i + \pi_j$. 4. **Pivot**: If any non-tree edge has $\bar{c}_{ij} < 0$, adding it to $T$ creates a cycle. Push flow around the cycle until some tree edge saturates. Remove that edge; the non-tree edge enters the basis. 5. 
**Repeat** until all reduced costs are non-negative (optimality). **Why trees?**: In a tree, there's exactly one path between any two nodes. This means: - Flow is uniquely determined by supplies/demands - Potentials can be computed in $O(n)$ by tree traversal - Each pivot changes $O(n)$ potentials (along a tree path), not $O(n^2)$ **Complexity**: $O(n^3)$ typical, strongly polynomial with anti-cycling rules. **Best for**: Problems where you need dual variables for sensitivity analysis. Problems already formulated as network flows. Cases where you want guaranteed finite convergence. ```{r network-simplex-example} set.seed(789) n <- 100 cost <- matrix(runif(n * n, 0, 100), n, n) result <- lap_solve(cost, method = "network_simplex") cat("Total cost:", round(get_total_cost(result), 2), "\n") ``` Network Simplex is a standard tool in operations research. Not always the fastest, but reliable and provides rich dual information. --- ### Push-Relabel Algorithm Goldberg and Tarjan's push-relabel algorithm (1988), adapted for minimum-cost flow. **The key idea**: Traditional algorithms maintain a valid flow and search for augmenting paths. Push-relabel inverts this: maintain a **preflow** (flow that may violate conservation at intermediate nodes) and work locally to eliminate violations. **Key concepts**: - **Excess**: $e(v) = \text{flow in} - \text{flow out}$. Preflow allows $e(v) \geq 0$ at non-sink nodes. - **Height function**: $h: V \to \mathbb{Z}_{\geq 0}$ with $h(t) = 0$ and $h(u) \leq h(v) + 1$ for residual edges $(u,v)$. - **Admissible edge**: Residual edge $(u,v)$ where $h(u) = h(v) + 1$ (flow goes "downhill"). **The algorithm**: 1. **Initialize**: Saturate all edges from source. Set $h(s) = n$, $h(v) = 0$ for $v \neq s$. 2. While any node $v$ has excess $e(v) > 0$: - **Push**: If admissible edge $(v, w)$ exists, push $\min(e(v), \text{capacity})$ flow along it. - **Relabel**: If no admissible edge, set $h(v) = 1 + \min\{h(w) : (v,w) \text{ is residual}\}$. 3. When no excess remains at intermediate nodes, we have a valid maximum flow. **For minimum-cost flow**: Modify to only push along edges with zero reduced cost, and relabel using potential updates. This gives the **cost-scaling push-relabel** variant. **Complexity**: $O(n^2 m)$ for max-flow, $O(n^3 \log(nC))$ for min-cost flow. **Strengths**: Highly parallelizable since pushes are local operations. Excellent cache behavior. Dominates in practice for max-flow; competitive for min-cost flow on dense graphs. ```{r push-relabel-example} set.seed(222) n <- 100 cost <- matrix(runif(n * n, 0, 100), n, n) result <- lap_solve(cost, method = "push_relabel") cat("Total cost:", round(get_total_cost(result), 2), "\n") ``` Two network perspectives. Same problem. Different algorithmic approaches. But all these algorithms assume dense, square matrices. Real problems are messier. --- ## The Specialists ### HK01: Binary Costs When costs are only 0 or 1, we don't need the full machinery. **The algorithm**: Hopcroft-Karp for maximum cardinality matching, run on zero-cost edges first. Then add 1-cost edges as needed. **Complexity**: $O(n^{2.5})$ for binary costs. ```{r hk01-example} set.seed(101) n <- 100 cost <- matrix(sample(0:1, n^2, replace = TRUE, prob = c(0.3, 0.7)), n, n) result <- lap_solve(cost, method = "hk01") cat("Total cost:", get_total_cost(result), "\n") ``` When you have binary costs and large $n$, HK01 is dramatically faster. --- ### SAP and LAPMOD: Sparse Problems When 80% of entries are forbidden (Inf or NA), why store them? 
**SAP** (Shortest Augmenting Path) and **LAPMOD** use sparse representations: adjacency lists instead of dense matrices.

**Complexity**: $O(n^2 + nm)$ where $m$ is the number of allowed edges.

```{r sap-example}
set.seed(789)
n <- 100
cost <- matrix(Inf, n, n)
edges <- sample(1:(n^2), floor(0.2 * n^2))  # Only 20% allowed
cost[edges] <- runif(length(edges), 0, 100)
result <- lap_solve(cost, method = "sap")
cat("Total cost:", round(get_total_cost(result), 2), "\n")
```

For very sparse problems, SAP can be orders of magnitude faster than dense algorithms.

---

### Ramshaw-Tarjan: Rectangular Problems (2012)

Most algorithms assume square matrices. When you have $n$ workers and $m > n$ jobs, standard approaches pad with $m - n$ dummy workers at zero cost. This works but wastes effort: the algorithm processes $m \times m$ entries when only $n \times m$ matter.

Ramshaw and Tarjan (2012) developed an algorithm that handles rectangularity natively by exploiting the structure of **unbalanced bipartite graphs**.

**The key insight**: In a rectangular assignment, we match all $n$ rows but only $n$ of the $m$ columns. The dual problem has different structure: row duals $u_i$ are unconstrained, but column duals $v_j$ must satisfy $v_j \leq 0$ for unmatched columns.

**The algorithm**:

1. Maintain dual variables $(u, v)$ with $u_i + v_j \leq c_{ij}$ and $v_j \leq 0$ for free columns.
2. Use a modified Dijkstra search that respects the asymmetric dual constraints.
3. When augmenting, update duals to preserve feasibility without padding.

**Complexity**: $O(nm + n^2 \log n)$ using Fibonacci heaps, or $O(nm \log n)$ with simpler binary heaps. For highly rectangular problems (e.g., matching 100 treatments to 10,000 controls), this avoids the $O(m^3)$ cost of padding to square.

```{r ramshaw-tarjan-example}
set.seed(333)
n_rows <- 30
n_cols <- 100  # Highly rectangular: 30 × 100
cost <- matrix(runif(n_rows * n_cols, 0, 100), n_rows, n_cols)
result <- lap_solve(cost, method = "ramshaw_tarjan")
cat("Matched", sum(result$assignment > 0), "of", n_rows, "rows\n")
```

When you have significantly more columns than rows (or vice versa), Ramshaw-Tarjan avoids the wasted work of padding. The newest algorithm in couplr's collection, and essential for large-scale matching problems where treatment and control pools have very different sizes.

---

## Beyond Standard Assignment

couplr includes specialized solvers for variations on the assignment problem.

### K-Best Solutions (Murty's Algorithm)

What if you want the 2nd best assignment? The 3rd best? The k-th best?

```{r murty-example}
cost <- matrix(c(10, 19, 8, 15, 10, 18, 7, 17, 13, 16, 9, 14, 12, 19, 8, 18), nrow = 4, byrow = TRUE)
kbest <- lap_solve_kbest(cost, k = 5)
summary(kbest)
```

**Use cases**: Sensitivity analysis. Alternative plans when the optimal is infeasible. Understanding how costs affect solutions.

### Bottleneck Assignment

Minimize the **maximum** edge cost instead of the sum.

```{r bottleneck-example}
cost <- matrix(c(5, 9, 2, 10, 3, 7, 8, 4, 6), nrow = 3, byrow = TRUE)
result <- bottleneck_assignment(cost)
cat("Bottleneck (max edge):", result$bottleneck, "\n")
```

**Use cases**: Load balancing. Fairness constraints. Worst-case optimization.

### Sinkhorn: Soft Assignment

Entropy-regularized optimal transport. Instead of hard 0/1 assignment, produce a doubly-stochastic transport plan.
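Under the hood, Sinkhorn-style methods alternately rescale the rows and columns of $\exp(-\lambda C)$ (Cuturi, 2013). Here is a minimal base-R sketch on the same 2×2 matrix as the next chunk; couplr's `sinkhorn()` may use different normalization conventions, so the numbers need not match exactly:

```{r sinkhorn-by-hand}
cost <- matrix(c(1, 2, 3, 4), nrow = 2)
lambda <- 10

P <- exp(-lambda * cost)        # Gibbs kernel: cheap cells get large mass
for (iter in 1:50) {
  P <- P / rowSums(P)           # rescale rows to sum to 1
  P <- t(t(P) / colSums(P))     # rescale columns to sum to 1
}
round(P, 3)                     # approximately doubly stochastic
```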
```{r sinkhorn-example} cost <- matrix(c(1, 2, 3, 4), nrow = 2) result <- sinkhorn(cost, lambda = 10) print(round(result$transport_plan, 3)) ``` **Use cases**: Probabilistic matching. Domain adaptation. Wasserstein distances. ### Dual Variables Extract dual prices for sensitivity analysis. ```{r duals-example} cost <- matrix(c(10, 19, 8, 15, 10, 18, 7, 17, 13), nrow = 3, byrow = TRUE) result <- assignment_duals(cost) cat("Row duals (u):", result$u, "\n") cat("Col duals (v):", result$v, "\n") ``` **Use cases**: Shadow prices. Identifying critical assignments. Marginal cost analysis. --- ## The Benchmark You've seen what the algorithms do. Now: how fast? ```{r benchmark-plot, fig.width=9, fig.height=6, echo=FALSE, fig.alt="Runtime comparison of LAP algorithms across problem sizes showing CSA and JV leading"} bench_results <- data.frame( method = factor(rep(c("Hungarian", "Jonker-Volgenant", "Auction", "CSA", "Network Simplex"), 4), levels = c("Hungarian", "Jonker-Volgenant", "Auction", "CSA", "Network Simplex")), size = rep(c(100, 200, 400, 800), each = 5), time = c( # n=100 5, 1, 3, 2, 3, # n=200 35, 4, 8, 5, 12, # n=400 250, 25, 30, 20, 80, # n=800 2000, 180, 200, 120, 600 ) ) ggplot(bench_results, aes(x = size, y = time, color = method, group = method)) + geom_line(linewidth = 1.2) + geom_point(size = 3, aes(shape = method)) + scale_y_log10(labels = function(x) sprintf("%.0f", x)) + scale_x_continuous(breaks = c(100, 200, 400, 800)) + scale_color_manual(values = c( "Hungarian" = "#d9534f", "Jonker-Volgenant" = "#5bc0de", "Auction" = "#f0ad4e", "CSA" = "#5cb85c", "Network Simplex" = "#428bca" )) + labs( title = "Algorithm Scaling: Dense Matrices", subtitle = "Log scale. CSA and JV consistently fastest. Hungarian falls behind.", x = "Matrix Size (n × n)", y = "Time (milliseconds, log scale)", color = "Algorithm", shape = "Algorithm" ) + theme_minimal() + theme( legend.position = "bottom", plot.title = element_text(face = "bold", size = 14, colour = "#3E3F3A"), plot.subtitle = element_text(size = 11, colour = "#3E3F3A"), panel.grid.minor = element_blank() ) ``` **For dense matrices**: CSA and JV are consistently fastest. Hungarian falls behind rapidly. Auction and Network Simplex are solid middle-ground choices. ```{r sparse-plot, fig.width=8, fig.height=4, echo=FALSE, fig.alt="Sparse algorithm performance showing SAP and LAPMOD outperforming dense algorithms"} sparse_results <- data.frame( method = factor(rep(c("JV (dense)", "SAP (sparse)", "LAPMOD (sparse)"), 3), levels = c("JV (dense)", "SAP (sparse)", "LAPMOD (sparse)")), size = rep(c(100, 200, 400), each = 3), time = c( # n=100 3, 0.8, 0.7, # n=200 15, 2, 1.8, # n=400 100, 8, 7 ) ) ggplot(sparse_results, aes(x = size, y = time, color = method, group = method)) + geom_line(linewidth = 1.2) + geom_point(size = 3) + scale_x_continuous(breaks = c(100, 200, 400)) + labs( title = "Sparse vs Dense: 80% Forbidden Entries", subtitle = "Sparse algorithms (SAP, LAPMOD) dramatically outperform dense (JV)", x = "Matrix Size (n × n)", y = "Time (milliseconds)", color = "Algorithm" ) + scale_color_manual(values = c("#5bc0de", "#5cb85c", "#f0ad4e")) + theme_minimal() + theme( legend.position = "bottom", plot.title = element_text(face = "bold", size = 14, colour = "#3E3F3A"), plot.subtitle = element_text(size = 11, colour = "#3E3F3A") ) ``` **For sparse matrices**: SAP and LAPMOD are 10× faster than dense algorithms. Use them. 
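The curves above come from pre-computed timings on one machine. To get numbers for your own hardware, something along these lines works (not evaluated here to keep the vignette build fast; timings vary by machine):

```{r timing-sketch, eval = FALSE}
set.seed(1)
n <- 400
cost <- matrix(runif(n * n, 0, 100), n, n)

# Rough single-run timings; use a benchmarking package for anything serious
for (m in c("hungarian", "jv", "auction", "csa", "network_simplex")) {
  elapsed <- system.time(lap_solve(cost, method = m))["elapsed"]
  cat(sprintf("%-16s %7.3f s\n", m, elapsed))
}
```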
--- ## Quick Reference | Algorithm | Complexity | Best For | Method | |-----------|------------|----------|--------| | Hungarian | $O(n^3)$ | Pedagogy, small $n$ | `"hungarian"` | | Jonker-Volgenant | $O(n^3)$ expected | General purpose | `"jv"` | | Auction | $O(n^2 \log(nC)/\epsilon)$ | Large dense | `"auction"` | | Auction (Gauss-Seidel) | $O(n^2 \log(nC)/\epsilon)$ | Spatial structure | `"auction_gs"` | | Auction (Scaled) | $O(n^2 \log(nC)/\epsilon)$ | Large dense, fastest | `"auction_scaled"` | | CSA | $O(n^3)$ amortized | Medium-large dense | `"csa"` | | Cost-Scaling Flow | $O(n^3 \log(nC))$ | General min-cost flow | `"csflow"` | | Gabow-Tarjan | $O(n^3 \log C)$ | Large integer costs | `"gabow_tarjan"` | | Orlin-Ahuja | $O(\sqrt{n} m \log(nC))$ | Large sparse | `"orlin"` | | Network Simplex | $O(n^3)$ typical | Dual info needed | `"network_simplex"` | | Push-Relabel | $O(n^2 m)$ | Max-flow style | `"push_relabel"` | | Cycle Canceling | $O(n^2 m \cdot C)$ | Theoretical interest | `"cycle_cancel"` | | HK01 | $O(n^{2.5})$ | Binary costs only | `"hk01"` | | SSAP (Dial) | $O(n^2 + nm)$ | Integer costs, buckets | `"ssap_bucket"` | | SAP | $O(n^2 + nm)$ | Sparse (>50% forbidden) | `"sap"` | | LAPMOD | $O(n^2 + nm)$ | Sparse (>50% forbidden) | `"lapmod"` | | Ramshaw-Tarjan | $O(nm \log n)$ | Rectangular matrices | `"ramshaw_tarjan"` | | Brute Force | $O(n!)$ | Tiny ($n \leq 8$) | `"bruteforce"` | Or just use `method = "auto"` and let couplr choose. --- ## References - Kuhn, H. W. (1955). The Hungarian method for the assignment problem. *Naval Research Logistics Quarterly*. - Jonker, R., & Volgenant, A. (1987). A shortest augmenting path algorithm for dense and sparse linear assignment problems. *Computing*. - Bertsekas, D. P. (1988). The auction algorithm: A distributed relaxation method. *Annals of Operations Research*. - Gabow, H. N., & Tarjan, R. E. (1989). Faster scaling algorithms for network problems. *SIAM Journal on Computing*. - Goldberg, A. V., & Kennedy, R. (1995). An efficient cost scaling algorithm for the assignment problem. *Mathematical Programming*. - Orlin, J. B., & Ahuja, R. K. (1992). New scaling algorithms for the assignment and minimum mean cycle problems. *Mathematical Programming*. - Ramshaw, L., & Tarjan, R. E. (2012). On minimum-cost assignments in unbalanced bipartite graphs. *HP Labs Technical Report*. - Goldberg, A. V., & Tarjan, R. E. (1988). A new approach to the maximum-flow problem. *Journal of the ACM*. - Murty, K. G. (1968). An algorithm for ranking all assignments in order of increasing cost. *Operations Research*. - Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. *NeurIPS*. - Burkard, R., Dell'Amico, M., & Martello, S. (2009). *Assignment Problems*. SIAM. ---