Algorithm Details¶
This page describes the mathematical foundations and algorithmic steps
implemented in orc_bound. Refer to paper [1] for details.
[1] Xiang Gu, Huichun Zhang, Jian Sun, A Residual-Shell-Based Lower Bound for Ollivier-Ricci Curvature. https://arxiv.org/abs/2604.12211.
Mathematical Background¶
Ollivier–Ricci Curvature (ORC) of an edge \((u, v)\) is defined as:
where:
\(\mu_u\) is a probability measure at node \(u\)
\(d(u, v)\) is the shortest-path distance
\(W_1\) is the 1-Wasserstein (earth mover) distance between the two measures
Computing \(W_1\) exactly is intractable for large graphs. This package computes a tight upper bound on \(W_1\), which yields a lower bound on the curvature.
Step 1 — k-Hop Lazy Random Walk Measure¶
For each node \(x\), the k-hop lazy random walk measure is:
where:
\(\alpha \in [0, 1]\) is the lazy parameter
\(\delta_x\) is the Dirac delta at \(x\)
\(P\) is the row-stochastic transition matrix: \(P[x, y] = 1 / \deg(x)\) if \((x, y)\) is an edge
\(P^k\) is the k-th matrix power, giving k-step transition probabilities
The support of \(\mu_x^{(k)}\) is exactly the k-hop neighborhood of \(x\).
Implementation: orc_bound.build_lazy_measures_k()
Step 2 — Truncated All-Pairs Shortest Paths¶
The maximum distance between any two points in \(\mathrm{supp}(\mu_u^{(k)})\) and \(\mathrm{supp}(\mu_v^{(k)})\) is bounded by:
Hence we precompute the distance matrix only up to cutoff \(2k + 1\):
This avoids the \(O(|V|^3)\) cost of full APSP.
Implementation: orc_bound.all_pairs_shortest_path_matrix_cutoff()
Step 3 — Residual-Shell Upper Bound on \(W_1\)¶
For an edge \((u, v)\), let \(\mu = \mu_u^{(k)}\) and \(\nu = \mu_v^{(k)}\). We compute an upper bound \(\bar{W}_1(\mu, \nu)\) as follows.
Bucket construction. Group all pairs \((x, y)\) with \(x \in \mathrm{supp}(\mu)\), \(y \in \mathrm{supp}(\nu)\) into \(l + 1\) buckets by distance:
Greedy matching. Initialize \(a[x] = \mu[x]\), \(b[y] = \nu[y]\). For each bucket \(r\) from 0 to \(l\):
Residual shell. After all buckets are processed:
If \(R_l > 0\), compute the residual distance:
Upper bound:
Implementation: orc_bound.residual_shell_upper_bound()
Step 4 — Curvature Lower Bound¶
The ORC lower bound for edge \((u, v)\) is:
Note that \(d(u, v) = 1\) for all edges in an unweighted graph.
Implementation: orc_bound.residual_shell_ricci_approximation()
Full Algorithm Pseudocode¶
Input: Graph G = (V, E), integer k ≥ 1, lazy parameter α ∈ [0, 1],
shell radius l ≥ 0, tolerance tol > 0, n_jobs workers
Output: Sparse matrix C of ORC lower bounds
# ── 1. Build k-hop measures ─────────────────────────────────────
P ← row-stochastic transition matrix of G
P^k ← P to the power k # matrix power
for each node x ∈ V do
for each node y ∈ V do
μ_x[y] ← α · [x = y] + (1 - α) · P^k[x, y]
# ── 2. Truncated APSP ────────────────────────────────────────────
cutoff ← 2k + 1
D ← ∞ matrix of size |V| × |V|
for each node x ∈ V do
for each node y at distance ≤ cutoff from x do
D[x, y] ← dist(x, y)
# ── 3. Parallel edge curvature ───────────────────────────────────
parallel for each (u, v) ∈ E with n_jobs workers do
a ← μ_u (copy, prune by tol)
b ← μ_v (copy, prune by tol)
for r ← 0 to l do
Bucket[r] ← []
for x ∈ keys(a) do
for y ∈ keys(b) do
d ← D[x, y]
if d ≤ l and d < ∞ then
Bucket[int(d)].append((x, y))
for r ← 0 to l do
for (x, y) ∈ Bucket[r] do
δ ← min(a[x], b[y])
if δ > tol then
a[x] ← a[x] - δ
b[y] ← b[y] - δ
m_r ← m_r + δ
R_l ← Σ_x a[x]
if R_l > tol then
RU ← {x | a[x] > tol}
RV ← {y | b[y] > tol}
r̄ ← max{ D[x, y] | x ∈ RU, y ∈ RV }
else
r̄ ← 0
W̄ ← Σ_{r=0}^{l} r · m_r + r̄ · R_l
C[u, v] ← 1 - W̄ / D[u, v]
return C
Complexity Analysis¶
Step |
Time |
Space |
|---|---|---|
Build \(P^k\) |
\(O(|V|^3 \log k)\) via matrix power |
\(O(|V|^2)\) |
Truncated APSP |
\(O(|V| \cdot (2k+1) \cdot \bar{d})\) |
\(O(|V|^2)\) |
Per-edge W1 bound |
\(O(|\mathrm{supp}(\mu)| \cdot |\mathrm{supp}(\nu)|)\) — parallelized over edges |
\(O(|V|^2)\) total |
Total (parallel) |
\(O(|V| \cdot k \cdot \bar{d} + |E| \cdot d^2 \cdot k_{\max}) / n_{\text{jobs}}\) |
\(O(|V|^2)\) |
where \(\bar{d}\) is the average node degree.
Parameter Guide¶
Parameter |
Default |
Effect |
|---|---|---|
|
1 |
Number of random walk steps. Larger k spreads mass further (more global measure). Affects both the measure support size and the distance cutoff. |
|
0.0 |
Lazy mixing parameter in [0, 1]. Higher values keep more mass at the source node. |
|
3 |
Maximum shell distance for bucket matching. |
|
|
How to estimate the residual distance. |
|
None |
Number of parallel workers. |