Sphere visualization
The 3D sphere displays 31,102 vertices (one per verse), connected by
cross-reference edges.
Vertices: ◆ Old Testament (39 books) · ✚ New Testament (27 books). Color encodes canonical book order
along a gradient from blue (Genesis) through red to gold (Revelation).
Edges: Intra-testament (OT↔OT, NT↔NT) ·
Inter-testament (OT↔NT).
Cross-references sourced from openbible.info.
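As a rough illustration of the vertex coloring described above, the sketch below interpolates linearly from blue through red to gold over the 66 canonical books. The RGB stops, the placement of red at the midpoint, and the names `book_color` and `lerp` are assumptions made for this example; the page does not state the actual palette or interpolation.

```python
# Hypothetical RGB stops; the real palette is not given in the text.
BLUE = (40, 90, 220)
RED = (220, 50, 50)
GOLD = (255, 200, 60)


def lerp(a, b, t):
    """Linearly interpolate between two RGB triples, rounding to integers."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def book_color(book_index, n_books=66):
    """Map a 0-based canonical book index to an RGB color.

    Index 0 (Genesis) maps to blue and index n_books - 1 (Revelation)
    maps to gold, passing through red at the midpoint (an assumption).
    """
    t = book_index / (n_books - 1)
    if t <= 0.5:
        return lerp(BLUE, RED, t / 0.5)
    return lerp(RED, GOLD, (t - 0.5) / 0.5)
```

For example, `book_color(0)` returns the blue stop and `book_color(65)` returns the gold stop.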
Sentence embedding
Let \(\mathcal{V} = \{v_1, \ldots, v_{31102}\}\) denote the KJV verse corpus.
Each verse is encoded via a sentence transformer fine-tuned specifically on biblical text
(odunola/sentence-transformers-bible-reference-final,
109M parameters, 12 transformer layers, MPNet architecture).
The model was contrastively trained on 929K biblical sentence pairs using cosine similarity loss,
learning to place theologically related passages close together in the embedding space:
$$\phi : \mathcal{V} \to \mathbb{R}^{768}, \qquad \phi(v_i) = \mathrm{MeanPool}\!\bigl(\mathrm{MPNet}(\mathrm{tokenize}(v_i))\bigr)$$
Unlike general-purpose encoders trained on web text, this model's latent space
is intended to separate verses by biblical meaning (prophecy, typology, thematic
parallel, and doctrinal correspondence) rather than by surface vocabulary overlap.
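The \(\mathrm{MeanPool}\) step in \(\phi\) can be sketched in a few lines of plain Python. This version averages only the token vectors whose attention-mask entry is 1, which is the usual convention for sentence transformers; whether this particular model masks padding in exactly this way is an assumption, and the function name `mean_pool` is illustrative.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped).

    token_vectors:  list of equal-length lists, one per token
                    (768 values per token for the model described above).
    attention_mask: list of 0/1 flags, one per token.
    """
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vec[k]
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [s / count for s in total]
```

With two real tokens and one padding token, `mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])` averages only the first two vectors.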
Dimensionality reduction
The embedding matrix \(\Phi \in \mathbb{R}^{31102 \times 768}\) is reduced to
3 dimensions via UMAP, which minimizes the fuzzy set cross-entropy between
high-dimensional affinities \(w_{ij}\) and low-dimensional affinities \(q_{ij}\):
$$\mathcal{L}_{\text{UMAP}} = \sum_{i \neq j} \left[ w_{ij} \ln \frac{w_{ij}}{q_{ij}} + (1 - w_{ij}) \ln \frac{1 - w_{ij}}{1 - q_{ij}} \right]$$
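where \(w_{ij}\) is the symmetrized form of the directed affinity
\(\exp\!\bigl(-\max(0,\, d(\phi_i, \phi_j) - \rho_i)/\sigma_i\bigr)\), which
captures local metric structure in the ambient space. Here \(\rho_i\) is the
distance from \(\phi_i\) to its nearest neighbor and \(\sigma_i\) is a per-point
scale factor. The hyperparameters are
\(\texttt{n\_neighbors}=15\), \(\texttt{min\_dist}=0.1\), \(\texttt{metric}=\text{cosine}\).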
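To make the objective concrete, the sketch below evaluates \(\mathcal{L}_{\text{UMAP}}\) directly for small affinity matrices. It is a didactic evaluation of the formula, not how a UMAP library optimizes it (real implementations use sampling and stochastic gradient descent). The `directed_affinity` helper clamps the exponent at zero, as the standard UMAP definition does; the function names and the `eps` clamp that keeps the logarithms finite are assumptions of this example.

```python
import math


def directed_affinity(d, rho, sigma):
    """Affinity of a neighbor at distance d, given local offset rho and scale sigma."""
    return math.exp(-max(0.0, d - rho) / sigma)


def fuzzy_cross_entropy(w, q, eps=1e-12):
    """Sum over i != j of w ln(w/q) + (1 - w) ln((1 - w)/(1 - q)).

    w, q: square lists of lists with entries in [0, 1].
    eps:  clamp that keeps every logarithm finite (an implementation choice).
    """
    n = len(w)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            wij = min(max(w[i][j], eps), 1.0 - eps)
            qij = min(max(q[i][j], eps), 1.0 - eps)
            total += wij * math.log(wij / qij)
            total += (1.0 - wij) * math.log((1.0 - wij) / (1.0 - qij))
    return total
```

The loss is zero when the low-dimensional affinities reproduce the high-dimensional ones exactly and positive otherwise, which is the sense in which the layout is fitted to the embedding's neighborhood structure.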
Spherical projection
The 3D UMAP coordinates \(\mathbf{x}_i \in \mathbb{R}^3\) are centered and
normalized onto the unit 2-sphere:
$$\hat{\mathbf{x}}_i = \frac{\mathbf{x}_i - \bar{\mathbf{x}}}{\|\mathbf{x}_i - \bar{\mathbf{x}}\|_2} \;\in\; S^2$$
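The projection keeps each point's direction from the centroid and discards its
radial distance, so verses that are close in the UMAP layout tend to cluster
into regions on the sphere, while passages on opposite sides of the centroid
land in opposite hemispheres.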
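The centering and normalization step can be sketched as follows. The function name `project_to_sphere` is illustrative, and raising an error for a point that coincides exactly with the centroid (where the direction is undefined) is an assumption about how that edge case might be handled.

```python
import math


def project_to_sphere(points):
    """Center 3D points on their centroid and scale each to unit length."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    projected = []
    for p in points:
        centered = [p[k] - centroid[k] for k in range(3)]
        norm = math.sqrt(sum(c * c for c in centered))
        if norm == 0.0:
            # Direction from the centroid is undefined for this point.
            raise ValueError("point coincides with the centroid")
        projected.append([c / norm for c in centered])
    return projected
```

Every returned vector has Euclidean norm 1, so only the direction of each verse relative to the centroid survives the projection.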
Cross-references
Arcs represent scholarly cross-references from the
openbible.info
dataset (Christoph Römhild, public domain). Each entry connects two
verses identified by biblical scholars as linked via quotation, allusion,
typological parallel, or prophecy–fulfillment, weighted by community
vote count \(n_{\text{votes}}\). The homepage slider filters arcs by
\(n_{\text{votes}} \geq \tau\), where \(\tau\) is user-selected.
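The slider's threshold rule amounts to a simple filter. In the sketch below, each cross-reference is a hypothetical `(source, target, votes)` tuple; the tuple layout, the placeholder verse identifiers, and the vote counts are invented for illustration and are not taken from the dataset.

```python
def filter_by_votes(cross_refs, tau):
    """Keep only the cross-references whose vote count is at least tau."""
    return [ref for ref in cross_refs if ref[2] >= tau]


# Placeholder records: (source verse, target verse, vote count).
sample_refs = [
    ("verse_a", "verse_b", 12),
    ("verse_a", "verse_c", 3),
    ("verse_d", "verse_e", 7),
]

# With tau = 7, the reference with 3 votes is dropped.
visible = filter_by_votes(sample_refs, tau=7)
```

Raising \(\tau\) can only shrink the set of visible arcs, since every reference that passes a higher threshold also passes a lower one.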
Arc geometry
For two sphere points \(\hat{\mathbf{x}}_i, \hat{\mathbf{x}}_j \in S^2\),
the arc midpoint is computed via normalized averaging and bowed outward
by a height proportional to angular separation:
$$\mathbf{m}_{ij} = \frac{\hat{\mathbf{x}}_i + \hat{\mathbf{x}}_j}{\|\hat{\mathbf{x}}_i + \hat{\mathbf{x}}_j\|} \cdot (1 + h), \qquad h = h_{\min} + (h_{\max} - h_{\min}) \cdot \frac{\theta_{ij}}{\pi}$$
where \(\theta_{ij} = \arccos(\hat{\mathbf{x}}_i \cdot \hat{\mathbf{x}}_j)\) is the
geodesic angular distance. This ensures short-range references hug the surface
while long-range references arc prominently outward.
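A minimal sketch of the midpoint computation follows. The default values for `h_min` and `h_max` are placeholders, since the text does not give them, and the guard for antipodal endpoints (where the normalized average is undefined) is an added assumption.

```python
import math


def arc_midpoint(a, b, h_min=0.02, h_max=0.5):
    """Raised midpoint of the arc between two unit vectors a and b.

    h_min and h_max are illustrative defaults, not values from the text.
    """
    # Clamp the dot product so rounding error cannot push acos out of range.
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)  # geodesic angular distance in [0, pi]
    s = [x + y for x, y in zip(a, b)]
    norm = math.sqrt(sum(c * c for c in s))
    if norm < 1e-12:
        # Antipodal endpoints: the normalized average has no defined direction.
        raise ValueError("endpoints are antipodal")
    h = h_min + (h_max - h_min) * theta / math.pi
    return [c / norm * (1.0 + h) for c in s]
```

The returned point lies at radius \(1 + h\), so it can serve, for instance, as the control point of a curve drawn between the two verses; how the viewer actually interpolates the arc through this point is not specified in the text.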