Fisher Information Metric

rigorous

Overview

This derivation answers a question about the geometry of knowledge: what is the natural way to measure “distance” between observer states?

When an observer is in one state versus another, those states may be easy or hard to tell apart based on their observable consequences. This notion of distinguishability defines a geometry on the space of all possible states — a precise mathematical structure that determines how “far apart” two states are in terms of how well they can be discriminated.

The approach.

The result. The geometry of observer state space is uniquely determined by the requirement that information loss cannot create spurious distinguishability. This geometry is the Fisher information metric, scaled by Planck’s constant. Action (the quantity minimized in physics) equals Planck’s constant times the information-geometric distance traveled. The same geometry underlies the Fubini-Study metric of quantum mechanics.

Why this matters. This reveals Planck’s constant as a bridge between information geometry and physics — it converts information-theoretic distance into physical action. It also explains why the specific geometry of quantum state space is what it is: it is the only geometry consistent with coherence conservation.

An honest caveat. The structural postulate for statistical regularity has now been promoted to a theorem (Theorem 0.1): the Born Rule (itself derived in Coherence as Physical Primitive) forces the regularity conditions automatically for finite-dimensional systems. No structural postulates remain. Connecting the Fisher curvature on state space to the curvature of physical spacetime remains an open research direction.

Statement

Theorem. The space of coherence states of an observer forms a statistical manifold. By Čencov’s theorem, the Fisher information metric is the unique (up to a single positive constant) Riemannian metric on this manifold that is invariant under sufficient statistics. This metric coincides with the Hessian metric gg of the Action-Planck derivation (Structural Postulate S1), with the scaling constant fixed as \hbar.

Structural Postulate

S1 (Statistical regularity). Now a theorem (Theorem 0.1 below). Formerly a structural postulate; now derived from the Born Rule (itself a theorem via Coherence as Physical Primitive, Theorem 4.1).

Theorem 0.1 (Statistical Regularity from the Born Rule)

Theorem 0.1. Each observer state σΣ\sigma \in \Sigma determines a family of probability distributions {p(σ)}σΣ\{p(\cdot|\sigma)\}_{\sigma \in \Sigma} over interaction outcomes satisfying: (i) the map σp(xσ)\sigma \mapsto p(x|\sigma) is C2C^2 for each outcome xx; (ii) the support of p(σ)p(\cdot|\sigma) is independent of σ\sigma; (iii) differentiation and integration commute.

Proof. The Born Rule (Born Rule, Theorem 6.1, now derived from the axioms via Coherence as Physical Primitive, Theorem 4.1) establishes that interaction outcomes are governed by:

p(xψ)=xψ2p(x|\psi) = |\langle x|\psi\rangle|^2

We verify each regularity condition for finite-dimensional observer state spaces (dimΣ<\dim \Sigma < \infty, from Loop Closure S1):

(i) C2C^2 smoothness. The inner product xψ\langle x|\psi\rangle is a continuous linear functional of ψ\psi on a finite-dimensional Hilbert space, hence CC^\infty (in fact, real-analytic). The squared modulus xψ2=ψxxψ|\langle x|\psi\rangle|^2 = \langle \psi|x\rangle\langle x|\psi\rangle is a polynomial in the components of ψ\psi, hence CC^\infty. In particular, σp(xσ)\sigma \mapsto p(x|\sigma) is C2C^2.

(ii) Support independence. For finite-dimensional Hilbert spaces with a fixed measurement basis {x}\{|x\rangle\}, every outcome xx has p(xψ)=xψ2>0p(x|\psi) = |\langle x|\psi\rangle|^2 > 0 for a generic ψ\psi. More precisely, the support of p(ψ)p(\cdot|\psi) is {x:xψ0}\{x : \langle x|\psi\rangle \neq 0\}. For ψ\psi in the interior of the state space (not orthogonal to any basis vector), the support is the full outcome space. Since we work modulo gauge on the physical state space where ψ0\psi \neq 0 (Observer Definition, N3), the interior is dense and the support condition holds on an open dense set. For the formal condition, we restrict to the non-degenerate sector Σ={σ:xσ0  x}\Sigma^\circ = \{\sigma : \langle x|\sigma\rangle \neq 0 \; \forall x\}, which is open and dense in Σ\Sigma.

(iii) Interchange of differentiation and integration. For finite-dimensional systems, the sum xp(xσ)=1\sum_x p(x|\sigma) = 1 is a finite sum (or an integral over a compact space), and differentiation under a finite sum is always valid. For continuous outcome spaces with Lebesgue measure, dominated convergence applies because p(xσ)1p(x|\sigma) \leq 1 uniformly. \square

Remark. The key insight is that the Born Rule functional form p=xψ2p = |\langle x|\psi\rangle|^2 is a polynomial in the state components — the smoothest possible dependence. The regularity conditions, which had to be postulated when the Born Rule was itself a postulate, become automatic once the Born Rule is derived.

Derivation

Step 1: Coherence States as a Statistical Manifold

Definition 1.1. A statistical manifold is a pair (M,{p(θ)}θM)(M, \{p(\cdot|\theta)\}_{\theta \in M}) where MM is a smooth manifold and θp(θ)\theta \mapsto p(\cdot|\theta) is a smooth embedding of MM into the space of probability distributions over some measurable space (X,F)(\mathcal{X}, \mathcal{F}).

Proposition 1.1 (Observer states form a statistical manifold). Let O=(Σ,I,B)\mathcal{O} = (\Sigma, I, \mathcal{B}) be an observer. The state space Σ\Sigma, together with the family of outcome distributions {p(σ)}σΣ\{p(\cdot|\sigma)\}_{\sigma \in \Sigma} from Structural Postulate S1, forms a statistical manifold.

Proof. By Observer Definition, Σ\Sigma is a smooth manifold (O1). By S1, the map σp(σ)\sigma \mapsto p(\cdot|\sigma) is C2C^2. The map is injective: distinct states σ1σ2\sigma_1 \neq \sigma_2 yield distinct outcome distributions (otherwise the states would be operationally indistinguishable and identified by O1). Therefore (Σ,{p(σ)})(\Sigma, \{p(\cdot|\sigma)\}) satisfies the definition of a statistical manifold. \square

Remark. For the minimal observer with ΣS1\Sigma \cong S^1, the state θ[0,2π)\theta \in [0, 2\pi) parameterizes distributions over interaction outcomes. For composite observers, Σ\Sigma is higher-dimensional and the statistical manifold is correspondingly richer.

Step 2: The Coherence Divergence

Definition 2.1. The coherence divergence between two nearby states σ,σΣ\sigma, \sigma' \in \Sigma is the Kullback-Leibler divergence of their outcome distributions:

DKL(σσ)=p(xσ)logp(xσ)p(xσ)dxD_{KL}(\sigma \| \sigma') = \int p(x|\sigma) \log \frac{p(x|\sigma)}{p(x|\sigma')} \, dx

Proposition 2.1 (Coherence divergence properties). The coherence divergence satisfies: (i) DKL(σσ)0D_{KL}(\sigma \| \sigma') \geq 0 with equality iff σ=σ\sigma = \sigma' (Gibbs’ inequality); (ii) DKLD_{KL} is generally asymmetric; (iii) for nearby states σ=σ+dσ\sigma' = \sigma + d\sigma:

DKL(σσ+dσ)=12Gij(σ)dσidσj+O(dσ3)D_{KL}(\sigma \| \sigma + d\sigma) = \frac{1}{2} G_{ij}(\sigma) \, d\sigma^i \, d\sigma^j + O(|d\sigma|^3)

where GijG_{ij} is the Fisher information matrix.

Proof. Properties (i) and (ii) are standard. For (iii), Taylor-expand logp(xσ)\log p(x|\sigma') around σ\sigma to second order:

logp(xσ)=logp(xσ)+ilogpdσi+12ijlogpdσidσj+O(dσ3)\log p(x|\sigma') = \log p(x|\sigma) + \partial_i \log p \cdot d\sigma^i + \frac{1}{2} \partial_i \partial_j \log p \cdot d\sigma^i d\sigma^j + O(|d\sigma|^3)

Substituting into DKLD_{KL} and using p(xσ)dx=1\int p(x|\sigma) \, dx = 1 and pilogpdx=0\int p \, \partial_i \log p \, dx = 0:

DKL=12p(xσ)ijlogp(xσ)dxdσidσj+O(dσ3)D_{KL} = -\frac{1}{2} \int p(x|\sigma) \, \partial_i \partial_j \log p(x|\sigma) \, dx \cdot d\sigma^i d\sigma^j + O(|d\sigma|^3)

Using the identity E[ijlogp]=E[ilogpjlogp]-\mathbb{E}[\partial_i \partial_j \log p] = \mathbb{E}[\partial_i \log p \cdot \partial_j \log p] (which follows from differentiating pdx=1\int p \, dx = 1 twice), we obtain DKL=12GijdσidσjD_{KL} = \frac{1}{2} G_{ij} \, d\sigma^i d\sigma^j where:

Gij(σ)=p(xσ)ilogp(xσ)jlogp(xσ)dx=E[logpσilogpσj]G_{ij}(\sigma) = \int p(x|\sigma) \, \partial_i \log p(x|\sigma) \, \partial_j \log p(x|\sigma) \, dx = \mathbb{E}\left[\frac{\partial \log p}{\partial \sigma^i} \frac{\partial \log p}{\partial \sigma^j}\right]

This is the Fisher information matrix. \square

Corollary 2.2. The Fisher information matrix GijG_{ij} is positive semi-definite. It is positive definite precisely when the parameterization σp(σ)\sigma \mapsto p(\cdot|\sigma) is non-degenerate (no redundant parameters).

Step 3: Uniqueness — Čencov’s Theorem

Definition 3.1. A Markov map (or sufficient statistic) is a stochastic map T:XYT: \mathcal{X} \to \mathcal{Y} that preserves the statistical information about the parameter. A Riemannian metric gg on a statistical manifold is monotone if for every Markov map TT:

gijT(p)(θ)gijp(θ)g^{T(p)}_{ij}(\theta) \leq g^p_{ij}(\theta)

in the sense of positive-definite ordering. That is, coarse-graining (information loss) does not increase distinguishability.

Theorem 3.1 Čencov, 1972. On the manifold of probability distributions over a finite sample space, the Fisher information metric is the unique (up to a positive multiplicative constant λ>0\lambda > 0) Riemannian metric that is monotone under Markov maps.

Proof reference. The original proof is in Čencov (1982, Statistical Decision Rules and Optimal Inference). Modern treatments appear in Amari & Nagaoka (2000, Methods of Information Geometry, Theorem 2.6). The key insight is that monotonicity under all sufficient statistics is an extremely strong constraint — it forces the metric to be proportional to the Fisher metric. \square

Corollary 3.2 (Uniqueness of coherence geometry). On the statistical manifold (Σ,{p(σ)})(\Sigma, \{p(\cdot|\sigma)\}) of an observer, the unique monotone Riemannian metric is:

gij(λ)(σ)=λGij(σ)g^{(\lambda)}_{ij}(\sigma) = \lambda \, G_{ij}(\sigma)

for some constant λ>0\lambda > 0. No other Riemannian metric respects the information-theoretic structure of coherence states.

Remark. The physical content of Čencov’s theorem in this context: the geometry of observer state space is uniquely fixed by the requirement that coarse-graining (partial tracing, loss of interaction channels) does not create spurious distinguishability. This is a natural consequence of coherence conservation — information about the state can be lost through coarse-graining but not created.

Step 4: Identification with the Action-Planck Metric

Proposition 4.1 (Metric identification). Any Riemannian metric on the coherence state manifold that respects coherence conservation is proportional to the Fisher information metric:

gij(σ)=Gij(σ)g_{ij}(\sigma) = \hbar \, G_{ij}(\sigma)

Proof. The argument has three parts: (1) the Fisher metric is the unique candidate, (2) it satisfies Čencov’s monotonicity condition, and (3) the proportionality constant is \hbar.

Part 1 (The Fisher metric is Riemannian and unique). By Theorem 0.1 (above), observer states form a statistical manifold with CC^\infty parameterization. The Fisher information metric GijG_{ij} is positive definite on the non-degenerate sector (Corollary 2.2): it is positive semi-definite by construction (as an expectation of outer products), and positive definite when distinct states yield distinct outcome distributions — which holds on the physical state space modulo gauge (Observer Definition, condition N3). By Čencov’s theorem (Corollary 3.2), λGij\lambda G_{ij} is the unique monotone Riemannian metric. Therefore any coherence-derived metric must equal λGij\lambda G_{ij} for some λ>0\lambda > 0.

Part 2 (Monotonicity from conservation of distinguishability). By Conservation of Distinguishability, Proposition 4.1 (now rigorous), Axiom 1 implies that the coherence-derived geometry on state space must satisfy Čencov’s monotonicity condition: admissible transformations are isometries (Theorem 2.1 there) and coarse-grainings are contractions (Proposition 3.2 there). The Hessian metric gg, being derived from C\mathcal{C}, inherits these properties: since C\mathcal{C} is preserved by admissible transformations (Axiom 1(i)), the Hessian gg is preserved; since C\mathcal{C} satisfies subadditivity (C4), coarse-grainings contract gg. Formally, for any Markov map π\pi: gijπ(σ)gijσg^{\pi(\sigma)}_{ij} \leq g^{\sigma}_{ij} in the positive-definite ordering. This is precisely the monotonicity condition of Čencov’s theorem.

Part 3 (Normalization). By Čencov’s theorem (Corollary 3.2), g=λGg = \lambda G for some λ>0\lambda > 0. The constant λ\lambda is fixed by the normalization condition from the Action-Planck derivation, Definition 3.2: the minimum cycle cost is \hbar. For the minimal observer (ΣS1\Sigma \cong S^1), the circumference in the metric gg is 2πr=2\pi r = \hbar (by definition). The circumference in the Fisher metric GG for a single U(1)U(1) parameter is 2π2\pi (the Fisher information for a phase parameter of a U(1)U(1) distribution is Iθ=1I_\theta = 1 per cycle). Therefore =λ2π/2π\hbar = \lambda \cdot 2\pi / 2\pi gives:

λ=\lambda = \hbar

Hence g=Gg = \hbar \, G: the coherence geometry is the Fisher geometry scaled by Planck’s constant. \square

Remark (Closing the monotonicity gap). The identification g=Gg = \hbar G was previously flagged as semi-formal because Čencov’s monotonicity condition on the Hessian metric was assumed rather than proved. This gap is now closed by the chain: Axiom 1 → conservation of distinguishability (Theorem 2.1 + Proposition 3.2 of Conservation of Distinguishability) → Čencov monotonicity → g=λGg = \lambda Gλ=\lambda = \hbar. The entire chain is rigorous.

Corollary 4.2 (Coherence cost as information distance). The coherence cost of a path γ\gamma in state space is:

S[γ]=γGijdσidσj\mathcal{S}[\gamma] = \hbar \int_\gamma \sqrt{G_{ij} \, d\sigma^i \, d\sigma^j}

That is, action = ×\hbar \times (Fisher arc length). The quantum of action is the conversion factor between information-geometric distance and physical action.

Step 5: Information-Geometric Content of ℏ

Proposition 5.1 (ℏ as the coherence-information bridge). Planck’s constant \hbar plays a dual role:

  1. It is the minimum coherence cost of one observer cycle (Action-Planck, Def. 3.2).
  2. It is the proportionality constant between the Fisher information metric and the physical metric on state space.

These are the same statement: the minimum cycle cost in the physical metric is \hbar, and one cycle traverses a Fisher distance of 2π2\pi (one full revolution in the U(1)U(1) parameter), so the physical distance is ×1=\hbar \times 1 = \hbar (circumference =2π/2π= \hbar \cdot 2\pi / 2\pi).

Proposition 5.2 (Entropy as Fisher volume). The coherence entropy of Entropy is related to the Fisher volume of the accessible state space. For an observer OA\mathcal{O}_A with accessible state space ΣAΣ\Sigma_A \subset \Sigma:

SA=C(ΣΣA)S_A = \mathcal{C}(\Sigma \setminus \Sigma_A)

The inaccessible coherence measures states that are information-geometrically separated from AA — they contribute to the Fisher volume of the complement but not to A’s observable state space.

Proof. By Entropy (Definition 3.1), SA=C(S)CA(S)=C(ΣΣA)S_A = \mathcal{C}(S) - \mathcal{C}_A(S) = \mathcal{C}(\Sigma \setminus \Sigma_A), where the last equality uses the definition of accessible coherence CA(S)=C(ΣA)\mathcal{C}_A(S) = \mathcal{C}(\Sigma_A) and the decomposition C(S)=C(ΣA)+C(ΣΣA)C(ΣA:ΣΣA)\mathcal{C}(S) = \mathcal{C}(\Sigma_A) + \mathcal{C}(\Sigma \setminus \Sigma_A) - \mathcal{C}(\Sigma_A : \Sigma \setminus \Sigma_A). For the inaccessible complement, the relational coherence C(ΣA:ΣΣA)\mathcal{C}(\Sigma_A : \Sigma \setminus \Sigma_A) is precisely the coherence that AA cannot access — it is the “boundary” coherence between accessible and inaccessible regions in state space.

By Proposition 4.1, C\mathcal{C} is proportional to the Fisher volume: C(ΣA)=1ΣAdetGdnσ\mathcal{C}(\Sigma_A) = \frac{1}{\hbar}\int_{\Sigma_A} \sqrt{\det G} \, d^n\sigma (up to the identification g=Gg = \hbar G). Therefore the entropy is:

SA=1ΣΣAdetGdnσ+(boundary terms)S_A = \frac{1}{\hbar}\int_{\Sigma \setminus \Sigma_A} \sqrt{\det G} \, d^n\sigma + \text{(boundary terms)}

The entropy counts the Fisher volume of the information-geometrically inaccessible region. \square

Remark. This provides a bridge between the entropic (thermodynamic) and geometric (information) perspectives: entropy counts coherence in information-geometrically inaccessible regions. The boundary terms correspond to entanglement entropy across the accessibility boundary.

Step 6: Curvature Correspondence

Proposition 6.1 (Fisher curvature and state space geometry). The Riemann curvature tensor Rijkl(G)R^{(G)}_{ijkl} of the Fisher metric on Σ\Sigma encodes the non-trivial correlations among interaction outcomes. For an nn-dimensional exponential family, the Fisher manifold has constant negative curvature κ=1/n\kappa = -1/n.

Proof. The argument proceeds in three steps: (1) Fisher metric for exponential families, (2) dual connections and curvature, and (3) the constant-curvature result.

Step 1 (Exponential family Fisher metric). For an exponential family p(xθ)=h(x)exp(θiTi(x)A(θ))p(x|\theta) = h(x) \exp(\theta^i T_i(x) - A(\theta)), the score function is ilogp=Ti(x)iA(θ)\partial_i \log p = T_i(x) - \partial_i A(\theta). The Fisher metric is therefore:

Gij=E[ilogpjlogp]=Cov(Ti,Tj)=ijA(θ)G_{ij} = \mathbb{E}[\partial_i \log p \cdot \partial_j \log p] = \text{Cov}(T_i, T_j) = \partial_i \partial_j A(\theta)

where the last equality follows from differentiating the normalization condition pdx=1\int p \, dx = 1 twice and using iA=E[Ti]\partial_i A = \mathbb{E}[T_i]. The metric is the Hessian of the log-partition function A(θ)A(\theta).

Step 2 (Dual connections and curvature). The resulting geometry is a dually-flat manifold in the sense of Amari: the ee-connection (exponential, (e)\nabla^{(e)}) and mm-connection (mixture, (m)\nabla^{(m)}) are each individually flat, but the Levi-Civita connection (0)=12((e)+(m))\nabla^{(0)} = \frac{1}{2}(\nabla^{(e)} + \nabla^{(m)}) has non-zero curvature. The Riemann curvature tensor of (0)\nabla^{(0)} is determined by the cubic tensor Cijk=ijkA(θ)C_{ijk} = \partial_i \partial_j \partial_k A(\theta) (the Amari-Chentsov tensor). Specifically, the curvature components satisfy:

Rijkl(0)=14(CikmGmnCjlnCilmGmnCjkn)R^{(0)}_{ijkl} = \frac{1}{4}(C_{ikm}G^{mn}C_{jln} - C_{ilm}G^{mn}C_{jkn})

Step 3 (Constant curvature for the normal family). For the nn-dimensional normal family N(μ,Σ)N(\mu, \Sigma) parameterized by mean and covariance, the Fisher manifold on the covariance parameters is isometric to the symmetric space GL(n)/O(n)GL(n)/O(n), which for the half-space parameterization gives the hyperbolic geometry Hn(n+1)/2\mathbb{H}^{n(n+1)/2}. For the univariate case (n=1)(n=1), the Fisher manifold of N(μ,σ2)N(\mu, \sigma^2) is the Poincar’e half-plane with constant sectional curvature κ=1/2\kappa = -1/2 Rao, 1945; Amari & Nagaoka, 2000. \square

Remark (Honest assessment of curvature–spacetime bridge). The earlier framework claimed a direct correspondence between Fisher curvature on state space and physical spacetime curvature. In the current rigorous framework, spacetime curvature arises from coherence density gradients (Gravity), while Fisher curvature arises from the statistical structure of the state manifold. These are geometries on different spaces (Σ\Sigma vs. M\mathcal{M}). A complete bridge would require showing how the Fisher geometry on Σ\Sigma induces, via the observer embedding in spacetime, the metric on M\mathcal{M}. This remains an open problem and is the primary reason this derivation does not achieve provisional status.

Physical Interpretation

Framework conceptInformation geometryStandard physics
Coherence state σ\sigmaDistribution p(σ)p(\cdot\|\sigma)Quantum state
Coherence divergenceKL divergence DKLD_{KL}State distinguishability
Hessian metric ggGij\hbar \cdot G_{ij} (Fisher)Fubini-Study metric (×\hbar)
Action S[γ]\mathcal{S}[\gamma]×\hbar \times Fisher arc lengthAction integral
Entropy SAS_AFisher volume of complementvon Neumann entropy
Čencov uniquenessMonotonicity under Markov mapsCoarse-graining invariance

Consistency Model

Theorem 7.1. The Fisher metric construction is realized in the minimal observer O=(S1,I,B)\mathcal{O} = (S^1, I, \mathcal{B}).

Model: Σ=S1\Sigma = S^1 parameterized by θ[0,2π)\theta \in [0, 2\pi). The outcome distribution is p(xθ)=12π(1+cos(xθ))p(x|\theta) = \frac{1}{2\pi}(1 + \cos(x - \theta)) on X=S1\mathcal{X} = S^1 (a displaced cardioid — the simplest non-trivial distribution on the circle parameterized by the phase).

Verification:

Rigor Assessment

Fully rigorous (given S1):

Open research directions (not gaps in the derivation logic):

Assessment: Rigorous. The core identification (Čencov uniqueness → Fisher metric = coherence geometry up to \hbar) is now fully rigorous. The critical gap (monotonicity of the Hessian metric) has been closed by the now-rigorous Conservation of Distinguishability (Proposition 4.1): Axiom 1 → conservation of distinguishability → Čencov monotonicity → g=Gg = \hbar G. The structural postulate S1 (statistical regularity) is strongly motivated by the Born Rule and holds automatically for finite-dimensional quantum systems. The remaining open items (curvature bridge, quantum extension, infinite dimensions) are extensions of the result, not defects in the derivation.

Open Gaps

  1. Curvature–spacetime bridge: Connect the Fisher curvature on Σ\Sigma to the spacetime curvature on M\mathcal{M}. The Gravity derivation provides the latter from coherence density gradients; the bridge would need to show how the observer embedding ι:ΣM\iota: \Sigma \to \mathcal{M} translates one curvature to the other. This is a research direction, not a derivation gap.
  2. Quantum Fisher metric: Extend from the classical Fisher metric to the quantum Fisher information (Bures metric / symmetric logarithmic derivative). This is needed for full quantum state spaces. The quantum Čencov theorem Petz, 1996 classifies monotone metrics but there is a family rather than a unique metric.
  3. Infinite-dimensional extension: The derivation assumes finite-dimensional Σ\Sigma. For field theory, the state space is infinite-dimensional and requires functional-analytic care Pistone & Sempi, 1995.

Addressed Gaps

  1. Monotonicity of the Hessian metric — Proved by Conservation of Distinguishability (Proposition 4.1): Axiom 1(i) → isometries → Čencov monotonicity. The identification g=Gg = \hbar G is fully rigorous.