Preprint
Article

This version is not peer-reviewed.

Operationalising the Kerimov–Alekberli Framework for Edge LLM Monitoring: A Phase 1 Surface-Proxy Token-Budget Gating Study on Apple Silicon

Submitted: 28 May 2026

Posted: 29 May 2026

Abstract
Background. Deploying large language models (LLMs) at the edge introduces distributional output drift that existing monitoring approaches cannot detect within the latency and resource constraints of safety-critical autonomous systems [3,4]. The Kerimov–Alekberli (K–A) information-geometric framework proposes a First-Passage Time (FPT) criterion grounded in the Fisher Information Metric (FIM) to detect such drift [5,6]. No multi-run, statistically characterised empirical validation of K–A on edge hardware has previously been reported. Methods. We present a Phase 1 proxy-KL validation of the K–A proxy-gated token-budget criterion across five open-source LLMs (2.0–17.4 GB, Q4_K_M quantisation) deployed via Ollama v0.23.2 on an Apple M5 unified-memory workstation (32 GB, macOS 26.0). A response-level proxy instability score $\hat{D}_{\mathrm{KL}}(r) = \max\bigl(0.004,\ 0.016 + 0.015\,h(r) + 0.10/(w(r)+1)\bigr)$ is computed on a completed baseline response; if it exceeds $\tau_{\mathrm{FIM}} = 0.065$ (above-FIM), a separate capped-regeneration call with $N_{\mathrm{ka}} = \lfloor N_{\mathrm{base}}/2 \rfloor$ provides a counterfactual token-budget estimate. Energy is proxy-estimated via $\hat{P}_m = P_{\mathrm{base}} + \beta S_{\mathrm{GB}}$ ($R^2 = 0.97$). Results. After exclusion of 14 degenerate evaluations (6.4% of 220 above-FIM cases), the FPT trigger rate and token saving are strongly correlated (Pearson r = 0.806, Spearman ρ = 0.728; n = 28, p < 0.001), which confirms implementation consistency. Bootstrap 95% CIs: llama3.2 34.0 ± 4.0% [31.9, 36.3] (n = 12); gemma3:latest 34.6 ± 2.9% [32.5, 36.6] (n = 6); gemma3:27b 30.8 ± 5.7% [27.4, 34.8] (n = 8). Supplementary controlled validation (370 stored-response evaluations) confirms 100% exact-match quality for factual prompts and reveals zero proxy-FPT triggers under deterministic and fixed-seed decoding. Conclusions. The K–A surface-proxy gated criterion produces statistically characterised token reductions across three model families under stochastic decoding.
A central limitation is that the surface proxy requires stochastic response-length variation to trigger; it does not detect geometric distributional instability. Phase 2 must replace the surface proxy with direct logit-level DKL computation.
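The gating rule stated in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define h(r) or w(r), so they are passed in as plain numbers, and the function names `proxy_kl`, `capped_budget`, and `gate` are hypothetical. Only the constants (0.004, 0.016, 0.015, 0.10, the 0.065 threshold) and the halved token budget come from the text.

```python
# Threshold tau_FIM from the abstract.
TAU_FIM = 0.065


def proxy_kl(h: float, w: float) -> float:
    """Response-level proxy score from the abstract.

    h and w stand for h(r) and w(r); their definitions are not given
    in the abstract, so they are treated here as precomputed inputs.
    """
    return max(0.004, 0.016 + 0.015 * h + 0.10 / (w + 1))


def capped_budget(n_base: int) -> int:
    """Capped token budget: floor(N_base / 2)."""
    return n_base // 2


def gate(h: float, w: float, n_base: int) -> tuple[float, bool, int]:
    """Return (score, above_fim, token_budget) for one baseline response.

    When the score exceeds TAU_FIM, the budget for the separate
    capped-regeneration call is halved; otherwise the baseline
    budget is kept unchanged.
    """
    score = proxy_kl(h, w)
    above_fim = score > TAU_FIM
    budget = capped_budget(n_base) if above_fim else n_base
    return score, above_fim, budget
```

In this sketch a small w drives the 0.10/(w + 1) term up and pushes the score over the threshold, while a large w leaves it well below, which is consistent with the stated limitation that the proxy responds to surface features of the response rather than to logit-level divergence.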
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated