Background: The thermodynamic cost of local large language model (LLM) inference on consumer hardware is poorly characterised. Unlike data-centre deployments with hardware power monitors (NVML, RAPL), Apple Silicon unified-memory systems require alternative instrumentation strategies, and no Landauer-grounded framework for local inference energy has previously been validated.

Methods: We deploy seven open-source LLMs (2.0–20.2 GB; 3.2B–32.8B parameters, Q4_K_M quantisation) on a single Apple M5 MacBook Pro (32 GB unified memory, 25 GB Metal GPU VRAM) via Ollama v0.23.2, instrumenting the system with a custom telemetry daemon (1.5 s polling; top, vm_stat, ioreg, ps). We apply the Kerimov-Alekberli (KA) information-geometric framework, which monitors KL divergence between consecutive output distributions relative to a Fisher Information Metric (FIM)-derived threshold (τ = 0.065), and compare energy consumption against an unoptimised baseline using a unified Python code-generation benchmark. Energy estimates are grounded in Landauer's thermodynamic lower bound E_min = k_B·T·ln 2, scaled macroscopically by an empirical power-size model.

Results: KA achieves a consistent 38 % energy reduction across all seven models, saving between 59 mJ (llama3.2, 2.0 GB, 55.7 tok/s) and 32,841 mJ (qwen3:32b, 20.2 GB, 2.6 tok/s) per run. Measured power draw follows a linear model P = 5.0 + 0.75·S_GB W (R² = 0.97), where S_GB is model size in GB. Token efficiency under KA ranges from 1,321 tok/J (qwen3:32b) to 8,287 tok/J (llama3.2). The First-Passage-Time (FPT) anomaly detector recorded 602 KL-divergence threshold exceedances across 9,501 total inference tokens; the highest-energy model (qwen3:32b) registered 562 anomalies and the greatest absolute saving.

Conclusions: These results constitute the first empirical validation of a Landauer-grounded energy-reduction mechanism in local LLM inference via an information-geometric output-distribution stabilisation framework, with extrapolated annual savings of 105.4 kJ and 11.7 mg CO2 per workstation.
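The quantities the abstract reports can be checked with a few lines of arithmetic. The sketch below is illustrative only — the function names are ours, not the paper's implementation — and evaluates the reported power-size model P = 5.0 + 0.75·S_GB, the Landauer bound E_min = k_B·T·ln 2, and the kind of KL-divergence-versus-τ comparison that the KA framework applies to consecutive output distributions (the example distributions are hypothetical).

```python
import math

# Empirical power-size model reported in the abstract:
# P = 5.0 + 0.75 * S_GB watts (R^2 = 0.97), S_GB = model size in GB.
def power_watts(model_size_gb: float) -> float:
    return 5.0 + 0.75 * model_size_gb

# Landauer's thermodynamic lower bound per bit erasure: E_min = k_B * T * ln 2.
K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_bound_joules(temperature_k: float = 300.0) -> float:
    return K_B * temperature_k * math.log(2)

# KL divergence D(p || q) between two discrete distributions, as used by KA
# to compare consecutive output distributions against the threshold tau.
def kl_divergence(p: list[float], q: list[float]) -> float:
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

TAU = 0.065  # FIM-derived threshold from the abstract

# Endpoints of the reported model range:
print(power_watts(2.0))    # llama3.2 (2.0 GB)   -> 6.5 W
print(power_watts(20.2))   # qwen3:32b (20.2 GB) -> 20.15 W
print(landauer_bound_joules())  # ~2.87e-21 J per bit at 300 K

# Hypothetical consecutive output distributions: flag an anomaly when
# their KL divergence exceeds tau.
p, q = [0.7, 0.2, 0.1], [0.6, 0.25, 0.15]
print(kl_divergence(p, q) > TAU)  # small shift: below threshold -> False
```

Note the ~9 order-of-magnitude gap between the Landauer bound per bit and the millijoule-scale savings the abstract reports, which is why the bound is "scaled macroscopically by an empirical power-size model" rather than applied directly.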