Background: The thermodynamic cost of local large language model (LLM) inference on consumer hardware is poorly characterised. Unlike data-centre deployments with hardware power monitors (NVML, RAPL), Apple Silicon unified-memory systems require alternative instrumentation strategies, and no Landauer-grounded framework for local inference energy has previously been validated.

Methods: We deploy seven open-source LLMs (2.0–20.2 GB; 3.2B–32.8B parameters, Q4_K_M quantisation) on a single Apple M5 MacBook Pro (32 GB unified memory, 25 GB Metal GPU VRAM) via Ollama v0.23.2, instrumenting the system with a custom telemetry daemon (1.5 s polling; top, vm_stat, ioreg, ps). We apply the Kerimov-Alekberli (KA) information-geometric framework, which monitors KL divergence between consecutive output distributions relative to a Fisher Information Metric (FIM)-derived threshold (τ = 0.065), and compare energy consumption against an unoptimised baseline using a unified Python code-generation benchmark. Energy estimates are grounded in Landauer's thermodynamic lower bound E_min = k_B·T·ln 2, scaled macroscopically by an empirical power-size model.

Results: KA achieves a consistent 38 % energy reduction across all seven models, saving between 59 mJ (llama3.2, 2.0 GB, 55.7 tok/s) and 32,841 mJ (qwen3:32b, 20.2 GB, 2.6 tok/s) per run. Measured power draw follows a linear model P = 5.0 + 0.75·S_GB W (R² = 0.97), where S_GB is model size in GB. Token efficiency under KA ranges from 1,321 tok/J (qwen3:32b) to 8,287 tok/J (llama3.2). The First-Passage-Time (FPT) anomaly detector recorded 602 KL-divergence threshold exceedances across 9,501 total inference tokens; the highest-energy model (qwen3:32b) registered 562 anomalies and the greatest absolute saving.

Conclusions: These results constitute the first empirical validation of a Landauer-grounded energy-reduction mechanism in local LLM inference via an information-geometric output-distribution stabilisation framework, with extrapolated annual savings of 105.4 kJ and 11.7 mg CO2 per workstation.
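The quantities the abstract reports can be checked with a few lines of arithmetic. The sketch below is illustrative only — the function names are ours, not the paper's implementation — and evaluates the reported power-size model P = 5.0 + 0.75·S_GB, the Landauer bound E_min = k_B·T·ln 2, and the kind of KL-divergence-versus-τ comparison that the KA framework applies to consecutive output distributions (the example distributions are hypothetical).

```python
import math

# Empirical power-size model reported in the abstract:
# P = 5.0 + 0.75 * S_GB watts (R^2 = 0.97), S_GB = model size in GB.
def power_watts(model_size_gb: float) -> float:
    return 5.0 + 0.75 * model_size_gb

# Landauer's thermodynamic lower bound per bit erasure: E_min = k_B * T * ln 2.
K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_bound_joules(temperature_k: float = 300.0) -> float:
    return K_B * temperature_k * math.log(2)

# KL divergence D(p || q) between two discrete distributions, as used by KA
# to compare consecutive output distributions against the threshold tau.
def kl_divergence(p: list[float], q: list[float]) -> float:
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

TAU = 0.065  # FIM-derived threshold from the abstract

# Endpoints of the reported model range:
print(power_watts(2.0))    # llama3.2 (2.0 GB)   -> 6.5 W
print(power_watts(20.2))   # qwen3:32b (20.2 GB) -> 20.15 W
print(landauer_bound_joules())  # ~2.87e-21 J per bit at 300 K

# Hypothetical consecutive output distributions: flag an anomaly when
# their KL divergence exceeds tau.
p, q = [0.7, 0.2, 0.1], [0.6, 0.25, 0.15]
print(kl_divergence(p, q) > TAU)  # small shift: below threshold -> False
```

Note the ~9 order-of-magnitude gap between the Landauer bound per bit and the millijoule-scale savings the abstract reports, which is why the bound is "scaled macroscopically by an empirical power-size model" rather than applied directly.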