Vol. I · No. 52 · WED, JUN 10, 2026

The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Stop wasting electricity

RTX 4090 power optimization for llama.cpp: cut power consumption by 40% with power limits, without performance loss.

··

ExLlamaV3 Major Updates!

ExLlamaV3 adds Gemma 4 support, improved caching, and DFlash optimization for faster LLM inference on consumer hardware.

··

I have DeepSeek V4 Pro at home

A user quantized DeepSeek V4 Pro to Q4_K_M and ran it locally on AMD EPYC + RTX PRO hardware using a modified llama.cpp.

··

I am overwhelmed by Harnesses

Reddit user seeks advice on LLaMA inference harnesses; discusses fragmentation and compatibility issues with local LLM tooling.

··

Meta sued by major book publishers over copyright infringement

Meta is facing a class action lawsuit filed by five major book publishers and one author over claims the company "engaged in one of the most massive infringements of copyrighted materials in history" when training its Llama AI models, as reported earlier by The New York Times. In their suit, Macmillan, McGraw-Hill, Elsevier, Hachette, Cengage, and author Scott Turow allege that Meta "repeatedly copied" their books and journal articles without permission. The lawsuit accuses Meta of knowingly ripping copyrighted work from "notorious pirate sites," such as LibGen, Anna's Archive, Sci-Hub, Sci-M...

·

OllamaXClaude

Unexpected email to wake up to, but I am here for it! Model-agnostic tools are the way! This is huge!

··

Llama.cpp MTP support now in beta!

llama.cpp adds beta MTP (Multi-Token Prediction) support, starting with Qwen3.5, closing the performance gap with vLLM on token generation.

··