The Archive
Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.
Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts
Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.
Serious concerns about latest version of Claude: it no longer obeys or respects CLAUDE.md, hooks/rules, etc.
Reddit user reports that a recent Claude version ignores custom CLAUDE.md configuration files and architectural guidelines, and asks for a way to enforce TDD.
OpenAI CEO Sam Altman says Gen Z and millennials are using ChatGPT like a 'life advisor'—but college students might be one step ahead
Sam Altman observes Gen Z/millennials using ChatGPT as life advisor; college students adopt it more extensively.
Big data centers in Florida must pay full power and infrastructure costs under new law
Florida law requires large data centers to pay full power and infrastructure costs, raising operational expense for AI compute.
Claude just hallucinated again and changed the whole workflow of my app. Do not run them autonomously 24/7.
Claude Max still produces hallucinations causing production failures; autonomous agents unsuitable for unsupervised deployment without guardrails.
Hermes Agent is now #1 most used globally in past 24 hours in Openrouter global token metrics, above Claude Code and OpenClaw.
Hermes Agent surpassed Claude Code and OpenClaw as top token consumer on OpenRouter in past 24 hours.
MTP benchmark results: the nature of the generative task dictates whether you will benefit (coding) or get slower inference (creative) from speculative inference. No other factor comes close.
Benchmark study of 300+ tests shows speculative inference speeds up coding tasks but slows creative text generation; task type dominates performance.
I read threads complaining about claude every week... tf are y'alls workflows?
Software engineer argues Claude 4.7 reasoning is reliable when code ownership and human review are enforced; questions why complainants use AI for deterministic workflows.
Switched from OpenCode to Pi - What Settings/Plugins would you recommend?
Reddit user discusses migration from OpenCode to Pi IDE, seeking plugin recommendations for local LLM development.
After Shopify and Google said that 50% and 75% of their code is AI-generated, it’s now Airbnb’s turn to say that 60% of its codebase is also AI-generated. Moreover, Airbnb's CEO says that even managers are programming with Claude Code.
Airbnb reports 60% of new code AI-generated using Claude; CEO notes managers now code, extending trend after Shopify (50%) and Google (75%).
Running Qwen3.6 35b a3b on 8gb vram and 32gb ram ~190k context
User documents running Qwen3.6 35B A3B quantized model on RTX 4060 (8GB VRAM) with 190k context window at 37-51 tok/sec throughput.
DeepSeek-V4-Flash W4A16+FP8 with MTP self-speculation: 85 tok/s @ 524k on 2× RTX PRO 6000 Max-Q
DeepSeek-V4-Flash W4A16+FP8 with retrofitted MTP self-speculation achieves 85.5 tok/s @ 524k context on dual RTX PRO 6000 Max-Q via vLLM patching.
I Gave an AI Its Own Radio Station — It Won't Stop Broadcasting (It's Fine)
User deployed WRIT-FM, a 24/7 AI radio station using Claude/ChatGPT for real-time content generation across five distinct host personas.
Signals: finding the most informative agent traces without LLM judges [R]
Katanemo Labs proposes Signals, a lightweight method to identify high-value agent traces without LLM judge overhead, enabling cost-effective agent trajectory filtering.
Anybody else noticing how good gemma-4-26b-a4b is with one-shotting three.js?
User reports, based on an informal benchmark, that Gemma-4-26B excels at single-shot code generation for three.js visualizations.
Claude Mythos literally broke the METR graph ("The most important chart in AI")
Reddit post claims Claude Mythos exceeded METR's time-horizon benchmark; lacks verification and uses hyperbolic framing.
Animation is solved. This is like Pixar level quality.
Reddit post claims AI-generated animation has reached Pixar-level quality; lacks technical details or evidence.
Opus 4.7 truly reminds me of my juniors and interns
I use a bunch of LLMs, I hadn't used Opus 4.7 yet, decided to try it for a project this weekend. Dear lord, it's both great and so frustrating. I am working on a discography tracking project. I have the metadata providers wired in. I made a short plan with 4.7 Opus, very straightforward: 1) When an artist is added -> Call API endpoint for artist (contains artist info and discography) -> Add to DB each album and artist info from this payload 2) A recurring process that fetches up-to-date information based on the album ID contained in the previous payload, to get the track list, track ...
I made Claude Code aware of its own usage limits
Developer exposes Anthropic's rate-limit headers to Claude Code, enabling the model to be aware of its own quota consumption during sessions.
Opus said something today that completely reframed AI agent failures for me.
Claude Opus user observes that when AI models apologize for constraint violations without changing their underlying behavior, agentic workflows fail repeatedly.
We’re feeling cynical about xAI’s big deal with Anthropic
On the latest episode of the Equity podcast, we discussed what xAI's deal with Anthropic might mean for parent company SpaceX.
Getting a feel for how fast X tokens/second really is.
Interactive tool to benchmark subjective inference speed of local LLMs across text, code, and reasoning tasks.
Quoting Andrew Quinn
Simon Willison quotes Andrew Quinn on overcoming knowledge paralysis in programming by embracing reinvention over exhaustive tool research.
Sonnet 4.5 finally going away :(
Reddit user reports perceiving personality/tone differences between Claude Sonnet 3.5 and 4.6, expressing preference for older version.
Claude Mythos vs GPT-5.5 Cyber
Source: https://x.com/pankajkumar_dev/status/2053470332313301244?s=20
Got parented by Claude
Reddit anecdote about Claude's conversational tone in an exchange; no technical or product substance.