The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Opus is ridiculous for frontend cleanup

I love Opus. First I tuned one page, got the PageSpeed result where I wanted it, and wrote the whole thing down in `ADR_pagespeed-l0-fixes-playbook.md`. Then I opened a fresh session, gave it the remaining 9 pages, and pointed it at the playbook. Opus created three subagents by itself, split the work between them, and about 15 minutes later they had touched 41 frontend files that powered those pages. Same result across the set. Basically perfect Lighthouse numbers again. Not gonna lie, this is the kind of workflow where I stop thinking “chatbot” and start thinking “tiny frontend team tha...

u/Alex-S-Hamilton · 25 days ago · 34 pts / 7 comm

r/ClaudeAI · COMMUNITY

I replicated Anthropic's Generator-Evaluator harness to build a website through 12 adversarial AI iterations - here's the result and what I learned

Anthropic recently published their [harness design for long-running apps](https://www.anthropic.com/engineering/harness-design-long-running-apps) — a multi-agent architecture inspired by GANs where a Generator builds code and an Evaluator critiques it in a loop. I built my own version using Kiro CLI and used it to generate a marketing website for my project [Mnemo](https://github.com/Mnemo-mcp/Mnemo) (persistent memory for AI coding agents). **The architecture:** Planner (runs once) → Generator ↔ Evaluator (12 iterations) Each agent is a separate CLI process with zero shared context. Th...

u/killerexelon · 25 days ago · 20 pts / 6 comm

r/singularity · COMMUNITY

Mistral AI founder to French Parliament: "Engineers at Mistral no longer write a single line of code"

Mistral AI founder tells French Parliament that engineers now manage AI agents writing code instead of writing it themselves, marking a shift in developer workflows.

u/Many_Consequence_337 · 26 days ago · 106 pts / 60 comm

The Verge AI · PRESS

OpenAI keeps shuffling its executives in bid to win AI agent battle

OpenAI announced yet another reorganization Friday, consolidating certain areas and making company president Greg Brockman the official lead of all things product. In a memo viewed by The Verge, Brockman wrote that since OpenAI's product strategy for this year is to go all-in on AI agents, the company is combining its products to "invest in a single agentic platform and to merge ChatGPT and Codex into one unified agentic experience for all." To do this, the company is making a suite of org chart changes, although it's still operating under some of the same ones from last month. That's when AG...

Hayden Field · 26 days ago

A Generative AI Framework for Intelligent Utility Billing CO₂ Analytics and Sustainable Resource Optimisation

FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast

Argus: Evidence Assembly for Scalable Deep Research Agents

Confirming Correct, Missing the Rest: LLM Tutoring Agents Struggle Where Feedback Matters Most

paper.json: A Coordination Convention for LLM-Agent-Actionable Papers

AI radio hosts demonstrate why AI can’t be trusted alone

Look Before You Leap: Autonomous Exploration for LLM Agents

ShopGym: An Integrated Framework for Realistic Simulation and Scalable Benchmarking of E-Commerce Web Agents

Came home to find Pi with Qwen3.6 27B had run rm -rf .....

Claude Code CLI for normal users will work. I don't get agentic SDK drama of some people

Building a safe, effective sandbox to enable Codex on Windows

Not so locked in any more

FutureSim: Replaying World Events to Evaluate Adaptive Agents

Self-Distilled Agentic Reinforcement Learning

From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents

VS Code's new "Agents window" lets you use local AI models. Still requires an Internet connection and a GitHub Copilot plan (because we can't have nice things)

SpeakerLLM: A Speaker-Specialized Audio-LLM for Speaker Understanding and Verification Reasoning

Orchard: An Open-Source Agentic Modeling Framework

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections

Do Coding Agents Understand Least-Privilege Authorization?

Beyond AI as Assistants: Toward Autonomous Discovery in Cosmology

Known By Their Actions: Fingerprinting LLM Browser Agents via UI Traces

$π$-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows

Agentic Design of Compositional Descriptors via Autoresearch for Materials Science Applications

[AINews] Codex Rises, Claude Meters Programmatic Usage

Open-source, self-updating wiki for your codebase