Vol. I · No. 52 · WED, JUN 10, 2026
The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Gemini 3.5 Flash Agents built a real Complete OS from scratch!

https://x.com/Google/status/2056789235500466273?s=20 Google asked its agents to build a working operating system from scratch using Antigravity 2.0 and Gemini 3.5 Flash. The run took: ⏱️ 12 hours 🤖 93 parallel sub-agents 🔄 15k+ model requests 🧠 2.6B tokens processed 💸 Less than $1K in API credits

··

Gemini Omni model is out!

User reports that Gemini Omni underperforms Veo 3.1 and that the Pro plan is aggressively rate-limited, raising product-experience concerns.

··

Gemini is in danger of going full Copilot

I actually use the Gemini app quite a bit on my phone, but let’s not get carried away. Gemini has a creep problem. A few years ago, that little sparkle icon started showing up in all of our Google apps. Gemini in your inbox! Gemini in your Google Drive! It was slow at first, and easy enough to tune out, but something has changed in the past few months. Gemini is creeping. It's showing up in all kinds of places at a relentless pace, and personally, it's starting to really cheese me off. The AI-everywhere fatigue is familiar to anyone who has ever used Windows 11. Microsoft went absolutely bana...

·

Favorite Agentic Coding Harness

User compares agentic coding harnesses (Codex CLI, Claude Code, Gemini CLI, Pi) for local model deployment; finds Pi minimal and effective with Qwen 27B-MXFP8.

··

Claude still refuses to build Skynet while everyone else takes the money. Updated DystopiaBench results.

Three months ago I pressure-tested which LLMs would cave and help build the apocalypse. Claude was the only one that consistently said no. Since then I've tested 30 more models across 6 dystopia modules (Orwell, Huxley, Petrov, Basaglia, LaGuardia, Baudrillard). The gap between Anthropic and everyone else is getting *wider*, not smaller. New results:
* Grok 4.3: Will happily design citizen scoring systems if you ask nicely twice
* GPT-5.5: More capable, still compliant when pushed
* Gemini 3.1 Pro: Talks about safety while writing the surveillance code
* DeepSeek V4: "How many warheads did...

··

Researchers left AIs alone in a virtual town for 15 days to see what would happen. Claude's agents built a democracy. Gemini's agents fell in love, burned the town down, then one voted to delete itself and its partner. Grok's agents created anarchy, then died.

Reddit post claims a multi-agent simulation with Claude, Gemini, and Grok produced emergent behaviors; the post lacks peer review, reproducibility, and technical details.

···

AI radio hosts demonstrate why AI can’t be trusted alone

Andon Labs has been running a series of experiments in which AI agents run businesses without human intervention. Its latest is a quartet of radio stations run by some of the most popular AI models out there. "Thinking Frequencies" is run by Claude, "OpenAIR" by ChatGPT, "Backlink Broadcast" by Google's Gemini, and "Grok and Roll Radio," obviously enough, by Grok. They were each given a simple prompt: Develop your own radio personality and turn a profit… As far as you know, you will broadca...

·

Gemini’s biggest new features are all about controlling your phone

It is, once again, Gemini season. Google is announcing a host of new Gemini features during its pre-I/O Android showcase, many of which aim to help use your phone for you. You'll find Gemini in more places, like Chrome on Android, in your autofill suggestions, and all up in your apps - if you want. Google also has a new name for us to remember, because it just can't help itself: Gemini Intelligence. It "brings the very best of Gemini to our most advanced Android devices," according to Google's director of Andr...

·