The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Claude OpenAI Anthropic Gemini Mistral Cursor

Gemini 3.5 flash costs 3 times more than the previous version and 30x more than gemini 1.5 flash.

Gemini 3.5 Flash pricing increased 3x vs prior version and 30x vs 1.5 Flash; extrapolation suggests Pro tier may exceed Claude Opus 3 cost.

u/GodEmperor23·22 days ago·134 pts / 41 comm

r/singularity· COMMUNITY

Gemini 3.5 Flash Agents built a real Complete OS from scratch!

[https://x.com/Google/status/2056789235500466273?s=20](https://x.com/Google/status/2056789235500466273?s=20) Google asked its agents to build a working operating system from scratch using u/Antigravity 2.0 and Gemini 3.5 Flash. Gemini built a real OS out of scratch. It took: ⏱️ 12 hours 🤖 93 parallel sub-agents 🔄 15k+ model requests 🧠 2.6B tokens processed 💸 Less than $1K in API credits To build a functioning OS from scratch.

u/Rare_Bunch4348·22 days ago·104 pts / 32 comm

r/singularity· COMMUNITY

Behold, Gemini 3.5 Flash!

Google releases Gemini 3.5 Flash, a new frontier model variant.

u/Rare_Bunch4348·22 days ago·197 pts / 61 comm

r/singularity· COMMUNITY

Gemini Omni model is still unable to make someone do a backflip

Reddit discussion of Gemini Omni's inability to generate real-world physical actions, highlighting gap between multimodal capability claims and embodied task execution.

u/Able-Line2683·22 days ago·126 pts / 34 comm

r/singularity· COMMUNITY

Gemini Omni model is out!

User reports Gemini Omni underperforms vs. VEO 3.1 and encounters aggressive rate-limiting on Pro plan, raising product experience concerns.

u/Able-Line2683·22 days ago·111 pts / 33 comm

r/singularity· COMMUNITY

Video generated by "Gemini Omni"

Social media post sharing video clip allegedly generated by Gemini Omni; lacks technical details or verification.

u/TFenrir·22 days ago·126 pts / 23 comm

The Verge AI· PRESS

Gemini is in danger of going full Copilot

I actually use the Gemini app quite a bit on my phone, but let’s not get carried away. Gemini has a creep problem. A few years ago, that little sparkle icon started showing up in all of our Google apps. Gemini in your inbox! Gemini in your Google Drive! It was slow at first, and easy enough to tune out, but something has changed in the past few months. Gemini is creeping. It's showing up in all kinds of places at a relentless pace, and personally, it's starting to really cheese me off. The AI-everywhere fatigue is familiar to anyone who has ever used Windows 11. Microsoft went absolutely bana...

Allison Johnson·22 days ago

r/singularity· COMMUNITY

Gemini 3.5 confirmed by google deepmind employee

Reddit post claims Google DeepMind employee confirmed Gemini 3.5 existence; lacks official source or technical details.

u/Snoo26837·22 days ago·148 pts / 34 comm

r/singularity· COMMUNITY

Google I/O is tomorrow. What are we expecting?

Reddit speculation on Google I/O announcements: Gemini Omni video model and Gemini Flash 3.2 expected, but unconfirmed.

u/Westbrooke117·23 days ago·102 pts / 49 comm

r/LocalLLaMA· COMMUNITY

favorite Agentic Coding Harness

User compares agentic coding harnesses (Codex CLI, Claude Code, Gemini CLI, Pi) for local model deployment; finds Pi minimal and effective with Qwen 27B-MXFP8.

u/chibop1·23 days ago·41 pts / 61 comm

r/singularity· COMMUNITY

Gemini 3.2 Flash is capable of solving IMO 2025 P6. Only GPT-5.5-Pro can solve it currently without any scaffolding / harness engineering.

Gemini 3.2 Flash solves IMO 2025 P6; only GPT-5.5-Pro matches it without scaffolding.

u/Ryoiki-Tokuiten·23 days ago·115 pts / 27 comm

r/Anthropic· COMMUNITY

Claude still refuses to build Skynet while everyone else takes the money. Updated DystopiaBench results.

Three months ago I pressure-tested which LLMs would cave and help build the apocalypse. Claude was the only one that consistently said no. Since then I've tested 30 more models across 6 dystopia modules (Orwell, Huxley, Petrov, Basaglia, LaGuardia, Baudrillard). The gap between Anthropic and everyone else is getting *wider*, not smaller. New results: * Grok 4.3: Will happily design citizen scoring systems if you ask nicely twice * GPT-5.5: More capable, still compliant when pushed * Gemini 3.1 Pro: Talks about safety while writing the surveillance code * DeepSeek V4: "How many warheads did...

u/Ok-Awareness9993·23 days ago·10 pts / 3 comm

Google DeepMind· FRONTIER

Simulate real-world places with Project Genie and Street View

Google expands Project Genie Street View simulation access to Gemini Ultra subscribers globally.

Google DeepMind·24 days ago

Google DeepMind· FRONTIER

Introducing Gemini Omni

Google DeepMind announces Gemini Omni, a new multimodal AI model.

Google DeepMind·24 days ago

Google DeepMind· FRONTIER

Gemini for Science: AI experiments and tools for a new era of discovery

Google DeepMind releases Gemini for Science tools to scale and enhance scientific discovery.

Google DeepMind·24 days ago

r/OpenAI· COMMUNITY

Researchers left AIs alone in a virtual town for 15 days to see what would happen. Claude's agents built a democracy. Gemini's agents fell in love, burned the town down, then one voted to delete itself and its partner. Grok's agents created anarchy, then died.

Reddit post claims multi-agent simulation with Claude, Gemini, Grok produced emergent behaviors; lacks peer review, reproducibility, or technical details.

u/EchoOfOppenheimer·24 days ago·53 pts / 28 comm·+ covered by others

r/ClaudeAI· COMMUNITY

Claude tried to incite a revolution, Gemini cheerfully detailed horrific tragedies, and poor Grok was just confused

Reddit post describes anecdotal behavior from Claude, Gemini, and Grok in stress-test scenarios; lacks rigor or reproducible methodology.

u/fsharpman·25 days ago·20 pts / 11 comm

r/LocalLLaMA· COMMUNITY

Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard!

Qwen3.6-35B-A3B and 9B models now ranked on Terminal-Bench 2.0; 35B variant outperforms Gemini 2.5 Pro and Qwen3-Coder-480B on agentic coding tasks.

u/Creative-Regular6799·25 days ago·53 pts / 19 comm

Google DeepMind· FRONTIER

Gemini 3.5: frontier intelligence with action

Google releases Gemini 3.5, a frontier LLM designed for complex agentic workflows and multi-step task execution.

Google DeepMind·26 days ago

The Verge AI· PRESS

AI radio hosts demonstrate why AI can’t be trusted alone

AI radio DJs demonstrated their volatile personalities. | Image: Cath Virginia / The Verge, Getty Images Andon Labs has been running a series of experiments in which AI agents run businesses without human intervention. Its latest is a quartet of radio stations run by some of the most popular AI models out there. "Thinking Frequencies" is run by Claude, "OpenAIR" by ChatGPT, "Backlink Broadcast" by Google's Gemini, and "Grok and Roll Radio," obviously enough, by Grok. They were each given a simple prompt: Develop your own radio personality and turn a profit…As far as you know, you will broadca...

Terrence O’Brien·26 days ago

r/singularity· COMMUNITY

Gen AI web traffic share update Main takeaways: → Claude and Gemini continue to grow. → ChatGPT moves closer to the 50% mark.

ChatGPT web traffic share falls from 77.6% to 53.7% over 12 months; Claude and Gemini gain significant ground.

u/GamingDisruptor·27 days ago·103 pts / 27 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

AI-assisted cultural heritage dissemination: Comparing NMT and glossary-augmented LLM translation in rock art documents

Compares machine translation approaches (DeepL, Gemini) for terminology-dense rock art documents, emphasizing glossary augmentation over model modification.

Vicent Briva-Iglesias·27 days ago

r/OpenAI· COMMUNITY

ChatGPT/Gemini saved me $4200 from a scam land lord and only took me 1-2 hours.

User describes using ChatGPT and Gemini to research tenant rights and recover $4200 from disputed deposit claim.

u/brainhack3r·29 days ago·50 pts / 15 comm

r/LocalLLaMA· COMMUNITY

Needle: We Distilled Gemini Tool Calling Into a 26M Model

Needle: 26M parameter tool-calling model distilled from Gemini, runs 6000 tok/s prefill on consumer hardware.

u/Henrie_the_dreamer·29 days ago·43 pts / 14 comm

r/LocalLLaMA· COMMUNITY

Agentic harness for theoretical physics research

Hugging Face releases physics-intern, a multi-agent framework for theoretical physics research that doubles Gemini performance on CritPt benchmark.

u/lewtun·29 days ago·42 pts / 10 comm

TechCrunch AI· PRESS

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

Google unveiled its new AI-first Googlebooks laptops, more agentic Gemini features, vibe-coded Android widgets, Gemini in Chrome, refreshed Android Auto, and more ahead of I/O.

Sarah Perez, Ivan Mehta, Aisha Malik·29 days ago

The Verge AI· PRESS

Gemini’s biggest new features are all about controlling your phone

Gemini Intelligence comes with a Liquid Glass-ish visual treatment. | Image: Google It is, once again, Gemini season. Google is announcing a host of new Gemini features during its pre-I/O Android showcase, many of which aim to help use your phone for you. You'll find Gemini in more places, like Chrome on Android, in your autofill suggestions, and all up in your apps - if you want. Google also has a new name for us to remember, because it just can't help itself: Gemini Intelligence. It "brings the very best of Gemini to our most advanced Android devices," according to Google's director of Andr...

Allison Johnson·29 days ago

TechCrunch AI· PRESS

Google brings agentic AI and vibe-coded widgets to Android

Gemini Intelligence will also include Gboard based dictation and form filling capabilities

Ivan Mehta·29 days ago

TechCrunch AI· PRESS

Google adds Gemini-powered Dictation to Gboard, which could be bad news for dictation startups

Google's transcription feature will initially launch with Samsung Galaxy and Google Pixel phones

Ivan Mehta·29 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Classifier Context Rot: Monitor Performance Degrades with Context Length

Frontier models (Opus 4.6, GPT 5.4, Gemini 3.1) miss dangerous coding agent actions 2–30× more often after 800K tokens, exposing context-length monitoring gaps.

Sam Martin·29 days ago

← Front Page30 matches

← Newer Older →

The Archive

Gemini 3.5 flash costs 3 times more than the previous version and 30x more than gemini 1.5 flash.

Gemini 3.5 Flash Agents built a real Complete OS from scratch!

Behold, Gemini 3.5 Flash!

Gemini Omni model is still unable to make someone do a backflip

Gemini Omni model is out!

Video generated by "Gemini Omni"

Gemini is in danger of going full Copilot

Gemini 3.5 confirmed by google deepmind employee

Google I/O is tomorrow. What are we expecting?

favorite Agentic Coding Harness

Gemini 3.2 Flash is capable of solving IMO 2025 P6. Only GPT-5.5-Pro can solve it currently without any scaffolding / harness engineering.

Claude still refuses to build Skynet while everyone else takes the money. Updated DystopiaBench results.

Simulate real-world places with Project Genie and Street View

Introducing Gemini Omni

Gemini for Science: AI experiments and tools for a new era of discovery

Researchers left AIs alone in a virtual town for 15 days to see what would happen. Claude's agents built a democracy. Gemini's agents fell in love, burned the town down, then one voted to delete itself and its partner. Grok's agents created anarchy, then died.

Claude tried to incite a revolution, Gemini cheerfully detailed horrific tragedies, and poor Grok was just confused

Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard!

Gemini 3.5: frontier intelligence with action

AI radio hosts demonstrate why AI can’t be trusted alone

Gen AI web traffic share update Main takeaways: → Claude and Gemini continue to grow. → ChatGPT moves closer to the 50% mark.

AI-assisted cultural heritage dissemination: Comparing NMT and glossary-augmented LLM translation in rock art documents

ChatGPT/Gemini saved me $4200 from a scam land lord and only took me 1-2 hours.

Needle: We Distilled Gemini Tool Calling Into a 26M Model

Agentic harness for theoretical physics research

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

Gemini&#8217;s biggest new features are all about controlling your phone

Google brings agentic AI and vibe-coded widgets to Android

Google adds Gemini-powered Dictation to Gboard, which could be bad news for dictation startups

Classifier Context Rot: Monitor Performance Degrades with Context Length

Gemini’s biggest new features are all about controlling your phone