The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Publishers will be able to opt out of AI Search, thanks to new regulation

U.K. regulators are requiring Google to offer a tool allowing website publishers to opt out of generative AI search features. The option will be tested in the U.K., then rolled out globally.

Sarah Perez·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SAID: Accelerating Diffusion-Based Language Models via Scaffold-Aware Iterative Decoding

Diffusion large language models (DLLMs) enable non-autoregressive generation by iteratively denoising corrupted token sequences with bidirectional context. Despite their ability to update multiple positions in parallel, inference remains costly due to the many denoising steps required for high-quality generation. We propose SAID, a Scaffold-Aware Iterative Decoding framework that accelerates DLLMs by reallocating computation across tokens. SAID first spends denoising computation on scaffold tokens to establish the coarse semantic structure, and then completes predictable detail tokens with fe...

Na Li·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Be Fair! Can Machine Learning Engineering Agents Adhere to Fairness Constraints?

Machine learning engineering (MLE) agents promise to automate end-to-end ML pipeline development from raw data and natural language instructions, potentially making ML accessible to non-technical domain experts. However, in sensitive and regulated domains, this abstraction creates a responsibility gap: end-users may lack visibility into design choices that affect correctness, robustness, fairness, and regulatory compliance. We argue that existing benchmarks are insufficient to assess whether MLE agents can be safely applied in such settings. We propose desiderata for a responsibility-centered...

Anna Richter·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Plan, Watch, Recover: A Benchmark and Architectures for Proactive Procedural Assistance

We envision a proactive multi-modal assistant system which gives users real-time step-by-step guidance on a procedural task, autonomously deciding when to interrupt, and how to coach. However, progress is limited by the absence of large-scale, cross-domain benchmarks that reflect realistic conditions, particularly the common case in which users deviate from the expected step sequence. We address this gap with four contributions: (1) we release EgoProactive, a large-scale wearable-egocentric dataset for proactive procedural assistance with explicit Out-of-Pl...

Kaustav Kundu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From Prompt to Process: a Process Taxonomy and Comparative Assessment of Frameworks Supporting AI Software Development Agents

AI tools for programming are no longer just autocomplete or chat assistants: they organize themselves as development frameworks, with process, roles, artifacts and verification. Recent surveys map agents and LLMs for software engineering, but a study centered on the operational frameworks that turn these capabilities into process is missing. We ran a directed search of primary sources, with a functional inclusion criterion and traction measurement, and selected six frameworks: GitHub Spec Kit, OpenSpec, BMAD Method, Get Shit Done (GSD), Spec Kitty and Reversa. Each attacks AI development thro...

Sanderson Oliveira de Macedo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SemBlock: Semantic Boundary Dynamic Blocks for Diffusion LLMs

Diffusion language models (DLMs) generate text through iterative denoising, and blockwise decoding improves their practicality by committing tokens in local blocks. However, existing blockwise methods typically rely on fixed block sizes or delimiter-based runtime signals, which do not necessarily align with semantic boundaries. In this paper, we propose SemBlock, a semantic-boundary-driven dynamic block decoding framework for diffusion LLMs. SemBlock formulates dynamic block construction as semantic boundary prediction and trains lightweight predictors on frozen LLaDA hidden states. To provid...

Xinrui Song·2 months ago

The Verge AI· PRESS

Microsoft and OpenAI broke up — now they’re ready to fight

At Microsoft's annual Build conference on Tuesday, the company announced a slew of new or expanded AI initiatives, including a super app, in-house reasoning models, a cybersecurity tool, and OpenClaw-esque AI agents. All this news added up to a clear message: Microsoft is positioned to be one of the biggest players in AI, and it's finally acting like it. For years, Microsoft's AI business leaned hard on its early and exclusive partnership with OpenAI. But the drama-filled marriage slowly devolved into a situationship, and the pair effectively separated in late April (though Microsoft is still...

Hayden Field·2 months ago

TechCrunch AI· PRESS

Meta’s AI agent for WhatsApp Business is now available globally

WhatsApp will charge businesses for using its AI agent based on token usage

Ivan Mehta·2 months ago

Ars Technica AI· PRESS

Inside Meta's attempts to play catch-up with AI

Doubts linger over whether Meta can close the gap with rivals.

Hannah Murphy, Financial Times·2 months ago

OpenAI· FRONTIER

Introducing new capabilities to GPT-Rosalind

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and experimental workflow capabilities.

OpenAI·2 months ago

TechCrunch AI· PRESS

Coralogix raises $200M on bet that someone needs to watch the AI agents

The Series F round values Coralogix at $1.6 billion and comes less than a year after its previous raise.

Jagmeet Singh·2 months ago

Anthropic· FRONTIER

Introducing the Services Track and Partner Hub of the Claude Partner Network

Anthropic·2 months ago

Google AI (Gemma)· FRONTIER

5 ways Google Search can level up your thrift and vintage shopping

Uncover second-hand scores with AI tools in Google Search and Shopping.

Megan Stoner·2 months ago

Hugging Face· INFRA

Direct Preference Optimization Beyond Chatbots

Hugging Face·2 months ago

Simon Willison· ANALYST

Uber Caps Usage of AI Tools Like Claude Code to Manage Costs

I wrote the other day about Uber blowing its 2026 AI budget in four months, and how that wasn't particularly surprising given they would have set that budget in 2025, before anyone could have predicted how popular token-burning coding agents were about to become. Natalie Lung for Bloomberg: The rideshare giant is limiting all employees to $1,500 in monthly token spending per AI coding tool, an Uber spokesperson said in response to a Bloomberg News inquiry. That means spending on one tool doesn’t have a bearing on the budget for anot...

Simon Willison·2 months ago

OpenAI· FRONTIER

How Wasmer used Codex to build a Node.js runtime for the edge

See how Wasmer used Codex with GPT-5.5 to build a Node.js runtime for the edge, accelerating development 10x to 20x and shipping in weeks instead of months.

OpenAI·2 months ago

Anthropic· FRONTIER

What we learned mapping a year’s worth of AI-enabled cyber threats

As AI transforms the nature of and methods behind cyberattacks, how well do the techniques and frameworks used by the security community hold up? In a new report, we seek to answer that question.

Anthropic·2 months ago

Stratechery· ANALYST

The Nvidia AI PC, Project Solara, Microsoft AI

The Nvidia AI PC feels like a relic of another AI era; Microsoft's vision for devices at Build was much more compelling.

Ben Thompson·2 months ago

OpenAI· FRONTIER

OpenAI public policy agenda

OpenAI outlines its public policy agenda for AI, including safety, youth protection, workforce transition, and global standards to ensure AI benefits society.

OpenAI·2 months ago

OpenAI· FRONTIER

A blueprint for democratic governance of frontier AI

OpenAI outlines a blueprint for U.S. governance of frontier AI, proposing a federal framework for safety, resilience, and national security.

OpenAI·2 months ago

The Verge AI· PRESS

AI has a water problem. Google thinks it has a fix

In the face of widespread backlash to the AI data center buildout throughout the US, Google is touting its efforts to minimize the environmental impact by actually increasing water for local communities. The company laid out five commitments around water use in a new blog post published Wednesday, including a goal to replenish more water than it uses at its data centers by 2030. Google also said it will invest in local water infrastructure, identify alternative water sources to power its facilities, and be transparent about its water use overall. "We're just one of dozens of players in the sp...

Lauren Feiner·2 months ago

The Verge AI· PRESS

Google must let publishers opt out of AI Search features, rules UK

Online publishers are getting more control over whether their websites appear in Google's AI Search features, thanks to a UK regulatory ruling. The new conduct rule imposed by the Competition and Markets Authority (CMA) requires Google to let website owners keep their content out of features like AI Overviews, and prevent it from being used for the "fine-tuning" of Google's AI models. "In a world first, publishers will now have effective tools to prevent their content being used to power AI features in search, such as AI Overviews," the CMA announced. "This will put publishers, like news orga...

Jess Weatherbed·2 months ago

Latent Space· ANALYST

[AINews] Microsoft Build: MAI-Thinking-1 and MAI Family models

Microsoft Build recap, and new MAI model technical details

Latent Space·2 months ago

Cohere· FRONTIER

Why more businesses choose private deployments of AI

Private deployment offers more peace of mind from data security risks. Learn how to tackle the complexities to launch successfully.

Cohere·2 months ago

Hugging Face· INFRA

Adding MCP Tools to Reachy Mini

Hugging Face·2 months ago

Cohere· FRONTIER

From data chaos to clarity: Unlock enterprise AI value

To optimize AI solutions, companies need a tailored data strategy that addresses common challenges such as quality, connectivity, and scaling.

Cohere·2 months ago

Cohere· FRONTIER

Coplot: Supporting the research process through visualization

A blog about how Cohere Labs built coplot, a data visualization tool that not only helps their releases, but also their research process.

Cohere·2 months ago

TechCrunch AI· PRESS

Cyera eyes $12B valuation at 80x ARR multiple despite operating losses

The cybersecurity company is nearing a $300 million round led by Evolution Equity Partners.

Marina Temkin·2 months ago

Simon Willison· ANALYST

Microsoft's new MAI models

Microsoft announced two new text LLMs this morning - MAI-Thinking-1 (reasoning, 35B parameters, available to "select early partners") and MAI-Code-1-Flash (5B parameters, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code"). I've not been able to try either of them just yet. It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now. They claim MAI-Thinking-1 "is preferred to Sonnet 4....

Simon Willison·2 months ago

Ars Technica AI· PRESS

Microsoft's Project Solara is an Android OS designed for agents instead of apps

Microsoft missed the boat on apps, so get ready for agents.

Ryan Whitwam·2 months ago
