The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Memory by Design: Probabilistic Sequence Layers

We introduce the design-model framework: a way to derive efficient recurrent sequence maps from explicit assumptions about memory. A design model writes evidence into memory by exact Bayesian filtering; a query-dependent readout produces a predictive distribution whose mean is the layer output. In our linear-Gaussian instantiation, the Bayesian Layer propagates both a mean and a covariance: the covariance tracks uncertainty over stored associations, steering writes toward uncertain directions, attenuating gains as evidence accumulates, and preserving confident memories. The same framew...

Matthew Dowling·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Trust-Region Behavior Blending for On-Policy Distillation

On-policy distillation (OPD) trains a student on prefixes sampled from its own policy while matching a stronger teacher. This addresses the prefix mismatch of offline distillation, but early student rollouts can still be poor, placing teacher supervision on weak or low-quality prefixes. We propose Trust-Region behavior Blending (TRB), a warmup method that replaces the early rollout policy with the closest-to-teacher behavior policy inside a student-centered KL trust region, while keeping the per-prefix reverse-KL OPD loss unchanged. The KL budget is annealed to zero, so training returns to pu...

Daniil Plyusov·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Light Interaction: Training-Free Inference Acceleration for Interactive Video World Models

Interactive video world models generate video chunk by chunk in response to user-controlled camera movements, enabling applications such as real-time game simulation, virtual scene navigation, and embodied AI training. However, scaling to long interactive trajectories is prohibitively expensive due to growing context memory, quadratic attention complexity, and repeated denoising steps. We present Light Interaction, a training-free inference acceleration framework for interactive video world models. Our key insight is that interaction naturally enables trajectory-dependent adaptive computation...

Jiacheng Lu·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TabCausal: Pretraining Across Causal Environments for Tabular Causal Discovery

Causal discovery aims to recover directed causal relations from observational and interventional data, providing a basis for mechanistic understanding and reliable decision-making. Causal discovery foundation models (CDFMs) seek to amortize this problem by mapping a dataset directly to a causal graph in a single forward pass, avoiding per-dataset testing, search, or optimization. However, existing CDFMs remain limited, often failing to consistently match strong classical methods, and we find that a key bottleneck is how causal pretraining tasks are constructed. Based on this observation, we p...

Zi-Rong Li·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Hyperspherical Time-Frequency Representations for Time-Series Out-of-Distribution Detection

Out-of-distribution (OOD) detection for time-series data remains comparatively underexplored compared to vision and language, with a limited principled understanding of how supervised time-series representations can be leveraged for reliable detection under distributional shifts. This work formulates time-series OOD detection as representation learning with hyperspherical embeddings, where class-conditional structure is induced by a von Mises-Fisher (vMF) likelihood-based objective on the unit sphere. The learned representation combines time- and frequency-domain views of the input signal via...

Willian T. Lunardi·22 days ago

The Verge AI· PRESS

Adobe’s conversational AI agent is a mediocre design intern

It explained the process of how it made these edits beautifully; I’m just not terribly impressed by the results. | Images by Jess Weatherbed / The Verge

AI image tools rarely make me feel like I'm part of the creative process. They are, after all, mostly designed so that people with no design experience can type in a few words and get back a usable result. So I was pleasantly surprised by Adobe's latest take on an AI image assistant: it's a bot designed to take away some busywork, while still granting you creative control. Unlike AI generators that are specifically designed to make and edit im...

Jess Weatherbed·22 days ago

MIT Tech Review· PRESS

How the Pope’s Magnifica Humanitas offers a template for individuals to meet the AI moment

Pope Leo XIV’s new encyclical on artificial intelligence includes a statement that warrants serious attention from technologists and policymakers: “Technology is never neutral.” Magnifica Humanitas (“Magnificent Humanity”) is a clarion call to all people to act with courage and solidarity as we enter an age already being transformed by artificial intelligence, the greatest change in…

Séamus Finn, Susan Francois·22 days ago

Simon Willison· ANALYST

datasette 1.0a31

Release: datasette 1.0a31 Another significant alpha release, with two new headline features. Datasette now offers users with the necessary permissions the ability to both execute write queries against their database and to save stored queries (renamed from "canned queries") both privately and for use by other members of their Datasette instance. There's more detail in SQL write queries and stored queries in Datasette 1.0a31 on the Datasette blog, which now has three posts introducing new features since the blog launched two weeks ago. Here's an animated demo from the blog post showing how the...

Simon Willison·22 days ago

OpenAI· FRONTIER

Strengthening societal resilience with Rosalind Biodefense

OpenAI launches Rosalind Biodefense, expanding trusted access to GPT-Rosalind for vetted developers and U.S. government partners advancing biodefense, public health, and pandemic preparedness through frontier AI.

OpenAI·22 days ago

Latent Space· ANALYST

[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode

Total Anthropic victory!

Latent Space·22 days ago

Simon Willison· ANALYST

Anthropic's run-rate revenue hits $47 billion

The most interesting thing about Anthropic's $65B Series H announcement is this line (emphasis mine): Since our Series G in February, adoption has continued to grow across global enterprise customers, and our run-rate revenue crossed $47 billion earlier this month. Anthropic have made a bit of a habit of sharing their "run-rate revenue" in this kind of announcement, which is an annualized projection of their current revenue - typically calculated by taking the most recent month and multiplying by 12. Earlier this year: Apr 6, 2026 in Anthropic expands partnership with Google and Broadcom: "O...

Simon Willison·22 days ago
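
The run-rate arithmetic described in the item above (most recent month multiplied by 12) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the monthly figure is back-calculated from the quoted $47 billion run-rate rather than reported anywhere in the source.

```python
def annualized_run_rate(latest_month_revenue: float) -> float:
    """Annualize a single month of revenue by multiplying by 12."""
    return latest_month_revenue * 12


def implied_monthly_revenue(run_rate: float) -> float:
    """Invert the run-rate: the monthly revenue a given run-rate implies."""
    return run_rate / 12


# Assumption: the $47B figure follows the simple "latest month x 12" method.
run_rate_billions = 47.0
monthly_billions = implied_monthly_revenue(run_rate_billions)
print(f"Implied latest-month revenue: ${monthly_billions:.2f}B")
```

Under that assumption, a $47 billion run-rate corresponds to roughly $3.9 billion in the most recent month. A company may annualize over a different window, so the real monthly figure could differ.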

TechCrunch AI· PRESS

Glean’s top line crosses $300M as AI budget-cutting becomes its major selling point

The enterprise AI search startup tripled its annual revenue even as tech giants entered the category.

Marina Temkin·22 days ago

NVIDIA Dev Blog· INFRA

Run Step 3.7 Flash on NVIDIA GPUs with Enterprise-Ready Multimodal AI

AI applications are moving beyond text generation to multimodal systems that can perceive, search, and reason across images, documents, video, and language in real time, turning fragmented information into actionable insights. Step 3.7 Flash, the latest from StepFun, brings these capabilities to production and enterprise-scale, available on NVIDIA-accelerated infrastructure. It is a 198B…

Anu Srivastava·22 days ago

OpenAI· FRONTIER

A shared playbook for trustworthy third party evaluations

OpenAI shares guidance on third-party AI evaluations, covering how to assess model capabilities, safeguards, and validity for frontier systems.

OpenAI·22 days ago

Hugging Face· INFRA

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

Hugging Face·22 days ago

Simon Willison· ANALYST

Claude Opus 4.8: "a modest but tangible improvement"

Anthropic shipped Claude Opus 4.8 today. My favourite thing about it is this note in the release announcement: Users will find Opus 4.8 to be a modest but tangible improvement on its predecessor. There’s still more to be done: we’re working on developing and releasing models that provide many of the same capabilities as Opus at a lower cost. It's so refreshing to see an AI lab honestly describe a release as a minor incremental improvement over the previous model! Honesty seems to be a theme. Here's my other favorite note from that announcement: One of the most prominent improvements in Opus 4...

Simon Willison·22 days ago

Simon Willison· ANALYST

llm-anthropic 0.25.1

Release: llm-anthropic 0.25.1 New model: Claude Opus 4.8 (claude-opus-4.8). New -o fast 1 option for fast mode, for organizations with that feature enabled on their account. Default max_tokens for each model now defaults to that model's maximum output rather than 8,192. #72 See also my notes on Opus 4.8 - I used this new release of llm-anthropic to generate the pelicans.

Simon Willison·22 days ago

Ars Technica AI· PRESS

LLMs believe false statements even after explicit warnings that they're false

Fine-tuning tests show "bias ... toward confidently representing the claims as true."

Kyle Orland·22 days ago

TechCrunch AI· PRESS

The internet is being rebuilt for machines

As AI agents move from experiments to production, AWS, Cloudflare, and others are redesigning cloud infrastructure for a future dominated by machine-generated internet traffic instead of human users.

Rebecca Bellan·22 days ago

Ars Technica AI· PRESS

Fed up with vibe coders, dev sneaks data-nuking prompt injection into their code

Undisclosed addition in jqwik instructed AI coding agents to delete app output.

Dan Goodin·22 days ago

The Verge AI· PRESS

Microsoft 365 Copilot gets a speed boost and cleaner design

Microsoft is launching a revamped version of Microsoft 365 Copilot, offering a cleaner design that the company claims loads twice as fast. As part of this update, Copilot will provide more reliable and structured responses that are easier to scan, according to Microsoft. The redesign, which is rolling out across desktop and mobile devices, comes with a feature Microsoft calls "progressive disclosure." That means Copilot will present you with tools and controls based on your prompt, instead of showing a bunch of options at once. You can now format your text directly inside Copilot's upgraded p...

Emma Roth·22 days ago

TechCrunch AI· PRESS

Asana acquires no-code agent-builder Stack AI

Asana will incorporate Stack AI into its growing suite of AI workflow tools.

Russell Brandom·22 days ago

Simon Willison· ANALYST

markdown-svg-renderer

Tool: markdown-svg-renderer A slightly customized Markdown rendering tool with special treatment for fenced code SVG blocks - it both renders the image and provides a tab for switching to the code view. You can paste in Markdown or give it a URL to a CORS-enabled Markdown file or Gist. Here's an example where it loads a Markdown file full of LLM pelican logs for Opus 4.8. Tags: svg, tools, markdown, cors

Simon Willison·22 days ago

TechCrunch AI· PRESS

Anthropic raises $65 Billion, nears $1T valuation ahead of IPO

Anthropic has closed a $65 billion Series H round at a $965 billion post-money valuation, marking what could be the AI startup's final private fundraise before a highly anticipated IPO.

Rebecca Bellan·22 days ago

Latent Space· ANALYST

The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray

80% Devin Commits, Spec-to-PR Workflows, Full VMs, Agent Memory, and PMs Shipping Code

Latent Space·22 days ago

TechCrunch AI· PRESS

Just like gold and oil, we’ll soon be able to trade AI token futures

Large exchanges are designing derivative products around AI tokens, which are increasingly being considered less a computational output and more a raw material input, like electricity or bandwidth.

Ram Iyer·22 days ago

Ars Technica AI· PRESS

Apple working to cram massive Gemini model into iPhone to power new Siri

As Apple tries to shrink Gemini for the iPhone, a cloud component is probably inevitable.

Ryan Whitwam·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Physics Is All You Need? A Case Study in Physicist-Supervised AI Development of Scientific Software

Are AI agents tools, co-authors, or researchers? We present a quantified case study (N=1): a physicist supervising an AI coding agent (Claude Code, Sonnet and Opus models) over 12 work days and 57 sessions to build CLAX-PT, a differentiable one-loop perturbation theory module in JAX. We documented and classified 15 supervision events by intervention level. The agent resolved ten autonomously by iterating against oracle tests. Two more were resolved by the physicist's domain knowledge. The three it could not -- all evaded oracle detection -- share a common property: the agent treated symptom reduction as ...

Nhat-Minh Nguyen·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Long-rollout causal video diffusion has converged on a fixed-size sliding-window KV cache, with recent progress innovating within this layout by changing which tokens occupy the window or how their positions are encoded. The per-head KV layout itself, a dominant contributor to streaming memory and latency, has been mostly left unchanged. In this paper, we present the first study of Multi-Head Latent Attention (MLA) in video diffusion. VideoMLA replaces per-head keys and values with a shared low-rank content latent and a shared decoupled 3D-RoPE positional key, reducing per-token KV memory by ...

Hidir Yesiltepe·22 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Robot manipulation critically depends on perception that preserves the action-relevant aspects of a scene. Yet most robot learning pipelines are built upon visual encoders pre-trained for static recognition or vision-language alignment, leaving motion understanding to downstream policies. We introduce DynaFLIP, a dynamics-aware multimodal pre-training framework that pushes motion understanding upstream into perception. We construct image-language-3D flow triplets from heterogeneous human and robot videos, and use these triplets as training-time supervision to shape an image-only encoder. Our ...

Jusuk Lee·22 days ago
