Vol. I · No. 61FRI, JUN 19, 2026
Archive

The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

How Optimality Structures Sparse Dictionaries: A Theory for Understanding SAE Representations

Sparse Autoencoders (SAEs) have found success parsing neural representations into interpretable concepts, providing a basis for understanding and control. However, what exactly SAEs extract, and, correspondingly, the scientific conclusions we can draw from them, are not obvious. Empirically, the proof is in the pudding: SAEs learn interpretable features. Theoretically, we lack a clear account of what properties a 'concept' must satisfy for an SAE to extract it. There has been extensive identifiability work studying the conditions under which sparse coding recovers ground-truth features; howev...

·

TabPrep: Closing the Feature Engineering Gap in Tabular Benchmarks

Progress in tabular machine learning has largely focused on increasingly sophisticated model architectures. At the same time, feature engineering remains a critical yet underexplored component of real-world modeling pipelines that is entirely absent from modern benchmarks, which creates an unquantified evaluation gap. In this work, we introduce TabPrep, a lightweight preprocessing pipeline composed of feature generators that are carefully designed to target three specific structural data patterns. We show that many widely used model classes exhibit predictable blind spots to these patterns an...

·

A Mathematical Conflict Framework for Contextual Data Modulation

In this study, a generalized operator-based mathematical conflict framework is presented to explicitly represent structural discrepancies between raw data and contextual data. The proposed structure treats conflict as a local, directional, and context-sensitive quantity, integrating components such as weighting, scale behavior, and output mapping under a unified abstract operator. Without being reduced to a specific learning algorithm or optimization method, the framework is defined as a general structure adaptable to different classes of problems. While existing approaches typically treat co...

·

Microsoft to unveil new AI models and Windows improvements at Build

Microsoft is heading to San Francisco this week in a bid to win back developers at its Build conference. I've been attending Build since the days when Microsoft called it the Professional Developers Conference, and I can't remember a more pivotal moment. As Microsoft continues to reshuffle its entire business around AI, it's moving Build into a smaller, more intimate venue. Trust in Windows and GitHub is at an all-time low, and this is Microsoft's chance to reconnect with developers and outline the future. Sources tell me that we'll hear about new AI models in Windows, a new reasoning model f...

·

AI is blowing up music. How should the Grammys handle it?

Today I’m talking with Harvey Mason Jr., who is CEO of the Recording Academy — that’s the outfit that puts on the Grammy Awards. I last talked to Harvey in 2024, when it was obvious that generative AI would upend the music industry, but still not exactly clear how that would happen. Well, it’s been 18 months since that conversation, and you’re going to hear Harvey say that AI is now “omnipresent” in music production. And Harvey knows what he’s talking about — he is himself a legendary producer who’s worked with everyone from Janet Jackson to Beyoncé. Harvey has said that every session he’s be...

·

Strava blames zero-code AI apps and scrapers as it tightens API access

The popular fitness-tracking platform, Strava, is restricting access to its API as part of efforts to clamp down on AI scraping, as reported earlier by TechCrunch. Developers who want to build an app using Strava's data now need to pay for a flat $11.99 / month subscription. In an update on its developer hub, Strava blames the change on "zero-code AI tools" that allow users to quickly create apps that "hammer" APIs. "We have felt this firsthand - developer applications to our program are up 448% year-to-date, API intermediaries have violated policy terms, and scraping attempts have degraded p...

·

OpenAI frontier models and Codex are now available on AWS

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement workflows they already use. Customers can get started with OpenAI on AWS and move faster from evaluation to production.

·

How to Post-Train Autonomous Vehicle Models in Closed-Loop with NVIDIA Alpamayo

Developing autonomous vehicle (AV) policies requires bridging an important gap between training and deployment. Vision-language-action (VLA) models that can... Developing autonomous vehicle (AV) policies requires bridging an important gap between training and deployment. Vision-language-action (VLA) models that can reason over more complex driving scenes and produce richer intermediate reasoning are predominantly trained in open-loop, where model outputs are directly compared to ground-truth behaviors without considering their effect on the environment. Source

·

May 2026 newsletter

I just sent out the May edition of my sponsors-only monthly newsletter . If you are a sponsor (or if you start a sponsorship now) you can access it here . This month: Al got expensive, and Anthropic had a really good month The model releases were a little disappointing Conferences and podcasts I launched Datasette Agent and made a lot of progress on Datasette What I'm using, May 2026 edition Miscellaneous extras Here's a copy of the April newsletter as a preview of what you'll get. Pay $10/month to stay a month ahead of the free copy! Tags: newsletter

·

Develop Physical AI Reasoning, World, and Action Models with NVIDIA Cosmos 3

Physical AI systems must understand the real world before they can act within it. Robots, autonomous vehicles, and smart spaces need to understand what's... Physical AI systems must understand the real world before they can act within it. Robots, autonomous vehicles, and smart spaces need to understand what’s happening in their world, predict what’s likely to happen next, and generate actions for specific environments, embodiments, and tasks. NVIDIA Cosmos 3 is a frontier foundation model for physical AI that combines physical reasoning… Source

·

Advancing AI Infrastructure for Agentic AI with NVIDIA DOCA In-Silicon Security

The AI era is driving a new class of infrastructure: AI factories that transform data into intelligence for autonomous AI agents operating at unprecedented... The AI era is driving a new class of infrastructure: AI factories that transform data into intelligence for autonomous AI agents operating at unprecedented scale. Powered by accelerated computing, AI factories enable enterprises to train, fine-tune, and deploy AI with greater speed and efficiency. This new class of infrastructure also introduces a fundamentally new attack surface spanning… Source

·

NVIDIA Vera CPU Sets a New Standard for Agentic Workloads in AI Factories

Each wave of AI has created a new scaling law. Pretraining scaled intelligence through larger datasets, more parameters, and massively parallel GPU systems.... Each wave of AI has created a new scaling law. Pretraining scaled intelligence through larger datasets, more parameters, and massively parallel GPU systems. Post-training scaled usefulness through instruction tuning, and re-balancing GPUs for generative inference. Test-time scaling improved reasoning by giving models more generated tokens for thinking. Now, agentic AI and reinforcement… Source

·

NVIDIA DSX OS Delivers Open, Modular Software for Operating AI Factories at Scale

AI is now essential infrastructure, powered by AI factories that generate intelligence in the form of tokens. As demand grows, these factories must scale... AI is now essential infrastructure, powered by AI factories that generate intelligence in the form of tokens. As demand grows, these factories must scale faster, operate more efficiently, and lower the cost of intelligence across the five-layer stack: energy, chips, infrastructure, models, and applications. NVIDIA DSX platform provides the complete playbook for designing, simulating, building… Source

·

datasette 1.0a32

Release: datasette 1.0a32 A minor bugfix release. Fixes a bug with INSERT ... RETURNING queries via the new /db/-/execute-write endpoint and a bunch of base_url issues which showed up when I was experimenting with Service Workers yesterday. Tags: datasette , annotated-release-notes

·

The solution might be cancelling my AI subscription

The solution might be cancelling my AI subscription I find this post by David Wilson very relatable. David lists 16+ projects he's spun up with AI tooling, and concludes: I didn't mean to build most of these things. Usually the Claude session started with something like " write a quick script for X ", and one hour later the result is not a quick script for X , nor in the usual case is my problem solved, whatever the original itch happened to be. On that last point, this technology is horrific for attention. It's a thermonuclear ADHD amplifier and I have seen the same effect in every single on...

·
30 stories