The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Citation Grounding: Detecting and Reducing LLM Citation Hallucinations via Legal Citation Graphs

Large language models systematically hallucinate legal citations -- fabricating statute references, citing repealed provisions, and confusing jurisdictions -- yet no automated method exists to measure or reduce this behavior at scale. We propose citation grounding (CG), a metric that verifies LLM-generated legal citations against a ground-truth citation graph extracted from 100.8 million Ukrainian court decisions (502 million edges, 21,736 unique statute nodes). CG decomposes into three components -- citation precision (does the cited provision exist?), citation relevance (is it contextually ...

Volodymyr Ovcharov·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Tiny Recursive Models for Solving the J2-Perturbed Lambert Problem

This paper presents a fast, recursive neural solver for the J2-perturbed Lambert problem based on Tiny Recursive Models (TRM), termed the TRM-Perturbed Lambert (TRM-PL) model. TRM is a weight-shared architecture whose effective capacity emerges from iteration depth rather than parameter count: a compact reasoning module is applied repeatedly within a two-level latent hierarchy, refining a candidate departure velocity by simulating the J2 trajectory and correcting it from the resulting tracking error. This unifies initial-guess generation and iterative correction in a single, end-to-end differ...

Minduli Wijayatunga·20 days ago

Simon Willison· ANALYST

Running Python ASGI apps in the browser via Pyodide + a service worker

Datasette Lite is my version of Datasette that runs entirely in the browser using Pyodide in WebAssembly. When I first built it four years ago I used Web Workers and code that intercepts navigation operations and fetches the generated HTML by running the Python app. This worked, but had the disadvantage that any JavaScript in <script> tags would not be executed - breaking some Datasette functionality and a whole lot of Datasette plugins. This morning I set Claude Opus 4.8 the task (in Claude Code for web) of figu...

Simon Willison·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

An Exploratory Study into using Machine-Learning for Fast Step-by-step Emulation of Numerical Mechanical Thrombectomy Simulations for Ischemic Stroke

The treatment of ischemic stroke using mechanical thrombectomy involves difficult decisions under intense time constraints. Numerical physics simulations can in theory inform operators to make better decisions regarding treatment approaches and device selection, but are too slow to do so in practice. In this thesis, we investigate if current machine learning based surrogates can accurately emulate these simulations in a step-by-step manner while making them significantly faster. To do this we train three surrogate models on two simulations that involve a simplified aspiration procedure, with ...

Thijs Stessen·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Lightweight Hybrid MLP-Based Framework for Real-Time Phishing URL Detection Using Structural URL Features

Phishing attacks remain a major cybersecurity threat, exploiting deceptive URLs to steal sensitive user information. Traditional blacklist and rule-based detection approaches are reactive and often fail to identify newly emerging phishing URLs. This paper proposes a lightweight hybrid framework for real-time phishing URL detection that combines blacklist-based screening with a Multi-Layer Perceptron (MLP) classifier operating solely on structural URL features. The framework extracts 16 URL-derived features capturing structural, domain-based, and security-related characteristics without requir...

Uche Unoke Emmanuel·20 days ago

Simon Willison· ANALYST

I Am Retiring from Tech to Live Offline

I've seen a lot of posts on forums from people threatening to quit their careers over AI. This is not one of those: Chad Whitacre is taking concrete steps, starting with this typewritten, scanned letter: I'm retiring from tech. Well, "retiring" is euphemistic. I'm stepping away from tech, and that includes Open Source. [...] AI was the last straw. Have you heard of that island off India where the indigenous population kills any outsiders fool-hardy enough to land? They are doing the rest of us a favor by preserving a way of life we may need again someday...

Simon Willison·20 days ago

Simon Willison· ANALYST

Quoting Daniel Jalkut

My take on AI is, essentially, everybody who’s against it is too against it and everybody who’s for it is too for it. - Daniel Jalkut, via John Gruber

Simon Willison·20 days ago

TechCrunch AI· PRESS

‘What a joke’: GitHub Copilot’s new token-based billing spurs consternation among devs

The golden age of Microsoft's GitHub Copilot appears to be at an end.

Lucas Ropek·20 days ago

TechCrunch AI· PRESS

Meta is reportedly developing an AI pendant

Meta seems to be making big bets on AI-powered hardware.

Anthony Ha·20 days ago

TechCrunch AI· PRESS

I put Google’s 24/7 AI assistant Gemini Spark to work, and it’s actually pretty useful

Gemini Spark helps automate everyday tasks, from inbox summaries to local event planning, but it’s unclear why Google made it a separate product.

Sarah Perez·20 days ago

TechCrunch AI· PRESS

The groupthink boom: what 3 top VCs really think about the AI frenzy

"If you're 22 years old in San Francisco and building something in AI, there may be a seed term sheet in your inbox — but if you're 19, oh my God, this means you're really good; you might already have a Series A [offer]," said one, half-kiddingly.

Connie Loizos·20 days ago

TechCrunch AI· PRESS

As the browser wars heat up, here are the hottest alternatives to Chrome and Safari in 2026

We’ve compiled an overview of some of the top alternative browsers available today aiming to challenge Chrome and Safari.

Lauren Forristal·20 days ago

The Verge AI· PRESS

How one founder’s bet on ‘the old school web’ is paying off

Craig Campbell walked away from the river of investor money flowing into AI to create, of all things, a website. Sure, Campbell probably could have started an AI company. He's a former engineer at Meta and an experienced tech founder who in 2022 sold his last venture - an e-commerce tool for businesses that use Shopify - right as the AI boom was booming. "I had my prior VC investors breathing down my neck, going 'start something else. We'll write you a blank check.'" He had other ideas. People generally aren't rushing to get into the website busin...

Allison Johnson·20 days ago

The Verge AI· PRESS

AI grifters are creating fake Black people to sell Shein junk

Aliyah, a light-skinned Black woman dressed in country-western gear, is struggling to sell metal buckles she handmade on TikTok. In a video for the social media platform from March, she cries to the camera and pleads for views: "Even as a black woman, I have more faith that white women will stay 13 seconds [on this video] to save my belt buckle business," the onscreen text reads. She wipes a tear off her cheek. But Aliyah isn't real, and neither are her supposedly handmade products - she's one of many AI-generated influencer...

Nicole Froio·20 days ago

The Verge AI· PRESS

The SpaceX IPO is great for Elon Musk and terrible for you

I haven't seen anything as stupid as the WeWork IPO document in a very long time - that is, until Elon Musk filed to take SpaceX public. WeWork was a joke. SpaceX is a threat. And if Musk and his bankers have their way, you are going to be their bagholder. Lots of the top-line details leaked long before the S-1 filing itself became public. There's the rumored valuation of more than $1 trillion. That's despite the nearly $5 billion in losses last year. The total addressable market (TAM) for SpaceX - the amount of revenue SpaceX thi...

Elizabeth Lopatto·20 days ago

Latent Space· ANALYST

[AINews] Founders and Forward Deployed Engineers

a quiet day lets us highlight the new AIE WF focuses

Latent Space·21 days ago

NVIDIA Dev Blog· INFRA

DynoSim: Simulating the Pareto Frontier

Modern LLM serving is hard to tune because each deployment is a stack of interacting choices: model backend, tensor-parallel shape, prefill/decode split, worker counts, scheduler settings, routing policy, KV cache behavior, autoscaling thresholds, and topology. Those choices interact across layers, and a local improvement can shift the bottleneck somewhere else. For larger models…

Yongming Ding·21 days ago

TechCrunch AI· PRESS

Coders are refusing to work without AI — and that could come back to bite them

While AI is helping coders produce code faster, it may not be producing better code, researchers warn. And that could cause problems down the road for them.

Julie Bort·21 days ago

Google AI (Gemma)· FRONTIER

Take our I/O 2026 quiz, vibe coded in Google AI Studio.

We used Google AI Studio to vibe code a quiz about our top I/O 2026 announcements.

Zahra Thompson·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models

Connector-based video unified models have demonstrated strong capability in instruction-grounded video synthesis, but integrating a large high-fidelity generator into the unified training loop is computationally prohibitive, limiting achievable visual quality. We therefore propose Lumos-Nexus, a training-efficient unified video generation framework that facilitates the development of strong reasoning-driven generation capabilities while significantly enhancing visual fidelity. Lumos-Nexus adopts a two-stage design: 1) During training, only a lightweight generator is aligned with the understan...

Jiazheng Xing·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KLIP: localized distribution shift detection via KL-divergence with diffusion priors in Inverse Problems

Diffusion models have shown promising performance as data-driven priors for computational imaging, as well as some capacity to detect out-of-distribution (OOD) images. However, existing approaches to OOD detection often require some knowledge of the shifted distribution, fail to detect subtle or localized distribution shifts, and operate on full images, rather than the indirect measurements available in inverse problems. We propose an OOD detection metric based on the Kullback-Leibler divergence between the diffusion prior and the posterior distribution, that (i) does not require any calibrat...

Alireza Kheirandish·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Tight Theory of Error Feedback Algorithms in Distributed Optimization

Communication costs are a major bottleneck in distributed learning and first-order optimization. A common approach to alleviate this issue is to compress the gradient information exchanged between agents. However, such compression typically degrades the convergence guarantees of gradient-based methods. Error feedback mechanisms provide a simple and computationally cheap remedy for this issue, but numerous variants have been proposed, and their relative performance remains poorly understood. This paper provides tight convergence analyses for two of the main error-feedback algorithms from the l...

Daniel Berg Thomsen·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Stateful Online Monitoring Catches Distributed Agent Attacks

Language models can find thousands of severe software vulnerabilities, and agents are increasingly being misused for cyberattacks. To avoid detection, attackers frequently distribute their misuse, splitting a harmful task across many user accounts so each individual transcript looks benign. Because safety monitors score only one agent context at a time, they are structurally blind to misuse that is only visible in aggregate, across many accounts. We show this gap is real by building, to our knowledge, the first distributed agent attack, a multi-agent scaffold that completes hard cybersecurity...

Davis Brown·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TunerDiT: Training-free Progressive Steering of Diffusion Transformer for Multi-Event Video Generation

Text-to-video (T2V) generation faces challenging questions when generating videos with long horizons containing multiple events. Inspired by the intrinsics of the diffusion process, we probe video diffusion transformers (DiTs) and uncover intrinsic turning points in the DiT denoising trajectory where conditioning text affects generation from global layout to fine-grained details. Building on this finding, we present TunerDiT, a simple yet effective progressive steering method that requires no additional training for multi-event generation. TunerDiT comprises two steering handles: (1) Event-Pa...

Ruotong Liao·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

Grasping the semantics of rare constructions (form-meaning pairings) has been shown to be a challenging problem that has currently only been solved by the largest LLMs. It remains an open question if open-source models have robust constructional understanding, and if so, what learning dynamics underlie the acquisition of this knowledge. Focusing on a set of rare Paired-Focus constructions in English (e.g. "let alone", "much less"), we construct a novel dataset to test their meanings using both scalar adjectival semantics and general world knowledge. Testing a wide range of models differing in...

Wesley Scivetti·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Long-context reasoning remains a central challenge for large language models, which often fail to locate and integrate key information in extensive distracting content. Reinforcement learning with verifiable rewards (RLVR) has shown promise for this task, yet existing methods are limited by low-confusability distractors and sparse, outcome-only reward signals that cannot supervise intermediate reasoning steps. To address these issues, we introduce LongTraceRL. For data construction, we generate multi-hop questions via knowledge graph random walks and leverage search agent trajectorie...

Nianyi Lin·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Choosing the Lens: Strategic Perspective Activation in Context-Dependent Argumentation

The same arguments often need to be evaluated under different external regimes. An agent with influence over the regime has a strategic lever that standard formalisms do not directly capture. We introduce context-dependent argumentation frameworks (CDAFs), an extension of Dung's theory in which a defeat function determines, per context, which attacks succeed. A perspective-labeled specialisation derives the defeat function from a relevance set $\rho$ and a priority $\pi$. The relevance set is the agent's action space. In a small worked example, the agent's target argument is rejected under every f...

Albert Sadowski·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Giving Sensors a Voice: Multimodal JEPA for Semantic Time-Series Embeddings

Transformer-based architectures have advanced sequence modeling in language and vision, yet general-purpose representation learning for heterogeneous multivariate time series remains underexplored. We introduce CHARM (Channel-Aware Representation Model), which incorporates channel-level textual descriptions into a Transformer encoder equivariant to channel order. CHARM is trained with a Joint Embedding Predictive Architecture (JEPA) and a novel loss promoting informative, temporally stable embeddings; latent-space prediction encourages robustness to sensor noise while description-aware gating...

Utsav Dutta·21 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics

Scalable information retrieval testing needs corpora that are large enough to stress index construction, ranking latency, query routing, and evaluation tooling, yet human-judged test collections remain expensive and may be unavailable when documents are private or still under design. This paper introduces SPECTRA, a reproducible framework for generating synthetic text corpora and retrieval test collections through a separation of latent topical structure, surface text realization, metadata controls, query intent generation, and deterministic relevance oracles. The framework is intended as a d...

Eric Liang·21 days ago

The Verge AI· PRESS

Tech companies desperately want to film you doing chores

This week, an AI training startup called Shift said it would clean New Yorkers' homes for free. It has plans to expand into other cities as well, including London, and looking around my flat, I get the appeal. But there's a catch. There's always a catch. In exchange for the cleaning, Shift wants footage of its cleaners at work: scrubbing dishes, wiping counters, dusting tables, mopping floors. It wants everything. Video of all the boring domestic labor we'd happily outsource if we could - and that robotics companies are racing to teach machines to do so they can sell us something to do it for...

Robert Hart·21 days ago

