The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Structured Coupling for Flow Matching

SCFM combines flow matching with structured latent variables to learn interpretable generative models with better quality than unstructured baselines.

Xavier Sumba·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FactoryBench: Evaluating Industrial Machine Understanding

FactoryBench evaluates LLMs and time-series models on industrial robotic telemetry using Pearl's causal ladder and LLM-as-judge scoring.

Yanis Merzouki·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Differentially Private Auditing Under Strategic Response

Differential privacy audits of AI systems fail against strategic developers; bilevel game analysis shows non-affine approval functions are necessary but enable evasion.

Florian A. D. Burnat·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting

Autonomous agent oversight reveals endogeneity: non-affine approval functions needed to screen dishonest agents violate truthful reporting conditions.

Lauri Lovén·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Debiased Counterfactual Generation via Flow Matching from Observations

Flow matching for counterfactual generation exploits shared support and tail behavior between observational and intervention distributions.

Hugh Dance·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Quotient Semivalues for False-Name-Resistant Data Attribution

Quotient semivalue mechanism defends Shapley/Banzhaf data valuation against false-name manipulation via pseudonymous identity clustering.

Florian A. D. Burnat·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Direction-Preserving Number Representations

Geometric framework analyzes directional accuracy of low-precision number representations in ML vector operations.

Bardia Zadeh·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Stochastic Transition-Map Distillation for Fast Probabilistic Inference

STMD distills diffusion model transition maps for faster probabilistic inference without teacher supervision.

George Rapakoulias·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Not All Tokens Learn Alike: Attention Entropy Reveals Heterogeneous Signals in RL Reasoning

Attention entropy analysis reveals token-level RL post-training redundancy and heterogeneous learning signals in LLM reasoning.

Gengyang Li·2 months ago

OpenAI· FRONTIER

Running Codex safely at OpenAI

OpenAI documents sandboxing, approvals, network policies, and telemetry for safe Codex deployment in agent workflows.

OpenAI·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Towards Billion-scale Multi-modal Biometric Search

Bharat ABIS describes a billion-scale multimodal biometric search system for national identity using fingerprint, face, and iris matching.

Arka Koner·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reliable Chain-of-Thought via Prefix Consistency

Prefix consistency weights CoT traces by regeneration stability, improving LLM reasoning accuracy without log-probability access.

Naoto Iwase·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Operating Within the Operational Design Domain: Zero-Shot Perception with Vision-Language Models

Vision-language models enable zero-shot ODD perception for autonomous systems compliance with safety-critical regulations.

Berkehan Ünal·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Large-Scale Modular Addition with an Auxiliary Modulus

Covariate-shift-free method scales modular addition learning via auxiliary modulus without training-test distribution mismatch.

Hanato Kikuchi·2 months ago

The Verge AI· PRESS

Nanoleaf bets its future on robots, red light therapy, and AI

Nanoleaf teased a trio of new products focused on embodied AI as it looks to move its brand beyond smart lighting. Smart lighting company Nanoleaf has been unusually quiet recently. While competitors such as Govee and Philips Hue have been pumping out new products and innovative features at an impressive pace, Nanoleaf has launched just a handful of smart lighting products in the last two years. There's a reason for this lull: the company has been going through a "brand evolution" focused on wellness, robotics, and, of course, AI. "The smart home is getting kind of boring,"...

Jennifer Pattison Tuohy·2 months ago

r/Anthropic· COMMUNITY

Do You Scramble to Finish Your Usage Limits?

Reddit user discusses token usage limits and mentions Opus 4.7 as a way to use up remaining quota before the reset.

u/2ndL·2 months ago·10 pts / 11 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation

Few-shot LLM scoring (GPT-5.2) on short answers shows mid-range degradation on partial-credit responses without task-specific adaptation.

Abigail Victoria Gurin Schleifer·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MAVEN: Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing

MAVEN multi-agent framework adds in-step epistemic verification and adversarial skepticism to LLM reasoning chains for high-stakes tasks.

Yinsheng Yao·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation

LithoBench introduces first domain-specific benchmark for evaluating multimodal LLMs on geological lithology interpretation from remote-sensing imagery.

Jun Wang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Tacit Knowledge Extraction via Logic Augmented Generation and Active Inference

Paper proposes logic augmentation and active inference for extracting tacit procedural knowledge into machine-interpretable representations.

Lorenzo Lamazzi·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

Decentralized multi-agent pathfinding solver using local communication and learned coordination for scalable multi-robot trajectory planning.

Valeriy Vyaltsev·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multi-Dimensional Evaluation of LLMs for Grammatical Error Correction

Benchmark evaluates latest LLMs on grammatical error correction across edit precision, fluency, and meaning retention with reference-free metrics.

Adnan Labib·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Robust stochastic first order methods in heavy-tailed noise via medoid mini-batch gradient sampling

Stochastic first-order optimization method using medoid mini-batch gradient sampling for heavy-tailed noise without explicit clipping.

Manojlo Vukovic·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Post-training makes large language models less human-like

Psych-201 dataset reveals post-training reduces LLM behavioral alignment with humans, with divergence widening in newer model generations.

Marcel Binz·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Inference Time Causal Probing in LLMs

HDMI: probe-free causal intervention method steers LLM hidden states via gradient-based margin maximization without auxiliary classifiers.

Sadegh Khorasani·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents

PhoneSafety benchmark (700 examples) distinguishes genuine safety understanding from task failure in phone-use agents via fine-grained outcome categorization.

Zhengyang Tang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Is She Even Relevant? When BERT Ignores Explicit Gender Cues

Checkpoint-level analysis of gender bias formation in Dutch BERT trained from scratch, tracing emergence of morphological gender information.

Jonas Klein·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Intent-Driven Semantic ID Generation for Grounded Conversational News Recommendation

Intent-driven Semantic ID generation for conversational news recommendation bridges implicit user intents unaddressable by standard RAG pipelines.

Hongyang Su·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Nürnberg NLP at PsyDefDetect: Multi-Axis Voter Ensembles for Psychological Defence Mechanism Classification

Ensemble methods for psychological defence mechanism classification via orthogonal voter axes; shared task at BioNLP 2026.

Philipp Steigerwald·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SAM 3D Animal: Promptable Animal 3D Reconstruction from Images in the Wild

SAM 3D Animal: multi-animal 3D reconstruction from single images using SMAL+ parametric model and prompt-based disambiguation.

Xuyi Hu·2 months ago

30 stories