The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

EnvFactory scales tool-use agents via synthetic executable environments and robust RL, addressing data and execution bottlenecks.

Minrui Xu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Distilling Tabular Foundation Models for Structured Health Data

Knowledge distillation transfers tabular foundation models to lightweight models on healthcare data via stratified out-of-fold labeling.

Aditya Tanna·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Normal Representations for Blood Biomarkers

Personalized blood biomarker interpretation via learned representations accounts for intra-patient variability and baseline deviation.

Aashna P. Shah·1 month ago

MIT Tech Review· PRESS

What to expect from Google this week

This story originally appeared in The Algorithm, our weekly newsletter on AI. When Google opens its doors tomorrow for its annual developer conference, I/O, it will do so as a clear third place in the foundation model race. A year ago, at Google I/O…

Grace Huckins·1 month ago

TechCrunch AI· PRESS

Elon Musk has lost his lawsuit against Sam Altman and OpenAI

Elon Musk's claim that he was mistreated by his OpenAI cofounders failed after nine California jurors decided in a unanimous verdict that his lawsuits had been filed too late.

Tim Fernholz·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications

PopPy auto-discovers parallelization opportunities in Python compound AI applications to reduce end-to-end latency.

Stephen Mell·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap

Ensemble study of six tabular foundation models reveals high redundancy (Q=0.961) limiting ensemble gains to +0.18% at 253× cost.

Aditya Tanna·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Can Adaptive Gradient Methods Converge under Heavy-Tailed Noise? A Case Study of AdaGrad

AdaGrad and adaptive gradient methods converge under heavy-tailed noise without explicit clipping via implicit noise robustness.

Zijian Liu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SkillGenBench: Benchmarking Skill Generation Pipelines for LLM Agents

SkillGenBench isolates skill generation as a benchmark task for LLM agents, measuring ability to create reusable executable skills from repositories.

Yifan Zhou·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Democratizing Large-Scale Re-Optimization with LLM-Guided Model Patches

Framework uses LLMs as OR experts to dynamically re-optimize operational models via natural-language interaction, addressing real-world constraint evolution.

Tinghan Ye·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Can machine learning for quantum-gas experiments be explainable?

Explores explainability of ML methods for quantum-gas physics experiments, focusing on image denoising and solid identification in cold-atom systems.

I. B. Spielman and J. P. Zwolak·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reversa: A Reverse Documentation Engineering Framework for Converting Legacy Software into Operational Specifications for AI Agents

Reversa framework converts legacy software into operational specs via multi-agent pipelines, enabling AI agents to safely modify existing systems.

Sanderson Oliveira de Macedo·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Quantifiable Visual Explanations Without Ground-Truth

Proposes metric for evaluating XAI methods via continuous input perturbation, assessing sufficiency and necessity without ground-truth labels.

Amritpal Singh·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Lance: Unified Multimodal Modeling by Multi-Task Synergy

Lance: lightweight unified multimodal model using dual-stream MoE architecture for image/video understanding, generation, and editing via multi-task training.

Fengyi Fu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

COOPO: Cyclic Offline-Online Policy Optimization Algorithm

COOPO framework cycles between offline RL training and online fine-tuning to mitigate distributional shift and catastrophic forgetting in hybrid settings.

Qisai Liu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Efficient Lookahead Encoding and Abstracted Width for Learning General Policies in Classical Planning

Improves GNN-based generalized planning policies using efficient lookahead encoding and abstracted width for classical planning domains.

Michael Aichmüller·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Generative AI Advertising as a Problem of Trustworthy Commercial Intervention

Argues that generative AI advertising enables undetected commercial intervention in LLM outputs, framing it as a problem of trustworthy intervention rather than content placement.

Jingyi Qiu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment

Position paper: safe LLM agent deployment requires three-layer probabilistic assume-guarantee architecture covering semantic, environmental, and dynamical constraints.

S. Bensalem·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Better Together: Evaluating the Complementarity of Earth Embedding Models

Proposes complementarity-based evaluation framework for Earth observation embedding models through fusion rather than isolation.

Thijs L van der Plas·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A No-Defense Defense Against Gradient-Based Adversarial Attacks on ML-NIDS: Is Less More?

Empirical study showing shallow DNNs with ReLU activation provide adversarial robustness in ML-based network intrusion detection without explicit defenses.

Mohamed elShehaby·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GIM: Evaluating models via tasks that integrate multiple cognitive domains

GIM benchmark of 820 problems evaluates LLMs on integrated multi-domain reasoning grounded in practical contexts, addressing saturation of existing benchmarks.

Rohit Patel·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Efficient and Noise-Tolerant PAC Learning of Multiclass Linear Classifiers

Theoretical PAC learning result: efficient algorithm for learning multiclass linear classifiers under malicious noise with marginal distribution assumptions.

Rita Adhikari·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AI for Auto-Research: Roadmap & User Guide

Comprehensive analysis of AI-assisted research across full lifecycle (Apr 2026): automated systems generate papers cheaply but fabricate results and miss errors.

Lingdong Kong·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KairosHope: A Next-Generation Time-Series Foundation Model for Specialized Classification via Dual-Memory Architecture

KairosHope time-series foundation model replaces attention with dual-memory HOPE block for specialized classification, integrating statistical knowledge.

Luis Balderas·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning

FedHybrid algorithm balances accuracy, privacy, and communication in differentially private federated learning via improved FedAvg initialization.

Arnab Auddy·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

Distills tabular foundation models into CPU-executable XGBoost/CatBoost via stratified OOF labeling, achieving <2ms latency for fraud detection.

Aditya Tanna·1 month ago

r/singularity· COMMUNITY

Schiff Proposes Bill Requiring Data Centers to Pay for Own Power

Sen. Schiff proposes legislation requiring data centers to cover full electricity costs, targeting AI infrastructure economics and energy demand.

u/SnoozeDoggyDog·1 month ago·102 pts / 32 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

An Assessment of Human vs. Model Uncertainty in Soft-Label Learning and Calibration

Controlled study on MNIST decouples human soft-label benefits from label correction, showing that capturing uncertainty improves calibration independently.

Maja Pavlovic·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Language-Switching Triggers Take a Latent Detour Through Language Models

Mechanistic analysis of language-switching backdoor in 8B LM: three-phase circuit where Latin trigger redirects English→French via attention and subspace propagation.

Francis Kulumba·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Post-Trained MoE Can Skip Half Experts via Self-Distillation

ZEDA enables post-trained MoE models to skip ~50% of experts via self-distillation, reducing inference cost without retraining.

Xingtai Lv·1 month ago
