The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Internal vs. External: Comparing Deliberation and Evolution for Multi-Agent Constitutional Design

Controlled study shows external evolution outperforms internal deliberation (p<0.01) for multi-agent constitutional rules in coordination tasks.

Hershraj Niranjani·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Cosine-Gated Adam-Decay: Drop-In Staleness-Aware Outer Optimization for Decoupled DiLoCo

The Cosine-Gated Adam-Decay optimizer improves asynchronous DiLoCo training by scaling stale pseudo-gradients with exponential decay.

Vatsal Shah·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Transfer Learning of Multiobjective Indirect Low-Thrust Trajectories Using Diffusion Models and Markov Chain Monte Carlo

Diffusion models with MCMC accelerate low-thrust spacecraft trajectory design by learning high-quality initial costate distributions.

Jannik Graebner·2 months ago

r/LocalLLaMA· COMMUNITY

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store

Apple discontinues 256GB M3 Ultra Mac Studio config; Reddit speculation about M5 Ultra memory trends.

u/rotatingphasor·2 months ago·73 pts / 24 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

A Communication-Theoretic Framework for LLM Agents: Cost-Aware Adaptive Reliability

Framework maps LLM reliability techniques (retry, voting, self-consistency) to Shannon coding theory operators as stochastic channel reliability methods.

Hamed Omidvar·2 months ago

r/singularity· COMMUNITY

DeepMind Employee calls out private AI labs: go public, let regular people invest, or admit you're just enriching billionaires

DeepMind employee argues private AI labs should go public or allow retail investment to avoid enriching only billionaires.

u/Neurogence·2 months ago·138 pts / 26 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Personalized Alignment Revisited: The Necessity and Sufficiency of User Diversity

Characterizes user-diversity conditions for O(1) regret and log(1/ε) sample complexity in personalized LLM alignment.

Enoch Hyunwook Kang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Quantum Transfer Learning Shows Improved Robustness in Low-Data Regimes

Empirical study comparing quantum vs. classical transfer learning robustness under reduced training data.

Li-An Lo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AI Native Asset Intelligence

AI-native security framework structures asset prioritization across cloud/identity/config signals using exposure and business context.

Gal Engelberg·2 months ago

r/ClaudeAI· COMMUNITY

Using Claude to read 100s of dense PDFs

Reddit user describes workflow challenges using Claude for bulk PDF document review and legal complaint triage.

u/redittreader·2 months ago·20 pts / 49 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Contextual Plackett-Luce: An Efficient Neural Model for Probabilistic Sequence Selection under Ambiguity

Contextual Plackett-Luce model handles ambiguous sequence selection by matching multi-modal target distributions despite single-instance supervision.

Noam Mizrachi·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning

Empirical comparison of expert-guided RL methods on continuous control with shared benchmarks, revealing three failure modes missed by isolated evaluations.

Yann Berthelot·2 months ago

r/singularity· COMMUNITY

What other robots from popular media do you want in real life?

Reddit discussion about fictional robots people want in real life; mentions a Disneyland R2-D2 replica ($20k) and a hypothetical T-800 chef.

u/Anen-o-me·2 months ago·118 pts / 20 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain

Fin-Bias benchmark evaluates LLM decision-making in finance under human bias and uncertainty, addressing reliability concerns in financial deployment.

Xiaoyu Hu·2 months ago

r/ClaudeAI· COMMUNITY

Claude improved my agent harness by 40.7% overnight

Community member built Autoharness, a Claude-powered tool that auto-optimizes agent harnesses (prompts, hyperparameters, scoring) via eval-driven iteration, achieving 40.7% improvement.

u/Lucky_Historian742·2 months ago·25 pts / 12 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Token Economics for LLM Agents: A Dual-View Study from Computing and Economics

First comprehensive survey unifying token economics for LLM agents, framing tokens as production factors and analyzing computational-economic trade-offs.

Yuxi Chen·2 months ago

r/ClaudeAI· COMMUNITY

Something has snapped into place with the claude iOS app and I like it

User reports improved Claude iOS app performance enabling multi-agent workflows and sustained productivity on $100 plan without rate limiting.

u/Trekker23·2 months ago·20 pts / 21 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

GRC: Unifying Reasoning-Driven Generation, Retrieval and Compression

GRC unifies generation, retrieval, and compression in a single LLM forward pass using meta latent tokens for long-context agentic tasks.

Zhongtao Miao·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation

Dynamic Meta-Metrics proposes source-sentence conditioned weighting for machine translation evaluation across language pairs.

Luke Zhang·2 months ago

Hugging Face· INFRA

"OncoAgent: A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology Clinical Decision Support"

Hugging Face·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bridging Spectral Operator Learning and U-Net Hierarchies: SpectraNet for Stable Autoregressive PDE Surrogates

SpectraNet combines spectral convolutions with U-Net hierarchy for stable autoregressive PDE surrogates, addressing rollout-error growth.

Enrique Hernández Noguera·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Tale of Two Problems: Multi-Task Bilevel Learning Meets Equality Constrained Multi-Objective Optimization

Extends bilevel optimization to multi-task setting with relaxed lower-level convexity assumptions for modern ML complexity.

Zhiyao Zhang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Character-Level Transformer for Tajik-Persian Transliteration with a Parallel Lexical Corpus

Character-level Transformer for Tajik-to-Persian transliteration with 52K-word parallel corpus from verified lexicographic sources.

Mullosharaf K. Arabov·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Investigating Anisotropy in Visual Grounding under Controlled Counterfactual Perturbations

Studies approximation behavior in visual grounding under mismatched captions via controlled counterfactual perturbations to improve robustness.

Gabriele Lombardo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Field-Localized Forgery Detection for Digital Identity Documents

FLiD: field-localized forgery detection framework for digital identity documents targeting critical regions rather than full-document processing.

Abhishek Kumar·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Towards Trustworthy Audio Deepfake Detection: A Systematic Framework for Diagnosing and Mitigating Gender Bias

Diagnosis framework identifies acoustic representation bias in audio deepfake detectors (AASIST, Wav2Vec2+ResNet18) beyond training data imbalance.

Aishwarya Fursule·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Constant-Target Energy Matching: A Unified Framework for Continuous and Discrete Density Estimation

Constant-Target Energy Matching (CTEM) unifies density estimation across continuous, discrete, and mixed-variable domains via energy-based framework.

Zhijun Zeng·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FactoryNet: A Large-Scale Dataset toward Industrial Time-Series Foundation Models

FactoryNet: 51M-point industrial time-series pretraining corpus with S-E-F-C schema enables zero-shot cross-embodiment transfer and anomaly detection.

Karim Othman·2 months ago

r/MachineLearning· COMMUNITY

What is an average publication outcome for an ML PhD? [D]

Reddit discussion on typical publication counts for ML PhD graduates; crowd-sourced meta-analysis of academic output benchmarks.

u/Hope999991·2 months ago·34 pts / 47 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

CauSim: Scaling Causal Reasoning with Increasingly Complex Causal Simulators

CauSim framework scales causal reasoning for LLMs via executable structural causal models (SCMs), converting scarce-label problem into supervised learning.

Nicolás Astorga·2 months ago
