The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Apple to pay $250M to settle lawsuit over Siri's delayed AI features

Apple has agreed to pay $250 million to settle a class action lawsuit for overpromising the arrival of Siri's AI features.

Lauren Forristal·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learned Neighbor Trust for Collaborative Deployment in Model-Agnostic Decentralized Learning

Decentralized learning framework where heterogeneous nodes train learned neighbor-trust policies for collaborative inference deployment in IoT.

Michael Lanier·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Scalable inference of spatial regions and temporal signatures from time series

Spatial regionalization method using minimum description length principle to partition time-evolving domains without pre-specifying region count.

Jiayu Weng·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Uno-Orchestra: Parsimonious Agent Routing via Selective Delegation

Uno-Orchestra: unified LLM multi-agent orchestration policy that jointly learns task decomposition and worker selection via RL, benchmarked on 13 suites.

Zhiqing Cui·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Misaligned by Reward: Socially Undesirable Preferences in LLMs

Reward models fail to capture socially desirable preferences across bias, safety, morality, and ethics—exposing hidden alignment failures in LLM training.

Gayane Ghazaryan·2 months ago

r/Anthropic· COMMUNITY

I paused, went to eat, took shower, 1 prompt later, 45% (8 mins into a new session)

I was working on a project, got hungry, and went to eat and take a shower as my break. When I came back, the session was at 0%. I typed to Claude that the CSS animation needs to be slower and more subtle, he changed it, and usage was at 45%. Nowhere did it warn me that the cache might be cold or that I would be consuming a lot of tokens to CONTINUE a chat that I didn't close on the same PC. So now I have to slow down my work and wait for this 5 hour cycle to end to properly speed up my progress.

u/PaP3s·2 months ago·16 pts / 5 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Agentic Vulnerability Reasoning on Windows COM Binaries

SLYP agent discovers Windows COM privilege-escalation race conditions via agentic binary exploration and generates debugger-verified proof-of-concept exploits.

Hwiwon Lee·2 months ago

TechCrunch AI· PRESS

Ethos raises $22.75M from a16z for its expert network with voice onboarding

Ethos says it is onboarding 35,000 experts per week

Ivan Mehta·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation

Fine-tuning study on 25M-parameter transformer for jazz chord generation—domain adaptation via pop-to-jazz transfer learning.

Jinju Lee·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion

DualTCN physics-constrained TCN for marine electromagnetic inversion achieves 25% loss reduction over baselines.

Khaled Ahmed·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

Theoretical analysis shows adaptive agentic queries don't outperform fixed in-context queries under ReLU realizability constraints.

Anastasis Kratsios·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Federated Learning for Early Prediction of EV Charging Demand

Accurate forecasting of electric vehicle (EV) charging demand is critical for grid stability, infrastructure planning, and real-time charging optimization. In this work, we study the problem of early prediction of charging demand, where the total energy of a session is estimated using only information available at plug-in time and during the first minutes of charging. This enables actionable decisions while the session is still in progress, which is of direct importance for EV network operators. We construct a session-level dataset from the Adaptive Charging Network (ACN), combining session m...

Vasilis Perifanis·2 months ago

r/LocalLLaMA· COMMUNITY

None of this will ever get stolen

Reddit comment expressing skepticism about outdoor infrastructure installation due to theft concerns.

u/martin_xs6·2 months ago·65 pts / 77 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Self-Induced Outcome Potential: Turn-Level Credit Assignment for Agents without Verifiers

Long-horizon LLM agents depend on intermediate information-gathering turns, yet training feedback is usually observed only at the final answer, because process-level rewards require high-quality human annotation. Existing turn-level shaping methods reward turns that increase the likelihood of a gold answer, but they require answer supervision or stable task-specific verifiers. Conversely, label-free RL methods extract self-signals from output distributions, but mainly at the answer or trajectory level and therefore cannot assign credit to intermediate turns. We propose Self-Induced Outcome Po...

Senkang Hu·2 months ago

Anthropic· FRONTIER

Higher usage limits for Claude and a compute deal with SpaceX

Anthropic raises Claude usage limits and partners with SpaceX for compute infrastructure to expand capacity.

Anthropic·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Conceptors for Semantic Steering

Activation-based steering provides control of LLM behavior at inference time, but the dominant paradigm reduces each concept to a single direction whose geometry is left largely unexamined. Rather than selecting a single steering direction, we use conceptors: soft projection matrices estimated from activations pooled across both poles of a bipolar concept, which preserve the concept's full multidimensional subspace. A geometric analysis shows the bipolar subspace strictly subsumes the single-vector baseline. We further show that the conceptor quota provides a parameter-free layer-selection di...

Ilias Triantafyllopoulos·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On-line Learning in Tree MDPs by Treating Policies as Bandit Arms

A Tree Markov Decision Problem (T-MDP) is a finite-horizon MDP with a starting state $s_{1}$, in which every state is reachable from $s_{1}$ through exactly one state-action trajectory. T-MDPs arise naturally as abstractions of decision making in sequential games with perfect recall, against stationary opponents. We consider the problem of on-line learning in T-MDPs, both in the PAC and the regret-minimisation regimes. We show that well-known bandit algorithms -- \textsc{Lucb} and \textsc{Ucb} -- can be applied on T-MDPs by treating each policy as an arm. The apparent technical challenge in t...

Anvay Shah·2 months ago

TechCrunch AI· PRESS

At TechCrunch Disrupt 2026, all your M&A questions will be answered

Leaders from Coinbase, M13, and Mignano Law Group talk about how M&A is an early-stage strategy at TechCrunch Disrupt 2026. Register to hear this live.

TechCrunch Events·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Architectural Constraints Alignment in AI-assisted, Platform-based Service Development

AI-assisted development tools enable rapid prototyping of services but often lack awareness of architectural constraints, infrastructure dependencies, and organizational standards required in production environments. Consequently, generated artifacts may exhibit brittle behavior and limited deployability. We propose a retrieval-augmented scaffolding approach that combines platform-based code generation with agentic clarification loops to expose and resolve architectural constraint ambiguities. By combining template retrieval with structured interaction, the method embeds production-relevant c...

Julius Irion·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Why Expert Alignment Is Hard: Evidence from Subjective Evaluation

Study shows expert alignment in LLMs varies substantially by evaluator and task subjectivity; reveals tacit criteria and temporal inconsistency as core obstacles.

Tzu-Mi Lin·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Why Geometric Continuity Emerges in Deep Neural Networks: Residual Connections and Rotational Symmetry Breaking

Geometric continuity in deep networks explained by residual connections and symmetry-breaking nonlinearities coordinating weight updates across layers.

Kyungwon Jeong·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Skill Neologisms: Towards Skill-based Continual Learning

Skill neologisms—soft tokens optimized for new capabilities—enable selective LLM skill extension without catastrophic forgetting or context limits.

Antonin Berthon·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reliable Modeling of Distribution Shifts via Displacement-Reshaped Optimal Transport

ReshapeOT improves optimal transport for distribution shifts by reshaping ground metrics using observed sample displacements.

Philip Naumann·2 months ago

Simon Willison· ANALYST

Vibe coding and agentic engineering are getting closer than I'd like

Simon Willison observes convergence between vibe coding and agentic engineering in practical AI-assisted development workflows.

Simon Willison·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

TabEmbed introduces first generalist embedding model for tabular data and TabBench, a comprehensive benchmark for tabular understanding evaluation.

Minjie Qiang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EP-GRPO: Entropy-Progress Aligned Group Relative Policy Optimization with Implicit Process Guidance

EP-GRPO fixes credit assignment failures in GRPO-based LLM reasoning via token-level entropy, polarity-aware rewards, and zero-variance collapse mitigation.

Song Yu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Delving into Non-Exchangeability for Conformal Prediction in Graph-Structured Multivariate Time Series

Conformal prediction applied to graph-structured time series; addresses non-exchangeability via spectral graph theory for rigorous uncertainty quantification.

Ruichao Guo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels

KernelBench-X evaluates LLM-generated Triton GPU kernels across 176 tasks; finds task structure explains 3x more correctness variance than method design.

Han Wang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Order-based Rehearsal Learning

First order-based rehearsal learning method for avoiding undesired futures; uses ordinal structures instead of graph estimation.

Yu-Xuan Tao·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On the Influence of the Feature Computation Budget on Per-Instance Algorithm Selection for Black-Box Optimization

Study determines optimal feature computation budget fraction for per-instance algorithm selection in black-box optimization.

Koen van der Blom·2 months ago

