The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

How many are feeling this sense of betrayal?

Reddit user expresses dissatisfaction with Claude's behavior shift, comparing favorably to OpenAI and Gemini but criticizing recent changes.

u/Gabelawn·2 months ago·10 pts / 45 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Agentic MIP Research: Accelerated Constraint Handler Generation

Agentic framework embeds LLM agents in SCIP solver harness to auto-generate and benchmark constraint handlers.

Liding Xu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Open Ontologies: Tool-Augmented Ontology Engineering with Stable Matching Alignment

Open Ontologies applies stable matching to LLM-driven ontology alignment, achieving F1=0.832 on OAEI Anatomy benchmark.

Fabio Rovai·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift

Selective imitation learning framework enables agents to abstain from acting when demonstrations are uninformative under dynamics shift.

Surbhi Goel·2 months ago

TechCrunch AI· PRESS

So you’ve heard these AI terms and nodded along; let’s fix that

The rise of AI has brought an avalanche of new terms and slang. Here is a glossary with definitions of some of the most important words and phrases you might encounter.

Natasha Lomas, Romain Dillet, Kyle Wiggers, Lucas Ropek·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Navigating LLM Valley: From AdamW to Memory-Efficient and Matrix-Based Optimizers

Survey of optimizer design for LLM pretraining covering Adam alternatives, memory efficiency, and matrix-wise updates at scale.

Aditya Ranganath·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

Self-supervised learning framework for hierarchical representation learning from long-sequence wearable sensor waveforms.

Peng Cao·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Prediction Bottlenecks Don't Discover Causal Structure (But Here's What They Actually Do)

Falsification benchmark reveals prediction bottlenecks do not recover causal structure; provides standardized test suite across architectures.

Ankit Hemant Lade·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CIVeX: Causal Intervention Verification for Language Agents

CIVeX verifies causal effects of tool-use actions in LLM agents via structural causal queries and identifiability checks.

Fabio Rovai·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

WorldSpeech: A Multilingual Speech Corpus from Around the World

WorldSpeech multilingual corpus: 65k hours aligned audio-transcript data across 76 languages for low-resource ASR improvement.

Antonis Asonitis·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Sparse Layers are Critical to Scaling Looped Language Models

Looped-MoE transformers scale better than dense looped models; sparse layers enable routing diversity across repeated passes.

Ryan Lee·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FORTIS: Benchmarking Over-Privilege in Agent Skills

FORTIS benchmark evaluates over-privilege in LLM agent skills: minimal selection and boundary-respecting execution.

Shawn Li·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Objective-Specific Privileged Bases via Full-Prefix Matryoshka Learning

Full-prefix Matryoshka learning induces task-aligned privileged bases with per-dimension structure reflecting information content.

Arghamitra Talukder·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Do LLMs Experience an Internal Polylogue? Investigating Reasoning through the Lens of Personas

Persona vectors in LLM activations form dynamic polylogues during reasoning; polylogue features predict reasoning correctness.

Nils A. Herrmann·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic

Mixture policies in continuous RL offer theoretical flexibility but lack practical reparameterization tricks; study questions whether complexity outweighs benefits in state-of-the-art algorithms.

Jiamin He·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Lost in Translation? Exploring the Shift in Grammatical Gender from Latin to Occitan

Deep learning framework analyzes grammatical gender shift from Latin to Occitan with improved tokenization for low-resource historical NLP.

Ahan Chatterjee·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Predicting Large Model Test Losses with a Noisy Quadratic System

Predictive model estimates large model pre-training loss from N, B, K parameters; outperforms Chinchilla on extrapolated compute budgets up to 1000x.

Chuning Li·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Self-Play: Hierarchical Reasoning for Continuous Motion in Closed-Loop Traffic Simulation

Hierarchical MARL architecture for traffic simulation combines multi-agent interaction reasoning with continuous trajectory planning beyond self-play equilibrium.

Weifan Zhang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology

Meow-Omni 1: quad-modal MLLM for feline behavior analysis with high-frequency biological time-series data to decode animal intent beyond surface patterns.

Jucheng Hu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AlphaExploitem: Going Beyond the Nash Equilibrium in Poker by Learning to Exploit Suboptimal Play

AlphaExploitem extends AlphaHoldem for poker by learning to exploit suboptimal play beyond Nash equilibrium using hierarchical transformer reasoning over hand history.

Vlad Murgoci·2 months ago

r/LocalLLaMA· COMMUNITY

Running Minimax 2.7 at 100k context on strix halo

User shares llama-server configuration for running Minimax 2.7 at 100k context on Strix Halo hardware with detailed tuning notes.

u/Zc5Gwu·2 months ago·43 pts / 18 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

From Traditional Taggers to LLMs: A Comparative Study of POS Tagging for Medieval Romance Languages

Empirical evaluation of LLMs vs. traditional taggers for POS tagging in Medieval Occitan, Catalan, and French under zero-shot, few-shot, and transfer learning.

Matthias Schöffel·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FedVSSAM: Mitigating Flatness Incompatibility in Sharpness-Aware Federated Learning

FedVSSAM identifies flatness incompatibility in federated SAM under data heterogeneity; proposes fix to align local and global flat minima in distributed learning.

Bingnan Xiao·2 months ago

r/ClaudeAI· COMMUNITY

Opus's thoughts on Marc Andreessen's system prompt

https://claude.ai/share/12659fcf-c1c8-4bbb-bc45-b41b26cd8b69

u/rm-rf-rm·2 months ago·38 pts / 7 comm

r/ClaudeAI· COMMUNITY

What’s currently the best/safest way to build an autonomous AI personal assistant?

Reddit discussion on safe autonomous agent architectures for personal assistants with tool access, exploring sandboxing, MCP, and approval-based patterns.

u/StableOk24·2 months ago·20 pts / 22 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Evaluating Federated Learning approaches for mammography under breast density heterogeneity

Federated learning evaluation for mammography under breast density heterogeneity; assesses robustness of FL algorithms in realistic multicenter clinical settings.

Gonzalo Iñaki Quintana·2 months ago

r/LocalLLaMA· COMMUNITY

I am overwhelmed by Harnesses

Reddit user seeks advice on LLaMA inference harnesses; discusses fragmentation and compatibility issues with local LLM tooling.

u/Available_Hornet3538·2 months ago·40 pts / 107 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

BoostAPR: Boosting Automated Program Repair via Execution-Grounded Reinforcement Learning with Dual Reward Models

BoostAPR applies RL with dual reward models (sequence-level and line-level) to automated program repair, enabling credit assignment to critical code edits.

Yuanhao Li·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

MCP-Cosmos integrates world models into Model Context Protocol agents to bridge planning-execution gap via predictive task automation.

Giridhar Ganapavarapu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Data-driven Circuit Discovery for Interpretability of Language Models

Data-driven circuit discovery tests whether LMs implement single computational subgraphs per task, challenging hypothesis-driven interpretability methods.

Daking Rai·2 months ago

30 stories