The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

PFlash: 10x prefill speedup over llama.cpp at 128K on an RTX 3090

PFlash: speculative prefill technique achieves 10x speedup on 128K context with quantized 27B models on RTX 3090, open-source C++/CUDA implementation.

u/sandropuppo·2 months ago·68 pts / 17 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Deep Kernel Learning for Stratifying Glaucoma Trajectories

Deep kernel learning with transformer embeddings stratifies glaucoma patient risk from sparse EHR data; medical ML application without LLM/frontier AI component.

Bruce Rushing·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FinSafetyBench: Evaluating LLM Safety in Real-World Financial Scenarios

FinSafetyBench: bilingual red-teaming benchmark (14 subcategories) for evaluating LLM refusal of financial crimes and ethics violations grounded in real cases.

Yutao Hou·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning How and What to Memorize: Cognition-Inspired Two-Stage Optimization for Evolving Memory

MemCoE: cognition-inspired two-stage memory optimization for LLM agents to learn personalized long-term user preferences within context windows.

Derong Xu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

FedKPer: Tackling Generalization and Personalization in Medical Federated Learning via Knowledge Personalization

FedKPer addresses generalization/personalization in medical federated learning via knowledge personalization; healthcare ML infrastructure without LLM focus.

Zoe Fowler·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptive Querying with AI Persona Priors

Persona-induced latent variable model for adaptive user querying under budget constraints; ML methodology tangential to frontier LLM research.

Kaizheng Wang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

ML-Bench&Guard: policy-grounded multilingual safety benchmark (14 languages) aligning LLMs with region-specific regulations and cultural context.

Yunhan Zhao·2 months ago

r/Anthropic· COMMUNITY

Extreme hallucinations output from Opus today (1st May) and yesterday

Reddit user reports severe hallucinations and task non-compliance in Claude Opus 4.7 on May 1st; anecdotal complaint without reproduction details.

u/hamada147·2 months ago·17 pts / 11 comm

r/singularity· COMMUNITY

My dream of a fully generative game is getting pretty close to possible now. I made a demo where you can prompt any spell and fight online.

Developer demo of generative game engine using Gemini 3 for spell generation with 6-player multiplayer physics simulation.

u/VirtualJamesHarrison·2 months ago·111 pts / 26 comm

The Verge AI· PRESS

Pentagon strikes classified AI deals with OpenAI, Google, and Nvidia — but not Anthropic

The Pentagon has struck deals with OpenAI, Google, Microsoft, Amazon, Nvidia, Elon Musk's xAI, and the startup Reflection, allowing the agency to use their AI tools in classified settings, according to an announcement on Friday. At the same time, the Defense Department has left out Anthropic - which it previously used for classified information - after declaring it a supply-chain risk. This builds upon deals with OpenAI and xAI, which have already reached agreements with the Pentagon for the "lawful" use of their AI systems. A report from The Information suggests Google has struck a similar a...

Emma Roth·2 months ago

r/LocalLLaMA· COMMUNITY

GitHub - intel/auto-round: A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.

Intel releases AutoRound, a low-bit quantization algorithm optimized for CPU/XPU/CUDA with vLLM and Transformers compatibility.

u/muyuu·2 months ago·41 pts / 23 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Evaluating the Architectural Reasoning Capabilities of LLM Provers via the Obfuscated Natural Number Game

Obfuscated Natural Number Game benchmarks LLM prover architectural reasoning vs. pattern matching; evaluates formal theorem-proving capabilities beyond saturation.

Lixing Li·2 months ago

TechCrunch AI· PRESS

Musk v. Altman is just getting started

Elon Musk spent the better part of three days on the witness stand this week in his lawsuit against OpenAI, and it’s already getting messy. Emails, texts, and his own tweets are surfacing in court, and there are plenty more witnesses to come. Musk’s argument against OpenAI? By converting the company to a for-profit model, Sam Altman betrayed the “nonprofit for the […]

Theresa Loconsolo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Benchmarks: MathArena as an Evaluation Platform for Mathematics with LLMs

MathArena: continuously-maintained evaluation platform aggregating mathematics benchmarks to track LLM progress; successor to static math benchmarks.

Jasper Dekoninck·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning

Augmented Lagrangian Multiplier Network stabilizes state-wise constraint enforcement in RL; safety optimization methodology without LLM specificity.

Jiaming Zhang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

InpaintSLat: Inpainting Structured 3D Latents via Initial Noise Optimization

InpaintSLat: training-free 3D inpainting via initial noise optimization in latent diffusion; computer vision task orthogonal to LLM/frontier AI focus.

Jaeyoung Chung·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Spiking Sequence Machines and Transformers

Formalizes Phase-Latency Isomorphism showing spiking sparse distributed memory and transformers share five functional operations with cosine similarity retrieval.

Joy Bose·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reinforcement Learning with Markov Risk Measures and Multipattern Risk Approximation

Introduces mini-batch Markov risk measures and multipattern Q-learning with regret bounds for risk-averse finite-horizon MDPs.

Andrzej Ruszczynski·2 months ago

The Verge AI· PRESS

Elon Musk had a bad week in court

Elon Musk is the one who wanted this trial. He has spent months claiming OpenAI "stole a nonprofit," and saying he was the actual driving force behind one of the most important companies currently in tech. All indications are that he won't win his case against the company, but he's fighting it anyway. So you'd think he'd have done better when it was his time to take the stand. Instead, Musk spent much of the week arguing with lawyers ...

David Pierce·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

AdaMeZO enables Adam-style zeroth-order LLM fine-tuning without storing moment estimates, reducing GPU memory while maintaining convergence.

Zhijie Cai·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Budget Constraints as Riemannian Manifolds

Casts budget-constrained group assignment as Riemannian manifold optimization for mixed-precision quantization and expert selection.

Michael Helcig·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PEACE: Cross-modal Enhanced Pediatric-Adult ECG Alignment for Robust Pediatric Diagnosis

PEACE framework uses cross-modal alignment and curriculum learning for transfer of adult ECG models to pediatric diagnosis.

Xinran Liu·2 months ago

r/Anthropic· COMMUNITY

Are there Humans at Anthropic Support? Claude support is a joke: I paid €80, lost my work, and their AI refused to give me a human

I just went through one of the most infuriating support experiences with Claude / Anthropic, and I need to get this off my chest. I paid extra for Claude Design credits, about €80 worth, and used them to create actual designs I needed for work. Then those designs just vanished. Not “hard to find,” not “moved somewhere else”: gone. Completely disappeared after I paid for the service. I opened support and immediately asked for a refund or, at the very least, to speak to a human. What I got instead was Fin, the AI “agent,” which looped me endlessly through the same bullshit: “Try clearing cac...

u/Rob_Bob_you_choose·2 months ago·19 pts / 8 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

From Prediction to Practice: A Task-Aware Evaluation Framework for Blood Glucose Forecasting

Task-aware evaluation framework for blood glucose forecasting with event-level metrics addressing high-risk regimes in clinical decision support.

Alireza Namazi·2 months ago

The Verge AI· PRESS

Christian content creators are outsourcing AI slop to gig workers on Fiverr

In the beginning, platforms like Fiverr were places where people could hire freelancers to do specialized creative labor using skills that took years to develop. In the age of generative AI, though, many of these gig workers have embraced the technology in order to meet clients' demands. These workers' profiles emphasize that they can quickly (and cheaply) whip up images and videos of just about anything. But often, what their clients are looking for are dramatic animations inspired by the Christian Bible. On TikTok, YouTube, Instagram, and Facebook it is very easy to stumble across AI-genera...

Charles Pulliam-Moore·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Multimodal Energy-Based Model with Multimodal Variational Auto-Encoder via MCMC Revision

Combines multimodal energy-based models with VAE refinement via MCMC to improve inter-modal dependency capture in generative modeling.

Jiali Cui·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding

On-policy self-distillation for GUI grounding provides dense token-level supervision from single rollouts in autonomous agent GUI interaction.

Yan Zhang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bridging Graph Drawing and Dimensionality Reduction with Stochastic Stress Optimization

Adapts stochastic stress optimization from graph drawing to dimensionality reduction, replacing SMACOF with SGD-based methods.

Daniel Hangan·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Knowing when to trust machine-learned interatomic potentials

PROBE recasts MLIP uncertainty quantification as selective classification using frozen backbone embeddings for interatomic potential reliability.

Shams Mehdi·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Born-Qualified: An Autonomous Framework for Deploying Advanced Energy and Electronic Materials

Framework for autonomous materials discovery embedding manufacturability constraints to bridge lab-to-deployment gap.

Steven R. Spurgeon·2 months ago

30 stories