The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


How Elon Musk left OpenAI, according to Greg Brockman

Cutthroat negotiations between startup founders are rarely shared so publicly, especially when a company becomes as world-changing as OpenAI.

Tim Fernholz·2 months ago

r/ClaudeAI· COMMUNITY

What it means that Elon just rented out all his GPUs to Anthropic

Reddit speculation that Elon/xAI rented GPUs to Anthropic, interpreted as signal of competitive pressure and capacity constraints.

u/ContextCustodian·2 months ago·34 pts / 30 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Taming Outlier Tokens in Diffusion Transformers

Study identifies outlier tokens in Diffusion Transformers that attract disproportionate attention in image generation, affecting both encoder and denoiser layers.

Xiaoyu Wu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Implicit Representations of Grammaticality in Language Models

Research shows pretrained language models implicitly distinguish grammaticality from string probability through internal representations, despite surface statistics.

Yingshan Susan Wang·2 months ago

The Verge AI· PRESS

Mira Murati tells the court that she couldn’t trust Sam Altman’s words

Mira Murati, OpenAI's former CTO, has testified under oath that CEO Sam Altman lied to her about the safety standards for a new AI model. In a video deposition shown during the ongoing Musk v. Altman trial on Wednesday, Murati said Altman falsely stated that OpenAI's legal department determined a new AI model did not need to go through the company's deployment safety board. "As you understand it, was Mr. Altman telling the truth when he made that statement to you?" Murati was asked in the deposition. "No," Murati said. Murati said that during her tenure at OpenAI, Altman made her work more dif...

Jay Peters·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Grokability in five inequalities

Grok AI model discovered five new mathematical inequalities and bounds in convex geometry and combinatorics, verified by human authors.

Paata Ivanisvili·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Almost-Orthogonality in Lp Spaces: A Case Study with Grok

Mathematical analysis refuting Carbery's triangle inequality conjecture for Lp spaces with counterexample and sharp bounds on exponent.

Ziang Chen·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

LongSeeker proposes Context-ReAct paradigm for elastic context management in long-horizon search agents, maintaining trajectory at variable detail levels.

Yijun Lu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Sharp Capacity Thresholds in Linear Associative Memory: From Winner-Take-All to Listwise Retrieval

Theoretical analysis establishes sharp capacity thresholds for linear associative memory, showing d²∼n log n scaling for top-1 retrieval via phase transition.

Nicholas Barnfield·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Estimating the expected output of wide random MLPs more efficiently than sampling

Method estimates expected outputs of wide random MLPs without sampling by propagating activation distributions via cumulants and Hermite expansions.

Wilson Wu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Understanding In-Context Learning for Nonlinear Regression with Transformers: Attention as Featurizer

Theoretical framework explains transformers' in-context learning on nonlinear regression by showing attention mechanisms construct polynomial and spline bases.

Alexander Hsu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MRI-Eval: A Tiered Benchmark for Evaluating LLM Performance on MRI Physics and GE Scanner Operations Knowledge

MRI-Eval benchmark with 1365 items assesses LLM performance on MRI physics and GE scanner operations with tiered difficulty and diagnostic conditions.

Perry E. Radau·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Life Gives You BC, Make Q-functions: Extracting Q-values from Behavior Cloning for On-Robot Reinforcement Learning

Q2RL algorithm extracts Q-functions from behavior cloning for efficient offline-to-online robot learning, preventing policy collapse caused by distribution mismatch.

Lakshita Dodeja·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Design Conductor 2.0: An agent builds a TurboQuant inference accelerator in 80 hours

Design Conductor 2.0 autonomous agent builds hardware accelerators (TurboQuant) in 80 hours using frontier April 2026 models, demonstrating 80x capability scaling over prior work.

The Verkor Team·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The First Token Knows: Single-Decode Confidence for Hallucination Detection

First-token confidence (phi_first) from single greedy decode detects LLM hallucinations as effectively as multi-sample semantic self-consistency with lower computational cost.

Mina Gabriel·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Geometry-Aware State Space Model: A New Paradigm for Whole-Slide Image Representation

Geometry-Aware State Space Model applies hyperbolic geometry to whole-slide histopathology image analysis via Multiple Instance Learning, improving patch aggregation for gigapixel resolution.

Enhui Chai·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation

SemEval-2026 Task 9 system fine-tunes Gemma 3 (12B/27B) per-language with LoRA and GPT-4o-mini synthetic data augmentation for 22-language polarization detection.

Srikar Kashyap Pulipaka·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Aes3D: Aesthetic Assessment in 3D Gaussian Splatting

Aes3D proposes aesthetic assessment framework for 3D Gaussian Splatting, addressing composition and visual appeal evaluation beyond reconstruction fidelity.

Chuanzhi Xu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Superposition Is Not Necessary: A Mechanistic Interpretability Analysis of Transformer Representations for Time Series Forecasting

Sparse autoencoders reveal PatchTST uses non-superposed, task-specific representations for time-series forecasting, explaining competitiveness against simple linear models.

Alper Yıldırım·2 months ago

TechCrunch AI· PRESS

SpaceX may spend up to $119 billion on ‘Terafab’ chip factory in Texas

SpaceX, Elon Musk's space company that also houses his AI company, xAI, is considering spending $55 billion, at least initially, to build a semiconductor factory in Texas, according to a filing with Grimes County.

Ram Iyer·2 months ago·+ covered by others

TechCrunch AI· PRESS

DeepSeek could hit $45B valuation from its first investment round

In just a few weeks of talks, DeepSeek's potential valuation has reportedly soared from $20 billion to $45 billion.

Julie Bort·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

What Matters in Practical Learned Image Compression

Comprehensive study of learned image compression design choices balancing perceptual quality and runtime, introducing novel techniques for practical human-visual-system-optimized codecs.

Kedar Tatwawadi·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Human-AI Co-Mentorship in Project-Based Learning: A Case Study in Financial Forecasting

Case study of high-school/undergraduate students using AI tools for financial forecasting research, highlighting human-AI co-mentorship acceleration of learning outcomes.

Freyaa Chawla·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Executable World Models for ARC-AGI-3 in the Era of Coding Agents

Coding agent with executable Python world models, verification, and simplicity-bias refactoring solves 25 public ARC-AGI-3 games without task-specific logic.

Sergey Rodionov·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Low-Cost Black-Box Detection of LLM Hallucinations via Dynamical System Prediction

Koopman operator theory applied to LLM embeddings as dynamical system enables low-cost black-box hallucination detection without sampling or external retrieval.

Dan Wilson·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Transformed Latent Variable Multi-Output Gaussian Processes

T-LVMOGP framework scales Multi-Output Gaussian Processes to high-dimensional outputs via transformed latent variables.

Xiaoyu Jiang·2 months ago

r/ClaudeAI· COMMUNITY

Anthropic Just Secured a Reserve.

Anthropic secures partnership with SpaceX for 300MW+ compute at Colossus 1, adding 220k+ NVIDIA GPUs within one month.

u/DragonflyOk7139·2 months ago·27 pts / 11 comm

r/Anthropic· COMMUNITY

Double limits!!

Partnership with SpaceX: Anthropic just doubled the limits. Source: https://x.ai/news/anthropic-compute-partnership

u/Aromatic_Air7139·2 months ago·16 pts / 10 comm

Ars Technica AI· PRESS

Google DeepMind partners with EVE Online for AI model testing

Move comes as CCP Games spends $120M to go independent, rebrands as Fenris Creations.

Kyle Orland·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Joint Treatment Effect Estimation from Incomplete Healthcare Data: Temporal Causal Normalizing Flows with LLM-driven Evolutionary MNAR Imputation

CausalFlow-T applies DAG-constrained normalizing flows and LLM-driven imputation for treatment effect estimation in incomplete EHR data.

Olivia Jullian Parra·2 months ago

← Front Page · 30 stories
