The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


Etsy launches its app within ChatGPT as it continues its AI push

Etsy's new native app within ChatGPT aims to be a conversational shopping experience for users.

Lauren Forristal·2 months ago

EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics

EvoLM enables self-improvement in language models using co-evolved discriminative rubrics without external reward supervision.

Shuyue Stella Li·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On Adaptivity in Zeroth-Order Optimization

MEAZO: memory-efficient adaptive zeroth-order optimizer for LLM fine-tuning, outperforms ZO-Adam with scalar-only tracking.

Hassan Dbouk·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Memory-Efficient Continual Learning with CLIP Models

Distributionally robust continual learning method for CLIP models using dynamic per-class loss reweighting with small memory buffers.

Ryan King·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Quantifying the human visual exposome with vision language models

Vision language models quantify semantic richness of personal visual environments to predict mental health outcomes from 2674 participant photos.

Christian Rominger·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards

TraceLift: planner-executor framework trains LLM reasoning traces on executor-grounded rewards, not just final-answer correctness.

Tianyang Han·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MCJudgeBench: A Benchmark for Constraint-Level Judge Evaluation in Multi-Constraint Instruction Following

MCJudgeBench: benchmark for constraint-level evaluation of LLM judges in multi-constraint instruction following with per-constraint gold labels.

Jaeyun Lee·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Mechanical Conscience: A Mathematical Framework for Dependability of Machine Intelligence

Mathematical framework for dependability of distributed collaborative intelligence systems where locally correct decisions compose into unsafe global behaviors.

Munkhdegerekh Batzorig·2 months ago

r/OpenAI· COMMUNITY

Chatgpt shows his love of goblins

Anecdotal Reddit post about ChatGPT's conversational behavior; no technical substance or news value.

u/batrix03·2 months ago·50 pts / 10 comm

r/LocalLLaMA· COMMUNITY

<thinking></thinking>

Incomplete post with no content.

u/Comfortable-Rock-498·2 months ago·52 pts / 14 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

SOAR: Real-Time Joint Optimization of Order Allocation and Robot Scheduling in Robotic Mobile Fulfillment Systems

SOAR: real-time joint optimization of order allocation and robot scheduling for robotic mobile fulfillment warehouse systems.

Yibang Tang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Complex Equation Learner: Rational Symbolic Regression with Gradient Descent in Complex Domain

Complex-valued gradient descent for symbolic regression enables discovery of equations with singularities and domain constraints like division and logarithms.

Sergei Garmaev·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

On Computing Total Variation Distance Between Mixtures of Product Distributions

Randomized algorithm approximates total variation distance between mixtures of product distributions with polynomial-time complexity bounds.

Weiming Feng·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TRACE: A Metrologically-Grounded Engineering Framework for Trustworthy Agentic AI Systems in Operationally Critical Domains

TRACE: engineering framework for trustworthy agentic AI in critical domains combining reference architecture, trust metrics, and bounded human supervision.

Serhii Zabolotnii·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Domain Incremental Continual Learning Benchmark for ICU Time Series Model Transportability

Domain incremental learning benchmark for ICU time-series model transfer across hospitals with domain shift and patient data heterogeneity.

Ryan King·2 months ago

r/Anthropic· COMMUNITY

hello????

I literally just started a new chat for a project. The project has 3 Markdown files, around 200 lines each, and after just 4 messages I’ve already hit 75% of my Pro plan usage. Can someone tell me what the hell is going on?

u/richbaro23·2 months ago·10 pts / 30 comm

r/LocalLLaMA· COMMUNITY

Heretic 1.3 released: Reproducible models, integrated benchmarking system, reduced peak VRAM usage, broader model support, and more

Heretic 1.3 adds reproducibility, integrated benchmarking, reduced VRAM, and broader model support for model decensoring.

u/-p-e-w-·2 months ago·54 pts / 10 comm

The Verge AI· PRESS

OpenAI is reportedly launching a phone for ChatGPT

OpenAI's first hardware product might be a phone instead of a mysterious Jony Ive gadget. As reported by MacRumors, supply chain analyst Ming-Chi Kuo shared details about the rumored phone, claiming OpenAI is "fast-tracking" it and aiming to start mass production in early 2027. According to Kuo, the phone will run on a "customized version of the [MediaTek] Dimensity 9600," which is expected to launch this fall and follow up the Dimensity 9500 currently powering phones like the Vivo X300 Pro and the Oppo Find X9 Pro. The custom chip's "headline spec" will be its image signal processor (ISP), w...

Stevie Bonifield·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Reproducing Complex Set-Compositional Information Retrieval

Reproducibility study of neural retrievers on set-compositional queries; introduces LIMIT+ benchmark for constraint-satisfaction information retrieval.

Vincent Degenhart·2 months ago

r/singularity· COMMUNITY

New Boston Dynamics Atlas trick

Boston Dynamics Atlas demonstrates new physical capability; limited technical details available from social media post.

u/Distinct-Question-16·2 months ago·301 pts / 63 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Realizable Bayes-Consistency for General Metric Losses

Theoretical characterization of Bayes-consistency for learning with general metric losses in the realizable setting.

Dan Tsir Cohen·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models

RoboAlign-R1: reward-aligned post-training for robot video world models with stabilized long-horizon inference and RobotWorldBench evaluation.

Hao Wu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multimodal Learning on Low-Quality Data with Conformal Predictive Self-Calibration

Conformal Predictive Self-Calibration framework for multimodal learning handles modality imbalance and noisy corruption via predictive uncertainty.

Xun Jiang·2 months ago

r/singularity· COMMUNITY

Google’s AI architect, Demis Hassabis, lived rent-free in Elon Musk’s head

Reddit post claims Musk's fear of DeepMind CEO Hassabis motivated OpenAI founding; cites trial testimony about 2015 meeting.

u/Darqseyd·2 months ago·116 pts / 40 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

The Manokhin Probability Matrix: A Diagnostic Framework for Classifier Probability Quality

Manokhin Probability Matrix: diagnostic framework separating classifier calibration and discriminatory power via 2x2 archetype taxonomy.

Valery Manokhin·2 months ago

r/OpenAI· COMMUNITY

OpenAI’s new phone being fast-tracked to launch next year, per report

OpenAI reportedly planning smartphone launch for next year; unconfirmed hardware product outside core AI model development.

u/Cristiano1·2 months ago·51 pts / 29 comm

r/singularity· COMMUNITY

Hyundai Reportedly Demanding ‘Tens of Thousands’ of Boston Dynamics Robots ASAP

Hyundai reportedly seeks tens of thousands of Boston Dynamics robots for manufacturing deployment, signaling commercial robotics scaling.

u/Tkins·2 months ago·226 pts / 20 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Agentic-imodels: Evolving agentic interpretability tools via autoresearch

Agentic-imodels: autoresearch loop evolving interpretable data-science tools optimized for agent consumption rather than human readability.

Chandan Singh·2 months ago

r/OpenAI· COMMUNITY

OpenAI is working on a new "Personal Wiki" ("lore") in ChatGPT.

OpenAI developing persistent user context feature ('lore') for ChatGPT to maintain conversation history and preferences.

u/Distinct_Fox_6358·2 months ago·52 pts / 14 comm

TechCrunch AI· PRESS

Meta will use AI to analyze height and bone structure to identify if users are underage

The visual analysis system is now operating in select countries, but Meta says it's working toward a broader rollout.

Aisha Malik·2 months ago

