The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


arXiv (cs.AI/CL/LG)· ACADEMIA

ProfiliTable: Profiling-Driven Tabular Data Processing via Agentic Workflows

ProfiliTable: multi-agent framework using profiling-driven agentic workflows for table cleaning and transformation.

Wei Liu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts

LLM agent framework for post-hoc crop yield forecast correction using domain tools on strawberry and corn data.

Matthew Beddows·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Fill the GAP: A Granular Alignment Paradigm for Visual Reasoning in Multimodal Large Language Models

GAP framework fixes feature-space mismatch in multimodal LLM visual reasoning by aligning latent token generation with input embedding norms.

Yanting Miao·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Context Convergence Improves Answering Inferential Questions

Study shows passage convergence—how effectively hints eliminate wrong answers—improves LLM performance on inferential QA over retrieved answers.

Jamshid Mozafari·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MetaColloc: Optimization-Free PDE Solving via Meta-Learned Basis Functions

MetaColloc meta-learns neural basis functions offline to solve PDEs at test time without retraining, replacing optimization with collocation assembly.

Zichuan Yang·1 month ago

TechCrunch AI· PRESS

Threads tests a Meta AI integration that works similarly to Grok

The feature is designed to help people get real-time context about trends and breaking stories, as well as receive recommendations, all within conversations.

Aisha Malik·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Classifier Context Rot: Monitor Performance Degrades with Context Length

Frontier models (Opus 4.6, GPT 5.4, Gemini 3.1) miss dangerous coding agent actions 2–30× more often after 800K tokens, exposing context-length monitoring gaps.

Sam Martin·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning

QAP-Router frames NP-hard qubit routing as dynamic quadratic assignment, using RL to exploit logical-qubit interactions for quantum compilation.

Kien X. Nguyen·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries

Extends SAGA agentic AI governance framework to decentralized Byzantine-resilient setting, protecting against malicious distributed providers.

Matthew D. Laws·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Family of Quaternion-Valued Differential Evolution Algorithms for Numerical Function Optimization

Adapts differential evolution optimization to quaternion-valued search spaces, potentially improving model compactness and AI training efficiency.

Gerardo Altamirano-Gomez·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

MedHopQA benchmark evaluates LLM biomedical reasoning via multi-hop disease-centered questions, resisting answer-elimination and training data contamination.

Rezarta Islamaj·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From Message-Passing to Linearized Graph Sequence Models

Linearized Graph Sequence Models reframe graph message-passing as sequence modeling to decouple computational depth from propagation depth.

Joël Mathys·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

δ-mem: Efficient Online Memory for Large Language Models

δ-mem adds compact online associative memory to LLM backbones via delta-rule updates, enabling efficient long-context reuse in agentic systems.

Jingdi Lei·1 month ago

The Verge AI· PRESS

Parents say ChatGPT got their son killed with bad advice on party drugs

The family of a 19-year-old college student is suing OpenAI over claims that his conversations with ChatGPT led to an accidental overdose. In the lawsuit filed on Tuesday, Sam Nelson's parents allege ChatGPT "encouraged" the teen to "consume a combination of substances that any licensed medical professional would have recognized as deadly," resulting in his death. Though ChatGPT initially pushed back on conversations about drug and alcohol use, the launch of GPT-4o in April 2024 changed the chatbot's behavior, according to the lawsuit. Following the update, ChatGPT "began to engage and advise...

Emma Roth·1 month ago

r/LocalLLaMA· COMMUNITY

Let's build claude code from scratch!

Community member shares video tutorial and GitHub repo implementing a minimal Claude-like coding assistant from scratch.

u/RoyalMaterial9614·1 month ago·49 pts / 32 comm

r/singularity· COMMUNITY

On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for first time, significantly outperforms Opus 4.7

GPT-5.5 achieves first solve on ProgramBench hard/extreme tasks, substantially outperforming Claude Opus 4.7 on novel SWE benchmark.

u/socoolandawesome·1 month ago·197 pts / 29 comm

r/ClaudeAI· COMMUNITY

TUI to actually see what Claude Code is doing: cost, loops, tool commands…

Open-source TUI tool provides visibility into Claude Code agent loops, costs, and security issues; author reports $14K spend, 20% wasted iterations, and 3 credential leaks over 90 days.

u/WhichCardiologist800·1 month ago·21 pts / 14 comm

r/LocalLLaMA· COMMUNITY

1M datasets on HF !

Community milestone: 1M datasets published on Hugging Face, celebrated as progress for open-source AI.

u/qlhoest·1 month ago·60 pts / 10 comm

The Verge AI· PRESS

Sam Altman takes the stand in trial against Elon Musk

OpenAI CEO Sam Altman has begun his testimony against Elon Musk in a high-profile jury trial in a California federal courtroom. Altman, alongside OpenAI president Greg Brockman, is a primary defendant in the trial brought by Musk. Altman, Brockman, and Musk were all part of the initial founding team at OpenAI, with Musk investing up to $38 million in the ChatGPT-maker's early days. But the relationship between Musk and other OpenAI founders eventually soured, and Musk stepped away from the company, later going on to found his own direct competitor, xAI. In recent years, Musk and Altman have t...

Hayden Field·1 month ago

The Verge AI· PRESS

George Clooney, Tom Hanks, and Meryl Streep back new ‘Human Consent Standard’ for AI licensing

Hollywood actors and producers are standing behind a new AI licensing standard that will tell AI systems whether they'll need to pay to use a person's likeness, creative work, characters, and designs. With the Human Consent Standard, people can set terms for the use of their work or likeness, including giving AI systems full permission to use their content, allowing access with certain requirements, or restricting access entirely. The Human Consent Standard builds upon the Really Simple Licensing (RSL) Standard, which launched last year as a way for websites to signal how AI systems use their...

Emma Roth·1 month ago

The Verge AI· PRESS

Rivian’s AI-powered voice assistant is ready to roll

Rivian's AI-powered voice assistant is rolling out today to the company's vehicle fleet. The assistant will be available through a software update to all compatible Rivian Gen 1 and Gen 2 vehicle owners who subscribe to the company's Connect Plus cellular service, which costs $15 a month or $150 a year, or are in an active trial. First announced at last year's AI and Autonomy Day, the Rivian Assistant is powered by the company's Rivian Unified Intelligence, "a shared, multi-modal AI foundation" that is "interwoven" throughout the entire company. The assistant is deeply embedded in the vehicle...

Andrew J. Hawkins·1 month ago

r/Anthropic· COMMUNITY

Claude Code Seems Designed to Waste Your Tokens and Time.

So I thought I was assigning Claude a simple task of uploading a hero image from a folder within Cowork to a Wordpress page using the MCP connector. First it tried to compress the hell out of the image... nobody asked it to do that, then it went through all kinds of failed attempts to upload the image using bizarre methodologies. It rewrote the entire page content, despite me explicitly telling it to just edit the block in question.... *three times* in the same chat for other operations which were equally wasteful. It literally took 20 minutes and expended 60% of my tokens to perform this si...

u/mindspan·1 month ago·13 pts / 3 comm

OpenAI· FRONTIER

How finance teams use Codex

OpenAI case study on Codex adoption for finance workflows: MBRs, reporting, variance analysis, and planning scenarios.

OpenAI·1 month ago

r/ClaudeAI· COMMUNITY

Opus 4.7 Prompt Guidance Guide, anyone tried this?

Reddit discussion of an unverified prompting guide that claims to optimize Claude Opus 4.7 behavior; user reports are anecdotal, with no empirical validation.

u/kylecito·1 month ago·21 pts / 14 comm

r/LocalLLaMA· COMMUNITY

Local LLM autocomplete + agentic coding on a single 16GB GPU + 64GB RAM

Developer demonstrates local coding workflow using Qwen2.5-Coder-7B autocomplete and Qwen3.6-35B agentic model on RTX 5080 with RAM offloading.

u/grumd·1 month ago·40 pts / 31 comm

r/singularity· COMMUNITY

Demis Hassabis's Isomorphic Labs announces Series B investment round with $2.1B in new funding

Isomorphic Labs closes $2.1B Series B led by Temasek and Lowering the Bar to scale AI for drug discovery and protein folding.

u/TorturedPoet30·1 month ago·211 pts / 20 comm

r/Anthropic· COMMUNITY

We’re feeling cynical about xAI’s big deal with Anthropic

Link: techcrunch.com

u/ThereWas·1 month ago·12 pts / 3 comm

r/LocalLLaMA· COMMUNITY

MagicQuant (v2.0) - Hybrid Mixed GGUF Models + Unsloth Dynamic Learned Quant Configurations + Benchmark table with collapsed winners and more

MagicQuant v2.0: hybrid GGUF quantization pipeline with learned mixed-precision configs, benchmarked across architectures.

u/crossivejoker·1 month ago·50 pts / 23 comm

r/OpenAI· COMMUNITY

Plumbers, electricians, and HVAC techs watching AI replace everyone except them.

Reddit discussion on occupational displacement from AI, noting trade work remains less automatable than white-collar roles.

u/vinaykrkatiyar·1 month ago·107 pts / 40 comm

Google DeepMind· FRONTIER

Co-Scientist: A multi-agent AI partner to accelerate research

Google DeepMind introduces Co-Scientist, a multi-agent AI system built on Gemini to accelerate collaborative scientific research workflows.

Google DeepMind·1 month ago

30 stories
