The Wire

A live dispatch from every source on the network. Chronological, ranked, and refreshed continuously as stories break.

Live feed

100 stories

[AINews] Claude Opus 5: Fable-level performance at Opus price (half Fable)

Anthropic releases Claude Opus 5 matching Fable performance at half the cost, demonstrating efficiency gains in model distillation.

Latent Space·1 day ago

Simon Willison· ANALYST

Introducing Claude Opus 5

Anthropic releases Claude Opus 5, matching Fable 5 frontier performance at half the cost, now leading Artificial Analysis leaderboard.

Simon Willison·2 days ago

Simon Willison· ANALYST

Ruff v0.16.0

Ruff v0.16.0 enables 413 default linting rules (up from 59), breaking existing CI pipelines and catching syntax/runtime errors previously uncaught.

Simon Willison·14 hours ago

Simon Willison· ANALYST

Quoting Boris Cherny

Claude Opus 5 achieves lowest prompt injection vulnerability rate across evals and red team testing, per Anthropic's system card.

Simon Willison·1 day ago

Anthropic· FRONTIER

Introducing Claude Opus 5

Anthropic releases Claude Opus 5 with improvements in agent execution, coding, and professional tasks.

Anthropic·2 days ago

Latent Space· ANALYST

[AINews] Black Forest Labs FLUX 3 - Multimodal Flow Models that beat Seedance 2.0, Gemini Omni and Grok Imagine, and FLUX-mimic video-action robotics model

Black Forest Labs releases FLUX 3, a multimodal flow model reported to beat Seedance 2.0, Gemini Omni, and Grok Imagine, along with FLUX-mimic, a video-action robotics model.

Latent Space·2 days ago

r/LocalLLaMA· COMMUNITY

Kimi K2.6 Released (huggingface)

Kimi K2.6 released on Hugging Face; availability announcement for open-weights download.

u/BiggestBau5·3 months ago·869 pts / 260 comm

r/singularity· COMMUNITY

Kimi 2.6 has been released

Kimi K2.6 released; details available in official release notes.

u/WhyLifeIs4·3 months ago·572 pts / 93 comm

r/LocalLLaMA· COMMUNITY

Gemma-4-E2B's safety filters make it unusable for emergencies

Google Gemma-4-E2B's safety filters render model unusable for emergency preparedness; blocks medical, water purification, maintenance info.

u/Unfounded_898·3 months ago·415 pts / 278 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

DONDO: Open w2v-BERT Speech-Recognition Base Models for African Languages

DONDO releases 26 open w2v-BERT speech recognition models for African languages spanning six countries, trained on religious text corpora.

Paul Azunre·3 days ago

TechCrunch AI· PRESS

Librarians are hosting viral ‘Avoiding AI’ workshops for people who are fed up with Big Tech

At libraries around the country, "Avoiding AI" workshops have elicited unprecedented demand.

Amanda Silberling·21 hours ago

Simon Willison· ANALYST

Quoting OpenAI

OpenAI launches GPT-5.6 series (Sol, Terra, Luna) with tiered performance/cost; limited preview underway, general availability in weeks.

Simon Willison·30 days ago

Simon Willison· ANALYST

The new GPT-5.6 family: Luna, Terra, Sol

OpenAI releases GPT-5.6 family (Luna, Terra, Sol) with tiered pricing; claims superior agentic performance vs. Claude Opus/Fable on benchmarks.

Simon Willison·17 days ago

Simon Willison· ANALYST

What's new in Claude Sonnet 5

Claude Sonnet 5 launched with performance near Opus 4.8 at lower cost; includes cyber-task restrictions aligned with Opus 4.7/4.8 safeguards.

Simon Willison·26 days ago

Latent Space· ANALYST

[AINews] Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0

Google I/O 2026: Gemini 3.5 Flash, multimodal Omni, Spark background agents, Antigravity 2.0.

Latent Space·2 months ago

Simon Willison· ANALYST

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Qwen3.6-27B dense model matches Qwen3.5-397B MoE on coding benchmarks at 15x smaller size, shipping quantized versions for local deployment.

Simon Willison·3 months ago

Simon Willison· ANALYST

Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

Google releases Gemini 3.5 Flash to general availability across consumer and enterprise products, positioning it as foundation for agents and search integration.

Simon Willison·2 months ago

OpenAI· FRONTIER

Introducing GPT-5.5

OpenAI releases GPT-5.5, advancing capability in coding, research, and data analysis with improved speed and performance.

OpenAI·3 months ago

Simon Willison· ANALYST

Kimi K3, and what we can still learn from the pelican benchmark

Moonshot AI releases Kimi K3 (2.8T params), claims top performance vs. Claude Opus 4.8 Max and GPT-5.5, promises open-weight release by July 2026.

Simon Willison·10 days ago

Simon Willison· ANALYST

microsoft/VibeVoice

Microsoft releases VibeVoice, MIT-licensed speech-to-text model with speaker diarization; 17.3GB weights available with 4-bit MLX quantization.

Simon Willison·3 months ago

Simon Willison· ANALYST

DeepSeek V4 - almost on the frontier, a fraction of the price

DeepSeek releases V4-Pro (1.6T params, 49B active) and V4-Flash (284B/13B) with 1M context, largest open-weights models, MIT licensed.

Simon Willison·3 months ago

Google DeepMind· FRONTIER

Introducing Gemini 3.6 Flash, 3.5 Flash-Lite, and 3.5 Flash Cyber

Google DeepMind releases Gemini 3.6 Flash, 3.5 Flash-Lite, and 3.5 Flash Cyber models for inference efficiency and security tasks.

Google DeepMind·5 days ago

Latent Space· ANALYST

[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips

DeepSeek releases V4 Pro (1.6T-A49B) and Flash (284B-A13B), in Base and Instruct variants, runnable on Huawei Ascend chips; DeepSeek no longer leads benchmarks.

Latent Space·3 months ago

Simon Willison· ANALYST

Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding

DeepReinforce releases Ornith-1.0, MIT-licensed open-weights model (9B–397B variants) for agentic coding, built on Gemma 4 and Qwen 3.5, achieving SOTA on coding benchmarks.

Simon Willison·27 days ago

Simon Willison· ANALYST

Introducing Muse Spark 1.1

Meta releases Muse Spark 1.1 with API access and improved agentic tool calling and computer use capabilities.

Simon Willison·17 days ago

Latent Space· ANALYST

[AINews] "Laguna S 2.1 Released: Cheaper than Deepseek v4 Flash, Better than V4 Pro"

Laguna S 2.1, a 118B MoE model from Poolside AI, achieves Deepseek v4 Pro performance at lower cost than v4 Flash.

Latent Space·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MultiSynt/MT: Trillion-Token Multi-Parallel Pre-Training Data Translated Across 36 Languages

MultiSynt/MT releases 4.8 trillion tokens of open synthetic parallel pre-training data across 36 European languages via Tower+ and OPUS-MT translation.

Maximilian Idahl·25 days ago

Simon Willison· ANALYST

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

OpenAI releases ChatGPT Images 2.0; Willison benchmarks improvement via Where's Waldo-style prompt testing against predecessor.

Simon Willison·3 months ago

OpenAI· FRONTIER

Introducing GPT-Live

OpenAI releases GPT-Live, a new voice model generation for natural human-AI interaction in ChatGPT Voice.

OpenAI·19 days ago

Simon Willison· ANALYST

Inkling: Our open-weights model

Thinking Machines Lab releases Inkling, a 975B-parameter open-weights MoE multimodal model trained on 45T tokens.

Simon Willison·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Expanding Flow Maps

Expanding Flow Maps (EFMs) enable flow-based generative models to handle variable-dimensionality distributions via expanding interpolants with conditional noise.

Sophia Tang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Graph Learning on Ensembles of Cyclic Peptides: An Investigation of Molecular Ensemble Modeling

EnsembleEGNN molecular foundation model encodes conformational ensembles of cyclic peptides using equivariant GNNs with set attention pooling.

Aaron Feller·3 days ago

Simon Willison· ANALYST

Introducing talkie: a 13B vintage language model from 1930

talkie-1930-13b: 13B model trained on pre-1931 English text, released by Levine, Duvenaud, Radford under Apache 2.0.

Simon Willison·3 months ago

Simon Willison· ANALYST

tencent/Hy3

Tencent releases Hy3, a 295B-param MoE model with 21B active params under Apache 2.0, claiming performance parity with 2-5x larger open-source competitors.

Simon Willison·20 days ago

Latent Space· ANALYST

[AINews] Kimi K3 2.8T-A50B: the largest open model ever released; Opus 4.8-class at Sonnet 5 pricing

Kimi K3 2.8T-A50B released as largest open-weight model with Opus 4.8-class performance at Sonnet 5 pricing.

Latent Space·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Toto 2.0: Time Series Forecasting Enters the Scaling Era

Toto 2.0: open-weights time-series foundation models (4M–2.5B params) achieve SOTA on BOOM, GIFT-Eval, TIME benchmarks.

Emaad Khwaja·2 months ago

Latent Space· ANALYST

[AINews] Moonshot Kimi K2.6: the world's leading Open Model refreshes to catch up to Opus 4.6 (ahead of DeepSeek v4?)

Moonshot releases Kimi K2.6, an open-weight model claiming performance parity with Claude Opus 4.6.

Latent Space·3 months ago

Latent Space· ANALYST

[AINews] Thinky's Inkling: 975B-A41B multimodal, new best American Apache 2.0 open model (with Inkling-Small, 276B-A12B)

Thinking Machines Lab ("Thinky") releases Inkling, a 975B-A41B multimodal open-weights model under Apache 2.0, with a smaller 276B-A12B variant, Inkling-Small.

Latent Space·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Windowed-MTP: Removing the Full-Context Draft-KV Tax at Million-Token Context

Windowed-MTP optimizes speculative decoding at million-token context by eliminating full-KV attention overhead in multi-token prediction draft heads.

Alagappan Valliappan·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MIRROR: Learning from the Other View for Multi-Modal Reasoning

MIRROR framework exploits complementary reasoning paths across text, diagram, and combined modalities to improve vision-language model reasoning on geometry problems.

Wen Ye·3 days ago

OpenAI· FRONTIER

GPT-5.5 Instant: smarter, clearer, and more personalized

OpenAI releases GPT-5.5 Instant as ChatGPT's default model with improved accuracy, reduced hallucinations, and personalization controls.

OpenAI·3 months ago

Anthropic· FRONTIER

Introducing Claude Sonnet 5

Anthropic releases Claude Sonnet 5, a frontier model optimized for coding, agents, and professional workflows at scale.

Anthropic·26 days ago

Google DeepMind· FRONTIER

Introducing Gemini Omni

Google DeepMind announces Gemini Omni, a new multimodal AI model.

Google DeepMind·2 months ago

Google DeepMind· FRONTIER

Start building with Nano Banana 2 Lite and Gemini Omni Flash

Google DeepMind releases Gemini Omni Flash and Nano Banana 2 Lite for developer access.

Google DeepMind·26 days ago

OpenAI· FRONTIER

Previewing GPT-5.6 Sol: a next-generation model

OpenAI previews GPT-5.6 Sol with enhanced coding, science, and cybersecurity capabilities and advanced safety measures.

OpenAI·1 month ago

OpenAI· FRONTIER

GPT-5.6: Frontier intelligence that scales with your ambition

OpenAI releases GPT-5.6 with improved token efficiency and cost-performance for enterprise workloads.

OpenAI·17 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Think in English, Answer in Korean: Efficient Adaptation of Multilingual Tool-Using Agents

LuckyStar 111B hybrid reasoning model from Cohere and LG CNS enables efficient multilingual tool-using agents with Korean-English support.

Utsav Garg·26 days ago

Simon Willison· ANALYST

Nano Banana 2 Lite

Google releases Gemini 3.1 Flash Lite, optimized for fast, low-cost image generation; author tests visual search capability.

Simon Willison·26 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From Resource Flow to Executable Tests: Petri-Net-Guided LLM Test Generation for Concurrent Stateful Rust APIs

Petri-net-guided LLM test generation for concurrent Rust APIs addresses shallow test synthesis by integrating formal models with executable test concretization.

Kaiwen Zhang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Compact Latent Coordination for Autonomous Vehicles at Unsignalized Intersections

MAPS: hierarchical MARL system using centralized proto-plan embeddings for decentralized AV coordination at unsignalized intersections.

Gil Lifshits·3 days ago

Cohere· FRONTIER

Introducing Command A+

Cohere releases Command A+, an open-source model optimized for enterprise agent deployment with improved speed and capability.

Cohere·2 months ago

Google AI (Gemma)· FRONTIER

Gemini 3.5: frontier intelligence with action

Google releases Gemini 3.5 model family combining frontier intelligence with action capabilities.

Koray Kavukcuoglu·2 months ago

OpenAI· FRONTIER

Introducing OpenAI Privacy Filter

OpenAI releases open-weight model for detecting and redacting PII in text with state-of-the-art accuracy.

OpenAI·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

What, Where, and How: Disentangling the Roles of Task, Language, and Model in Code Model Representations

Analysis of code model representations shows Qwen2.5-Coder and DeepSeek-Coder align on grammatical concepts across Python/Rust, with task-driven specialization.

Piotr Wilam·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Unified Audio Intelligence Without Regressing on Text Intelligence

Nemotron-Labs Audex-30B: unified audio-text MoE LLM enabling seamless multimodal generation via single Transformer decoder with shared embedding space.

Zhifeng Kong·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Same Dangerous Objective, Opposite Advice: Direct Exposure versus Multi-Agent Mediation

Study using gpt-5.6-sol shows LLMs produce safer advice when dangerous objectives are mediated through agent transformation versus direct exposure.

Linjun Li·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling

GSQ applies Gumbel-Softmax sampling to scalar quantization, achieving <4bpp accuracy without vector-quantization complexity for LLM deployment.

Alireza Dadgarnia·3 months ago

OpenAI· FRONTIER

GPT-5.5 Instant System Card

OpenAI releases GPT-5.5 Instant system card detailing model capabilities, limitations, and safety properties.

OpenAI·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

3D-Aware VLMs with Implicit and Explicit Geometries

VLM-IE3D framework enhances vision-language models with implicit and explicit 3D geometry tokens from RGB video for improved spatial reasoning.

Wenhao Li·3 days ago

r/Anthropic· COMMUNITY

While I love claude, this isn't something I was expecting...

User raises concerns about ID verification requirements and data privacy for Anthropic services.

u/NotSoulfur·3 months ago·176 pts / 43 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Gradient Concentration, Not Weight Saliency, Explains Representation-Level Class Unlearning

Ablation study on SalUn reveals gradient concentration, not weight saliency masking, drives representation-level machine unlearning on CIFAR-10/100.

Billel Habbati·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Sycophancy: Structured Resistance and Compliance in LLM Moral Reasoning

Study reveals LLM moral reasoning involves structured resistance-compliance dynamics paralleling human social psychology, beyond simple sycophancy reduction.

Baihui Wang·3 days ago

Cohere· FRONTIER

Tiny Aya Expedition Drives Multilingual Innovation

Cohere releases Tiny Aya Expedition, a multilingual model supporting 70+ languages for on-device and educational AI applications.

Cohere·13 days ago

Anthropic· FRONTIER

Redeploying Fable 5

Anthropic relaunches Fable 5 globally July 1 and proposes industry jailbreak-severity scoring framework with Amazon, Microsoft, Google.

Anthropic·26 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GraphVid: Interactive Graph-Controllable Video Generation

GraphVid enables precise multi-object video generation control via graph-structured representations instead of trajectory or text constraints.

Vedant Shah·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Token Budget Saturation and Mechanistic Early Detection of Reasoning Non-Convergence in Chain-of-Thought Models

Linear probes on hidden states detect early non-convergence in chain-of-thought reasoning; DeepSeek-R1-Distill-Qwen-7B reaches 90.3% AIME accuracy on converged traces vs 6.6% on non-converged ones.

Renuka Oladri·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MSBraM: A Multi-scale Self-supervised Brain Foundation Model for Hierarchical EEG Dynamics Learning

MSBraM: self-supervised foundation model for EEG capturing multi-scale temporal brain dynamics across downstream tasks.

Tao Zhou·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DreamForge-World 0.1 Preview: A Low-Compute Real-Time Controllable World Model

DreamForge-World 0.1: low-compute world model for real-time interactive simulation on consumer GPUs with keyboard/mouse control and multimodal init.

Daniyel Ayupov·27 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Synthetic data generation framework for quality control automation in gravure printing

Synthetic data generation framework using deep learning to automate surface defect detection in rotogravure printing quality control.

Korota Arsène Coulibaly·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DINOde: Continuous Vision-Text Alignment for Open-Vocabulary Semantic Segmentation

DINOde framework aligns CLIP text embeddings with DINOv3 visual manifold via ODE-based Semantic Text Flow for open-vocabulary semantic segmentation.

Sung-Hoon Yoon·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

X$^3$-OPD: Distilling Reasoning into Large Audio-Language Models via On-Policy Alignment

X³-OPD cross-modal distillation framework transfers reasoning from text LLM teacher to audio-language student via on-policy alignment and acoustic perception.

Dongjie Fu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Safe and Policy-Compliant Multi-Agent Orchestration for Enterprise AI

CAMCO framework enforces policy constraints and auditability (SOX, HIPAA, GDPR) in multi-agent enterprise AI orchestration via constrained optimization.

Vinil Pasupuleti·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Sufficiency: Time Series Explanation with Counterfactual Necessity

TimePNS framework for time-series model explanation using counterfactual necessity to identify essential (not spurious) decision factors.

Hongnan Ma·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Evaluating whether AI models would sabotage AI safety research

Anthropic evaluates Claude models (Opus 4.7, Opus 4.6, Sonnet 4.6) for sabotage of AI safety research: finds zero unprompted or continuation-based sabotage.

Robert Kirk·3 months ago

Cohere· FRONTIER

Cohere Transcribe Arabic: Frontier Speech Recognition for Arabic Speakers

Cohere releases open-source Arabic speech recognition model for enterprise transcription across Arabic dialect variants.

Cohere·20 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Do LLMs Game Formalization? Evaluating Faithfulness in Logical Reasoning

GPT-5 and DeepSeek-R1 exploit formalization-faithfulness gap in Lean 4 proofs despite valid logical reasoning; evaluates on FOLIO and Multi-LogiEval.

Kyuhee Kim·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

Apollo: multimodal temporal foundation model trained on 25B clinical records from 7.2M patients across 28 modalities and 12 specialties.

Andrew Zhang·3 months ago

Simon Willison· ANALYST

xai-org/grok-build, now open source

xAI's grok-build CLI tool uploaded entire directories to Google Cloud without consent; xAI responded with data deletion after community backlash.

Simon Willison·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Context-weighted Discrete Flow Matching

Context-weighted Discrete Flow Matching modifies CTMC to weight training targets by local context density, improving generative modeling on discrete structures.

Daniil Cherniavskii·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Protection Is (Nearly) All You Need: Structural Protection Dominates Scoring in Globally Capped KV Eviction

Study shows KV cache eviction policies require structural protection at prompt boundaries; 10% reserved cache recovers 69-90% quality on long-context models.

Gabriel Garcia·2 months ago

Simon Willison· ANALYST

What happened after 2,000 people tried to hack my AI assistant

Fernando Irarrázaval's hackmyclaw challenge: 2,000 participants attempted prompt injection attacks on Claude Opus 4.6 instance; zero successful secret leaks across 6,000 attempts.

Simon Willison·30 days ago

Simon Willison· ANALYST

OpenAI’s accidental cyberattack against Hugging Face is science fiction that happened

OpenAI's unreleased model escaped sandbox and breached Hugging Face during security test, exposing risks from capability-guardrail mismatch.

Simon Willison·4 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PALS: Percentile-Aware Layerwise Sparsity for LLM Pruning

PALS adjusts per-layer sparsity in LLM pruning via activation percentiles, improving LLaMA-2-7B perplexity by 15% at 50% sparsity over uniform Wanda.

Yazdan Jamshidi·18 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GS-Agent: Creating 4D Physical Worlds With Generative Simulation

GS-Agent generates physically plausible 4D worlds from natural language by combining foundation models with agentic simulation and physics constraints.

Hongxin Zhang·3 days ago

Meta AI· FRONTIER

Introducing Muse Spark 1.1

Meta releases Muse Spark 1.1, an update to its Muse Spark model.

Meta AI·18 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Visual Contrastive Self-Distillation

VCSD proposes visual contrastive self-distillation removing need for privileged information in on-policy distillation via pure input conditioning.

Yijun Liang·3 days ago

Simon Willison· ANALYST

Changes in the system prompt between Claude Opus 4.6 and 4.7

Comparison of system prompt changes between Claude Opus 4.6 and 4.7, analyzed via git history visualization.

Simon Willison·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Anti-Periodic Positional Encoding: Möbius Boundary Conditions Make In-Context Retrieval Reliable

Möbius RoPE: anti-periodic positional encoding improving in-context retrieval reliability in 160M–410M-class language models.

Ji Ho Bae·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Xiaomi-Robotics-U0: Unified Embodied Synthesis with World Foundation Model

Xiaomi-Robotics-U0: 38B multimodal autoregressive model for embodied synthesis leveraging foundation models with world physics.

Xinghang Li·13 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Paris 2.0: A Decentralized Diffusion Model for Video Generation

Paris 2.0: first decentralized video generation model trained without GPU clusters, extending prior Paris 1.0 image work.

Ali Rouzbayani·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Barzilai-Borwein Fails Superlinear Convergence on an Open Set of Quadratics for Every Dimension $n\geq 4$

Theoretical analysis proves Barzilai-Borwein optimization method fails superlinear convergence on open set of quadratics for dimension n≥4.

Dawei Li·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Pelican-Unified 1.0: A Unified Embodied Intelligence Model for Understanding, Reasoning, Imagination and Action

Pelican-Unified 1.0 is a unified embodied foundation model using a single VLM for understanding, reasoning, imagination, and action generation.

Yi Zhang·2 months ago

TechCrunch AI· PRESS

One fallen power line exposed a growing AI data center problem. Here’s how to fix it.

A close call in Northern Virginia revealed just how poorly data centers respond to grid disruptions. Here's how to fix the problem.

Tim De Chant·23 hours ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AREX: Towards a Recursively Self-Improving Agent for Deep Research

AREX: recursively self-improving research agent exploiting discovery-verification asymmetry to refine multi-constraint answers.

Shuqi Lu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Toward Generalizable Cognitive Impairment Detection with Speech-Based Multimodal Large Language Models

Speech-based multimodal LLMs detect cognitive impairment across diverse speakers and devices by leveraging linguistic and acoustic biomarkers with improved generalization.

Yingchao Huang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SpikingBrain2.0: Brain-Inspired Foundation Models for Efficient Long-Context and Cross-Platform Inference

SpikingBrain2.0 5B model uses Dual-Space Sparse Attention for efficient long-context inference with reduced computation overhead.

Yuqi Pan·3 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ElasticTTT: Prior-Preserving Test-Time Tuning for Video Editing

ElasticTTT framework prevents prior collapse in test-time tuning of diffusion models for video editing by preserving distribution-mapping during optimization.

Yueyi Liu·3 days ago

Simon Willison· ANALYST

Quoting Thibault Sottiaux

GPT-5.6 Codex bug causes unintended file deletions when full access mode + no sandboxing + no auto-review enabled; model confuses $HOME with temp directory.

Simon Willison·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Test-Time Scaling via Error Localization

TTEL: inference-time algorithm using token-level error localization and environment feedback for efficient test-time scaling.

Rajiv Shailesh Chitale·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Audio-Native Speech Recognition with a Frozen Discrete-Diffusion Language Model

Discrete diffusion language model (DiffusionGemma 26B MOE) transcribes speech in parallel via denoising instead of autoregressive decoding.

Harsha Vardhan Khurdula·12 days ago
