Source · Academic

arXiv (cs.AI/CL/LG)

Last updated Jul 26, 2026, 12:30 PM

3D-Aware VLMs with Implicit and Explicit Geometries

VLM-IE3D framework enhances vision-language models with implicit and explicit 3D geometry tokens from RGB video for improved spatial reasoning.

Wenhao Li·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Expanding Flow Maps

Expanding Flow Maps (EFMs) enable flow-based generative models to handle variable-dimensionality distributions via expanding interpolants with conditional noise.

Sophia Tang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GraphVid: Interactive Graph-Controllable Video Generation

GraphVid enables precise multi-object video generation control via graph-structured representations instead of trajectory or text constraints.

Vedant Shah·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Barzilai-Borwein Fails Superlinear Convergence on an Open Set of Quadratics for Every Dimension $n\geq 4$

Theoretical analysis proves that the Barzilai-Borwein method fails to converge superlinearly on an open set of quadratics in every dimension n≥4.

Dawei Li·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Synthetic data generation framework for quality control automation in gravure printing

Synthetic data generation framework using deep learning to automate surface defect detection in rotogravure printing quality control.

Korota Arsène Coulibaly·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Surprisal Theory is Tautological (without Rational Grounding)

Philosophical critique: surprisal theory's linguistic difficulty predictions are tautological without constraints on language model specification.

Ryan Cotterell·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Sufficiency: Time Series Explanation with Counterfactual Necessity

TimePNS framework for time-series model explanation using counterfactual necessity to identify essential (not spurious) decision factors.

Hongnan Ma·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MedGame: Storytelling Gamification Empowered by Large Language Models for Medical Education

MedGame transforms static clinical cases into interactive decision-driven learning games using LLMs and dual narrative/director engines.

Qian Wu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Graph Learning on Ensembles of Cyclic Peptides: An Investigation of Molecular Ensemble Modeling

EnsembleEGNN molecular foundation model encodes conformational ensembles of cyclic peptides using equivariant GNNs with set attention pooling.

Aaron Feller·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Unsupervised Consensus-Based Anomaly Detection for Spatiotemporal Malaria Incidence in Ghana

Consensus anomaly detection applied to Ghana malaria surveillance data identifies spatiotemporal hotspots in the Ashanti and Northern regions over 2014-2023.

T. Ansah-Narh·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Sycophancy: Structured Resistance and Compliance in LLM Moral Reasoning

Study reveals LLM moral reasoning involves structured resistance-compliance dynamics paralleling human social psychology, beyond simple sycophancy reduction.

Baihui Wang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

OpenForgeRL: Train Harness-native Agents in Any Environment

OpenForgeRL enables end-to-end training of harness-native agents on open infrastructure, addressing a limitation of complex inference harnesses such as Claude Code.

Xiao Yu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Visual Contrastive Self-Distillation

VCSD is a visual contrastive self-distillation method that removes the need for privileged information in on-policy distillation by relying purely on input conditioning.

Yijun Liang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MIRROR: Learning from the Other View for Multi-Modal Reasoning

MIRROR framework exploits complementary reasoning paths across text, diagram, and combined modalities to improve vision-language model reasoning on geometry problems.

Wen Ye·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

X$^3$-OPD: Distilling Reasoning into Large Audio-Language Models via On-Policy Alignment

X³-OPD cross-modal distillation framework transfers reasoning from text LLM teacher to audio-language student via on-policy alignment and acoustic perception.

Dongjie Fu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Neural solutions of coupled ghost and gluon Dyson–Schwinger equations in Landau gauge

Neural networks solve the coupled Dyson-Schwinger equations of Yang-Mills gauge theory, agreeing with fixed-point solutions at the percent level.

Rodrigo Carmo Terin·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Boundaries of Automation: A Theory of Persistent Human Participation

Theory paper argues human participation persists in automated systems for technical, complementarity, and normative reasons beyond current AI capability limits.

Fares Fourati·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Zero-Flow Two-Sample Tests

Zero-Flow Two-Sample Test uses learned directional misalignment patterns for distribution testing, separating witness learning from hypothesis evaluation.

Yakun Wang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DONDO: Open w2v-BERT Speech-Recognition Base Models for African Languages

DONDO releases 26 open w2v-BERT speech recognition models for African languages spanning six countries, trained on religious text corpora.

Paul Azunre·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Windowed-MTP: Removing the Full-Context Draft-KV Tax at Million-Token Context

Windowed-MTP optimizes speculative decoding at million-token context by eliminating full-KV attention overhead in multi-token prediction draft heads.

Alagappan Valliappan·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

From Resource Flow to Executable Tests: Petri-Net-Guided LLM Test Generation for Concurrent Stateful Rust APIs

Petri-net-guided LLM test generation for concurrent Rust APIs addresses shallow test synthesis by integrating formal models with executable test concretization.

Kaiwen Zhang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ElasticTTT: Prior-Preserving Test-Time Tuning for Video Editing

ElasticTTT framework prevents prior collapse in test-time tuning of diffusion models for video editing by preserving distribution-mapping during optimization.

Yueyi Liu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

GS-Agent: Creating 4D Physical Worlds With Generative Simulation

GS-Agent generates physically plausible 4D worlds from natural language by combining foundation models with agentic simulation and physics constraints.

Hongxin Zhang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Same Dangerous Objective, Opposite Advice: Direct Exposure versus Multi-Agent Mediation

A study using gpt-5.6-sol shows that LLMs give safer advice when a dangerous objective reaches them through multi-agent mediation than when they are exposed to it directly.

Linjun Li·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Improved lower bounds for the Shannon capacity of odd cycles

Improved lower bounds for the Shannon capacity of odd cycles via independent-set constructions in graph powers; pure graph theory, unrelated to AI.

Nathaniel Itty·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Agentic Context Management: Solving Agent Memory and Cost by Treating Them as Lifecycle and Architecture Problems

Agentic context management frames token cost and memory bloat as lifecycle and architecture problems, not storage-retrieval, for production agent reliability.

Gaurav Dadhich·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Artificial Epanorthosis: Why large language models overuse a classical rhetorical figure, and how to mitigate it

LLMs systematically overuse epanorthosis (classical self-correction rhetoric) due to promotional training distributions and RLHF preference for emphatic phrasing.

Federico Boggia·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Toward Generalizable Cognitive Impairment Detection with Speech-Based Multimodal Large Language Models

Speech-based multimodal LLMs detect cognitive impairment across diverse speakers and devices by leveraging linguistic and acoustic biomarkers with improved generalization.

Yingchao Huang·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Toward Continuous Assurance for the Democratization of AI Agent Creation in Industry

No-code agent platforms create reliability gaps (silent degradation as models, tools, permissions, and dependencies change) that call for continuous assurance frameworks.

Natan Levy·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

What, Where, and How: Disentangling the Roles of Task, Language, and Model in Code Model Representations

Analysis of code model representations shows Qwen2.5-Coder and DeepSeek-Coder align on grammatical concepts across Python/Rust, with task-driven specialization.

Piotr Wilam·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Compact Latent Coordination for Autonomous Vehicles at Unsignalized Intersections

MAPS: hierarchical MARL system using centralized proto-plan embeddings for decentralized AV coordination at unsignalized intersections.

Gil Lifshits·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Agentic coding without the cloud: evaluating open-weight large language models on longitudinal data preparation tasks

Open-source evaluation framework for open-weight LLM agents on longitudinal data tasks, addressing privacy constraints in research deployments.

Mack Nixon·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Finite-Sample Coverage Audits for High-Recall Candidate Generation: Certification and Learning-Theoretic Design

Label complexity bounds for auditing high-recall candidate generation pipelines with finite-sample validity guarantees.

Martin Anthony·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Error Certificates for KV-Cache Eviction via Randomized Design

Randomized KV-cache eviction with error certification via Hájek correction, proving deterministic eviction hides information loss.

Peng Xie·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Thinkink: 2D Spatial Ink-native Interaction with LLMs

Thinkink: 2D spatial interface integrating handwritten/sketch prompts with LLM responses via semantic tree interpretation.

Mohammad Hasan Payandeh·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AREX: Towards a Recursively Self-Improving Agent for Deep Research

AREX: recursively self-improving research agent exploiting discovery-verification asymmetry to refine multi-constraint answers.

Shuqi Lu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Detecting LLM-Generated Tokens in Human–LLM Coauthored Text

Token-level detection method for LLM-generated content in human-AI coauthored text using score smoothing.

Yangjun Lu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Test-Time Scaling via Error Localization

TTEL: inference-time algorithm using token-level error localization and environment feedback for efficient test-time scaling.

Rajiv Shailesh Chitale·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

RUMBA: Russian User Memory Benchmark

RUMBA: Russian benchmark for long-term LLM conversational memory with fine-grained taxonomy across temporal reasoning dimensions.

Elizaveta Shevtsova·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

KroQuant: Kronecker-Structured Block Transforms for Efficient Post-Training Quantization of Diffusion Transformers

KroQuant: Kronecker-structured block transforms for W4A4 post-training quantization of diffusion transformers with efficient inference.

Yann Bouquet·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Trivia Is Not Trivial: Everyday Knowledge Failures in Multilingual LLMs

TriviaRoomQA benchmark evaluates multilingual LLMs on 3,300 culturally grounded trivia questions covering long-tail knowledge across 6 European languages.

Anna Mosolova·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Climate-resilient electric vehicle charging infrastructure for sustainable cities: An interpretable causal-ensemble framework for preventive maintenance and low-carbon mobility

FGDSE framework applies causal-ensemble methods to predict EV charging infrastructure faults under climate stress for preventive maintenance.

Cande Lian·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Agent-Guided Relational Concept Discovery: Toward Interpretable Surgical Margin Assessment

Concept-based agent-guided learning improves interpretability and generalization of deep learning models for surgical margin assessment via REIMS spectroscopy.

Nooshin Maghsoodi·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Adaptive Identity Anchoring: Closed-Loop Keyframe Placement for Synthetic Paired Supervision in Video Face Swapping

Adaptive Identity Anchoring improves video face swapping by optimizing keyframe placement for synthetic paired supervision in identity transfer.

Logan Robbins·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Token Budget Saturation and Mechanistic Early Detection of Reasoning Non-Convergence in Chain-of-Thought Models

Linear probes on hidden states detect non-convergence early in chain-of-thought reasoning; DeepSeek-R1-Distill-Qwen-7B reaches 90.3% AIME accuracy on converged traces versus 6.6% on non-converged ones.

Renuka Oladri·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Context-weighted Discrete Flow Matching

Context-weighted Discrete Flow Matching modifies CTMC to weight training targets by local context density, improving generative modeling on discrete structures.

Daniil Cherniavskii·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Semantic-Aware Task Clustering for Constructive and Cooperative Multi-Tasking

Semantic-aware task clustering for Cooperative Multi-Task Semantic Communication (CMT-SemCom) ensures constructive multi-tasking by aligning tasks post-initialization.

Ahmad Halimi Razlighi·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

An Evaluation Framework for Structured Audio Captions Validated by Controlled Perturbations

Multi-axis evaluation framework for structured audio captions on AudioCards dataset validates five orthogonal dimensions beyond flat text metrics.

Liang-Yuan Wu·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bridging the Gap Between Plausibility and Admissibility: Constraint-Aware Flow Maps for Dynamic Graph Systems

Constraint-aware flow maps apply symbolic filtering, weighting, and repair to conditional diffusion models for dynamically feasible graph trajectory generation.

Michael Romei de Socio·3 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PATS: Policy-Aware Training Scaffolding for Agentic Reinforcement Learning

PATS reframes skills as dynamic training scaffolds for LLM agent reinforcement learning, converting rollout groups to reduce failure repetition in long-horizon tasks.

Yipeng Shi·3 days ago

50 stories