The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


Adversarial Vulnerability Under Temporal Concept Drift: A Longitudinal Study of Android Malware Detection

Decade-long study of Android malware detector adversarial robustness under temporal concept drift across deployment scenarios.

Ahmed Sabbah·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Benchmarking Google Embeddings 2 against Open-Source Models for Multilingual Dense Retrieval and RAG Systems

Google Embeddings 2 outperforms five open-source dense retrieval models on BEIR and RAG benchmarks but faces latency tradeoff.

Stefano Cirillo·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EM-Vid: Training-Free Entity-Centric Memory for Efficient and Consistent Multi-Shot Video Generation

Entity-centric latent patch memory for multi-shot video generation maintains character consistency without full-frame overhead.

Jente Vandersanden·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling

DiLaDiff uses latent diffusion and consistency model distillation to improve token correlation in masked diffusion language models.

Jean-Marie Lemercier·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Preisach Attention: A Hysteretic Model of Sequential Memory

Preisach Attention Layer applies hysteresis operator as O(1)-depth attention replacement, achieving Turing-completeness in single-layer Transformers.

Piotr Frydrych·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Structure-Guided Entity Resolution: Fine-Tuning LLMs for Robust Name Matching in Complex Linguistic Contexts

Fine-tuning LLMs for entity resolution in KYC via structure-guided matching across naming conventions and scripts.

Shivam Chourasia·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Cost-Effective Model Evaluation with Meta-Learning

MetaEvaluator: meta-learning framework for label-free, cost-effective evaluation of unseen models across architectures.

Trinh Pham·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Solving the Aircraft Disassembly Scheduling Problem

Scheduling algorithms for aircraft disassembly optimization with task precedence and certification constraints.

Charles Thomas·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Asymmetric Scaling Laws from Sparse Features

Scaling laws for sparse-activation neural networks show double-descent and asymmetric loss dynamics.

John Sous·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents

Co-ReAct integrates step-level rubrics to guide ReAct agent reasoning in multi-step search tasks.

Jiazheng Kang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

How Many Training Samples Are Needed for the Inverse Kinematics Solutions by Artificial Neural Networks

Study of training sample requirements for neural network inverse kinematics in robotic manipulators.

Dong-Won Lim·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Push Your Agent: Measuring and Enforcing Quantitative Goal Persistence in Long-Horizon LLM Agents

PushBench evaluates quantitative goal persistence in long-horizon LLM agents via work-unit completion.

Yuandao Cai·1 month ago

r/ClaudeAI· COMMUNITY

Aged like fine WINE

that meme on the chatgpt subreddit is so spot on ngl. even when you have requirements locked down, managing the stack gets so weird. claude is an absolute beast at backend logic, the reasoning depth is just insane now. the real mess starts when u try to scale past a basic landing page. forcing a single chat window to track complex UI layouts on top of everything just cooks the token limit and causes massive code drift. i ended up completely separating my environment to stop fighting the bottleneck. now i just let claude handle pure data pipelines, dump states into a quick db, and let stitch tak...

u/Happy_Macaron5197·1 month ago·31 pts / 6 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval

HARNESS-LM distills large embedding models into compact SLMs for low-latency sponsored search.

Vipul Gupta·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CP or DP? Why Not Both: A Case Study in the Partial Shop Scheduling Problem

Hybrid DP-CP approach for partial shop scheduling combines dynamic programming with constraint propagation.

Emma Legrand·1 month ago

r/ClaudeAI· COMMUNITY

After comparing Claude Max $100 and ChatGPT Pro $100 side by side on actual billable work, I'm cancelling my ChatGPT Pro subscription

User reports Claude Max outperforms ChatGPT Pro on accounting/taxation/legal tasks despite higher token costs; anecdotal quality comparison in Indian context.

u/MrNariyoshiMiyagi·1 month ago·33 pts / 24 comm

r/LocalLLaMA· COMMUNITY

[NEW] Supra-50M Released!

SupraLabs released Supra-50M, a 50M-parameter Llama-style language model trained on 20B educational tokens with competitive benchmark performance.

u/Dangerous_Try3619·1 month ago·44 pts / 18 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Understanding Goal Generalisation in Sequential Reinforcement Learning

Analysis of 100+ sequential RL training pipelines shows salient feature-driven generalization and goal persistence.

Jason Ross Brown·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MARS: Magnitude-Aware Rank Statistics

MARS improves model evaluation by weighting ranks with performance margins instead of discarding magnitude differences in Critical Difference diagrams.

Muhammad Rajabinasab·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ARMS: Automatic Reward Shaping for Sparse-Reward Multi-Agent Reinforcement Learning

ARMS auto-generates dense reward shaping signals for multi-agent RL via trajectory ranking without task-specific retraining.

Elie Abboud·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PathNavigate: A Training-Free Pathology Agent with Surprise-Guided Scan and Shared Slide Memory for Whole-Slide Image VQA

PathNavigate applies training-free agentic VQA to whole-slide pathology images using surprise-guided navigation and memory caching.

Chunze Yang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Is Dimensionality a Barrier for Retrieval Models?

Theoretical study of why low-dimensional embeddings scale to trillions of retrieval targets via maximal-margin analysis.

Kiril Bangachev·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Goal-Conditioned Agents that Learn Everything All at Once

Goal-conditioned RL agent leverages parallel all-goals learning by jointly predicting values and actions for every goal simultaneously.

Michael Matthews·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

RA-DCA: A Randomized Active-Set DCA for Directional Stationarity in Max-Structured DC Programs

RA-DCA proposes randomized active-set optimization for nonsmooth max-structured DC programs with directional stationarity guarantees.

Yi-Shuai Niu·1 month ago

r/OpenAI· COMMUNITY

AI-generated stories secretly won 3 of 5 fiction awards

u/EchoOfOppenheimer·1 month ago·50 pts / 35 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

When One Point Is Not Enough: Addressing Ambiguous Instances in Dimensionality Reduction by Splitting

Method addresses visual artifacts in dimensionality reduction by detecting and splitting ambiguous instances across multiple neighborhoods.

Diede P. M. van der Hoorn·1 month ago

r/Anthropic· COMMUNITY

MCP is quietly becoming Anthropic's most underrated contribution to AI

Most everyone focuses on Claude, Constitutional AI, and safety research. However, I believe that the most practical impact from anything Anthropic has released to date may have been MCP. Because MCP is open-source and model-agnostic, developers who are not using Claude can use it as well. Both OpenAI and Google are using MCP. As such, MCP is becoming the de facto industry standard for connecting tools within artificial intelligence. I also find MCP shifts the bottleneck. Historically, getting an LLM to become smarter was the difficul...

u/kneekey-chunkyy·1 month ago·12 pts / 9 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Precise: SDE-Consistent Stochastic Sampling for RL Post-Training of Flow-Matching Models

Precise applies RL post-training to flow-matching models by designing SDE-consistent stochastic samplers that respect reverse ODE dynamics.

Jade Zou·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning partially observed systems with neural Hamiltonian ordinary differential equations

NHODE combines Hamiltonian neural networks with neural ODEs to learn partially observed dynamical systems without full state access.

Sunniva Meltzer·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DrawVideo: Generating Long Video from Storyboard Keyframe Sketches

DrawVideo generates long-form video from storyboard sketches, decomposing sequences into independently controllable shots via sketch/appearance/motion prompts.

Chuanzhi Xu·1 month ago

30 stories
