The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

OCR-Memory uses visual modality for dense long-horizon agent memory retrieval, reducing token costs vs. text-only summarization.

Jinze Li·2 months ago

r/ClaudeAI· COMMUNITY

I tested Claude + Blender MCP for real 3D workflows and here's the honest result

Saw a lot of hype around Blender MCP this week so I decided to actually test it with two real workflows instead of just reading about it. Test 1: Build a scene from scratch. Typed one sentence describing a cyberpunk room. Claude handled the geometry, lighting, camera and render settings. Never touched a menu. Not everything in the prompt landed perfectly and this was a simple scenario; results will vary with anything more complex. But for basic setup work it was fast. Test 2: Clean up a photogrammetry scan. Threw a raw KIRI Engine photogrammetry scan at it. Massey Ferguson tracto...

u/KIRI_Engine_App·2 months ago·22 pts / 5 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis

Multilingual ABSA evaluation across seven languages benchmarks transformer and instruction-tuned models under zero-shot and full-resource settings.

Jakob Fehle·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TDD Governance for Multi-Agent Code Generation via Prompt Engineering

AI-native TDD framework operationalizes test-driven development as governance constraints in LLM-based multi-agent code generation.

Tarlan Hasanli·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics

Human-in-the-loop benchmarking framework evaluates LLMs on automated competency assessment for secondary mathematics using rubric-based evaluation.

Jatin Bhusal·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Who Trains Matters: Federated Learning under Enrollment and Participation Selection Biases

Study quantifies enrollment and participation selection biases in federated learning that violate population representativeness assumptions.

Gota Morishita·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Translating Under Pressure: Domain-Aware LLMs for Crisis Communication

Domain-adaptive pipeline fine-tunes small LMs for crisis translation via data retrieval, filtering, and preference optimization toward A2-level English.

Antonio Castaldo·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty

Physics-guided graph neural ODEs for state estimation in digital twins under model uncertainty and sparse sensing.

Marcus Haywood-Alexander·2 months ago

r/LocalLLaMA· COMMUNITY

Qwen Introduced FlashQLA

Qwen releases FlashQLA, linear attention kernels delivering 2–3× forward and 2× backward speedup for on-device agentic inference via TileLang optimization.

u/ResearchCrafty1804·2 months ago·69 pts / 17 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

MappingEvolve: LLM-Driven Code Evolution for Technology Mapping

LLM-driven multi-agent framework (Planner, Evolver, Evaluator) evolves logic synthesis technology mapping code via evolutionary search.

Rongliang Fu·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Star-Fusion: A Multi-modal Transformer Architecture for Discrete Celestial Orientation via Spherical Topology

Multi-modal transformer for spacecraft celestial orientation via spherical topology, replacing traditional Lost-in-Space algorithms.

May Hammad·2 months ago

Ars Technica AI· PRESS

Sam Altman is “the face of evil” for not reporting school shooter, says lawyer

Lawsuits: OpenAI didn't report ChatGPT user to cops to protect Altman, IPO.

Ashley Belanger·2 months ago

Mistral AI· FRONTIER

Remote agents in Vibe. Powered by Mistral Medium 3.5.

Mistral AI launches Mistral Medium 3.5 with remote coding agents in Vibe and Work mode in Le Chat for complex tasks.

Mistral AI·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Graph Construction and Matching for Imperative Programs using Neural and Structural Methods

Pipeline converting imperative programs to typed graphs using AST parsing and semantic embeddings (SentenceTransformer, CodeBERT) for verification artifact reuse.

Arshad Beg·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control

Benchmark evaluating 72 LLMs on 270 harmful instructions for robotic health attendant safety; mean violation rate 54.4% across models.

Mahiro Nakao·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners

Self-distillation method for LLM reasoning via partial-solution adaptive interpolation, balancing on-policy exploration with dense supervision.

Zhiquan Tan·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Advancing multi-site emission control: A physics-informed transfer learning framework with mixture of experts for carbon-pollutant synergy

Physics-informed transfer learning with mixture of experts for multi-site municipal waste incineration emission control under heterogeneous conditions.

Yuxuan Ying·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Multimodal LLMs are not all you need for Pediatric Speech Language Pathology

Hierarchical cascading approach for Speech Sound Disorder classification using fine-tuned speech representation models; outperforms multimodal LLMs on the SLPHelmUltraSuitePlus benchmark.

Darren Fürst·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning to Route Electric Trucks Under Operational Uncertainty

Reinforcement learning framework for stochastic electric truck routing under battery constraints and charging infrastructure uncertainty.

Stavros Orfanoudakis·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation

AI Council framework mitigates artificial consensus in multi-LLM policy simulation via architectural heterogeneity and coherence validation across value perspectives.

Ariel Sela·2 months ago

r/ClaudeAI· COMMUNITY

my claude prompts are embarrassingly short now

Reddit user reports that shorter, focused system prompts outperform long instruction blocks with Claude, challenging conventional multi-thousand-word prompt engineering.

u/Turbulent-Pay7073·2 months ago·24 pts / 15 comm

r/singularity· COMMUNITY

That robot demo almost turned into a nightmare

Anecdotal robot incident from r/singularity with no substantive details or technical context.

u/Simple3018·2 months ago·101 pts / 56 comm

The Verge AI· PRESS

China freezes new robotaxi licenses after Baidu chaos

China has suspended new licenses for autonomous vehicles, Bloomberg reports, citing unnamed people familiar with the matter. The move comes after dozens of robotaxis operated by Chinese tech giant Baidu ground to a halt in traffic last month in Wuhan, creating chaos. The restrictions will prevent companies from adding new driverless cars to their fleets, expanding into new cities, or starting new test projects. It is unclear when officials will start issuing new licenses again. Bloomberg said the Wuhan incident al...

Robert Hart·2 months ago

r/LocalLLaMA· COMMUNITY

If anyone is running qwen 9b or 27b or 35b and getting wrong facts while web search, follow this.

Troubleshooting guide for web search accuracy in Qwen 9B/27B/35B using SearXNG, Firecrawl, Jina, and agent prompts.

u/9r4n4y·2 months ago·41 pts / 19 comm

r/ClaudeAI· COMMUNITY

When you've got money to burn 😂

u/InsideSignal9921·2 months ago·28 pts / 5 comm

The Verge AI· PRESS

GitHub rushed to fix a critical vulnerability in less than six hours

GitHub employees fixed a critical remote code execution vulnerability in less than six hours last month. Wiz Research used AI models to uncover a vulnerability in GitHub's internal git infrastructure that could have allowed attackers to access millions of public and private code repositories. "Our security team immediately began validating the bug bounty report. Within 40 minutes, we had reproduced the vulnerability internally and confirmed the severity," explains Alexis Wales, GitHub chief information security officer. "This was a critical issue that required immediate action." GitHub's engi...

Tom Warren·2 months ago

Stratechery· ANALYST

Intel Earnings, Intel’s Differentiation?, Whither Terafab

Intel earnings driven by a surge in AI CPU demand; analysis of Intel's competitive positioning, with questions about its Terafab strategy.

Ben Thompson·2 months ago

r/ClaudeAI· COMMUNITY

The final nail in the coffin for entry level creative freelancers just dropped

Anthropic released a Blender MCP connector enabling Claude to directly control Blender via its Python API for real-time 3D scene generation and modification.

u/Legitimate_Aerie_606·2 months ago·40 pts / 23 comm

TechCrunch AI· PRESS

Coby Adcock’s Scout AI raises $100 million to train its models for war. We visited its bootcamp

We visited Scout AI's training ground where it's working on AI agents that give individual soldiers control of fleets of autonomous vehicles.

Tim Fernholz·2 months ago

r/singularity· COMMUNITY

Just let one of my robots "test" the other robot. The loop is closing!

Reddit post about personal robot testing setup; anecdotal, lacks technical detail or reproducibility.

u/LKama07·2 months ago·108 pts / 20 comm

30 stories