The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Data Presentation Over Architecture: Resampling Strategies for Credit Risk Prediction with Tabular Foundation Models

Benchmark of tabular foundation models on credit default prediction shows context strategy matters more than model choice.

Aditya Tanna·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Position: Weight Space Should Be a First-Class Generative AI Modality

Position paper proposes treating neural network checkpoints as a first-class generative modality for on-demand weight synthesis.

Zhangyang Wang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SCICONVBENCH: Benchmarking LLMs on Multi-Turn Clarification for Task Formulation in Computational Science

SCICONVBENCH benchmarks LLMs on multi-turn clarification dialogues for ill-posed scientific task formulation.

Nithin Somasekharan·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Aligned Training: A Parameter-Free Method to Improve Feature Quality and Stability of Sparse Autoencoders (SAE)

Aligned training, a parameter-free SAE reparameterization, eliminates dead features and improves interpretability stability.

Michał Brzozowski·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning Lifted Action Models from Traces with Minimal Information About Actions and States

Method for learning lifted STRIPS+ action models from traces with minimal state/action information assumptions.

Jonas Gösgens·1 month ago

r/ClaudeAI· COMMUNITY

What is happening

Reddit thread with no content; insufficient information to assess technical or business significance.

u/Strategy-Savings·1 month ago·20 pts / 19 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection

CVAE framework enables targeted evasion of ML-based malware detectors via API import injection without retraining.

Juozas Dautartas·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

An Approximation Algorithm for Graph Label Selection

First Õ(log^1.5 n)-approximation algorithm for graph label selection under budget constraint.

Josia John·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark

CrossView Suite adds dataset, model, and benchmark for cross-view spatial reasoning in MLLMs with object-level consistency.

Wei Wang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Stochastic Penalty-Barrier Methods for Constrained Machine Learning

SPBM extends penalty-barrier methods to non-convex, non-smooth stochastic optimization for constrained deep learning.

Adam Bosák·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ManiSoft: Towards Vision-Language Manipulation for Soft Continuum Robotics

ManiSoft benchmark for vision-language manipulation of soft robotic arms with elastic dynamics simulation and contact-rich task evaluation.

Ziyu Wei·1 month ago

r/LocalLLaMA· COMMUNITY

Qwen cant wait to release 3.7 models

Community speculation that Qwen is preparing to release its 3.7 models; unconfirmed social media discussion.

u/GotHereLateNameTaken·1 month ago·141 pts / 37 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

SAME: A Semantically-Aligned Music Autoencoder

SAME autoencoder achieves 4096× music compression via transformer backbone and semantic regularization for audio generation.

Julian D. Parker·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CATA: Continual Machine Unlearning via Conflict-Averse Task Arithmetic

CATA framework for continual machine unlearning in vision-language models via conflict-averse task arithmetic.

Shen Lin·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Perfect Parallelization in Mini-Batch SGD with Classical Momentum Acceleration

Theoretical analysis of classical momentum acceleration in mini-batch SGD for large-scale model training.

Sachin Garg·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Forecasting Downstream Performance of LLMs With Proxy Metrics

Proxy metrics from token-level statistics predict downstream LLM capabilities faster than cross-entropy loss.

Arkil Patel·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Physics-Aligned Canonical Equivariant Fourier Neural Operator under Symmetry-Induced Shifts

PACE-FNO uses Lie-algebra symmetries to improve neural operator generalization on PDE solution maps.

Jiaxiao Xu·1 month ago

r/Anthropic· COMMUNITY

Fraudulent Anthropic Charges

Anyone else also dealing with random $5.44 charges from Claude???? I never paid for Claude or had my card hacked before. I canceled my card but really strange

u/nanamymelody·1 month ago·11 pts / 3 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Pointwise Generalization in Deep Neural Networks

Pointwise Riemannian Dimension framework characterizes deep network generalization via learned feature representation geometry.

Shaojie Li·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Latent Action Reparameterization for Efficient Agent Inference

LAR compresses LLM agent action spaces into latent multi-step behaviors to reduce inference cost and decision horizon.

Wenhao Huang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Not What You Asked For: Typographic Attacks in Household Robot Manipulation

Typographic attacks via printed text override CLIP-based perception in simulated household robot manipulation pipelines.

Ali Iranmanesh·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning

AMARIS accumulates evaluation diagnostics across RL steps to adaptively improve rubric-based reward shaping for LLM fine-tuning.

Peilin Wu·1 month ago

MIT Tech Review· PRESS

Inside Anduril and Meta’s quest to make smart glasses for warfare

The defense-tech company Anduril has shared new details about the augmented-reality headset for the military it’s prototyping with Meta, including a vision for ordering drone strikes via eye-tracking and voice commands. Quay Barnett, who leads the efforts as a vice president at Anduril following a career in the Army’s Special Operations Command, says his fundamental…

James O'Donnell·1 month ago

Hugging Face· INFRA

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Hugging Face·1 month ago

The Verge AI· PRESS

Amazon Alexa Plus can now create AI-generated podcasts

Alexa Plus, Amazon's upgraded AI assistant, can now generate podcasts on "virtually any topic," according to an announcement on Monday. With the update, Amazon says you can give Alexa Plus a topic, and the AI assistant will offer an overview of what its AI hosts plan to talk about, allowing you to steer the conversation and adjust its length before it starts generating the episode. Some "Alexa Podcast" examples shared by Amazon have two AI-generated hosts talking about the history of the Roman Empire, new music, and expectations for the World Cup. Amazon says you can also ask Alexa Plus to ge...

Emma Roth·1 month ago

r/LocalLLaMA· COMMUNITY

Qwen 35b a3b surprises me

Just wanted to share that I'm pretty happy about Qwen 35b a3b agentic coding performance. I'm running the model in q80 quant, kv cache both q8_0 as well, with 262144 in 4090 + 5060 ti, via llama.cpp backend with claude code pointing to localhost. For demo/data analytics purposes, it works pretty well. I haven't used it for large codebases, but it definitely is better than gemma4 26b in my use case. One thing that surprises me is that it seems to get better outcome in agentic coding, than chat. When using it with just chat UI, i found the code qwen35b provide a bit too clunky. I wonder o...

u/siegevjorn·1 month ago·40 pts / 33 comm

r/ClaudeAI· COMMUNITY

11 Claude things I wish someone had told me 12 months ago

Experienced Claude user shares 11 practical tips on Projects, Custom Styles, and Claude Code features after 18 months of daily use.

u/No-Yogurtcloset4086·1 month ago·68 pts / 13 comm

Hugging Face· INFRA

PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

Hugging Face·1 month ago

r/ClaudeAI· COMMUNITY

Claude Code helped me bring my dead passion project back to life

Creator of HeroMachine character generator used Claude Code to complete a long-stalled project over a weekend.

u/AFDStudios·1 month ago·26 pts / 11 comm

r/LocalLLaMA· COMMUNITY

Qwen 3.7 droped on Qwen Chat

Qwen 3.7 released on Qwen Chat platform, continuing open-weights model availability from Alibaba.

u/Foxiya·1 month ago·88 pts / 37 comm

30 stories