The Archive

Every story we have ever aggregated.

Unlearning Offline Stochastic Multi-Armed Bandits

First formal study of machine unlearning in offline multi-armed bandits with privacy and decision-quality tradeoffs.

Zichun Ye·2 months ago

Class Angular Distortion Index for Dimensionality Reduction

New metric (Class Angular Distortion Index) for evaluating cluster arrangement in dimensionality reduction visualizations.

Kaviru Gunaratne·2 months ago

r/OpenAI· COMMUNITY

At the trial, Elon wouldn't shut up about AI killing us all, so the judge banned the topic of extinction

Reddit post claims Elon Musk discussed AI extinction risk at trial; judge reportedly restricted the topic.

u/Confident_Salt_8108·2 months ago·102 pts / 26 comm

r/ClaudeAI· COMMUNITY

Opus 4.7

Reddit discussion about Claude Opus 4.7 with no substantive details provided.

u/NerdBanger·2 months ago·37 pts / 16 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis

BlenderRAG retrieval system improves LLM-to-Blender code generation success from 40.8% to 70% via multimodal examples.

Massimo Rondelli·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

H-RAG at SemEval-2026 Task 8: Hierarchical Parent-Child Retrieval for Multi-Turn RAG Conversations

H-RAG, a hierarchical parent-child retrieval pipeline for multi-turn RAG conversations, submitted to SemEval-2026 Task 8.

Passant Elchafei·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

EGREFINE: An Execution-Grounded Optimization Framework for Text-to-SQL Schema Refinement

EGREFINE frames database schema refinement as optimization to improve Text-to-SQL accuracy while preserving query equivalence.

Jiaqian Wang·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SC-Taxo: Hierarchical Taxonomy Generation under Semantic Consistency Constraints using Large Language Models

SC-Taxo generates hierarchical scientific taxonomies using LLMs with semantic consistency constraints across hierarchy levels.

Shiqiang Cai·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

Study of embedding similarity invariance under machine translation across 28 languages using Manifesto Corpus.

Daria Boratyn·2 months ago

r/LocalLLaMA· COMMUNITY

MiMo-V2.5-Pro - the actual best open-weights model

Xiaomi's MiMo-V2.5-Pro and Kimi K2.6 dominate custom social deduction game benchmark, outperforming other open-weights models.

u/cjami·2 months ago·49 pts / 18 comm

r/LocalLLaMA· COMMUNITY

gemma-4-31B-it-DFlash has been released

gemma-4-31B-it-DFlash open-weights model released on Hugging Face, pending llama.cpp integration.

u/Total-Resort-3120·2 months ago·41 pts / 10 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Decouple before Integration: Test-time Synthesis of SFT and RLVR Task Vectors

Task vector analysis reveals structural conflicts (magnitude, sign, module-wise) preventing SFT-RLVR integration in LLMs.

Chaohao Yuan·2 months ago

r/singularity· COMMUNITY

Crazy that we’re still so early… and this is what “early” looks like

Reddit discussion expressing the sentiment that AI progress is still in its early stages; no substantive technical claims or data.

u/aginext·2 months ago·104 pts / 36 comm

r/MachineLearning· COMMUNITY

[ECCV 2026] Review Discussion [D]

ECCV reviews should be out by 2nd May. Since no exact time was specified this year, they’ll likely be released sometime within the next 48 hours. Hopefully, the reviews go well for everyone. We can use this thread to discuss them, as I haven’t seen one started yet.

u/NGK12·2 months ago·36 pts / 7 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond Decodability: Reconstructing Language Model Representations with an Encoding Probe

Encoding probe method reconstructs LLM representations using interpretable features, avoiding confounds of decoding probes.

Gaofei Shen·2 months ago

r/LocalLLaMA· COMMUNITY

Qwen3.6-27B - Closed-loop SVG Images

User demonstrates closed-loop SVG generation using Qwen3.6-27B with Agno framework and vision feedback for iterative refinement.

u/dondiegorivera·2 months ago·44 pts / 15 comm

r/LocalLLaMA· COMMUNITY

Got DFlash speculative decoding working on Qwen3.5-35B-A3B with an RTX 2080 SUPER 8GB

User demonstrates DFlash speculative decoding in llama.cpp with Qwen3.5-35B-A3B on RTX 2080 SUPER 8GB, achieving inference on VRAM-constrained hardware.

u/jwestra·2 months ago·45 pts / 11 comm

The Verge AI· PRESS

Microsoft wants lawyers to trust its new AI agent in Word documents

Microsoft is launching a new AI agent inside Word that's specifically designed for legal teams. Legal Agent handles document edits, negotiation history, and complex documents to help legal teams handle tasks like reviewing contracts. "Instead of relying on general AI models to interpret commands, the agent follows structured workflows shaped by real legal practice, managing clearly defined, repeatable tasks like reviewing contracts clause by clause against a playbook," explains Sumit Chauhan, corporate vice president of Microsoft's Office Product Group. The Legal Agent can work with existing ...

Tom Warren·2 months ago

r/LocalLLaMA· COMMUNITY

By when do you think will TurboQuant get a proper release and be adopted by everyone

Reddit discussion speculating on TurboQuant adoption timeline and asymmetric K/V quantization gains.

u/Crystalagent47·2 months ago·42 pts / 49 comm

r/OpenAI· COMMUNITY

Giving Codex access to my MacBook/macOS

Reddit discussion about granting Codex access to a local macOS environment; user seeks opinions on security and feasibility.

u/infohoundloselose·2 months ago·63 pts / 10 comm

r/OpenAI· COMMUNITY

haha our model likes to talk about goblins no of course we dont know why, we dont know why the model does anything - yes we are trying to make a superintelligent machine god, maybe it will like goblins too, we have no way of knowing what it will like, we hope it will like humans

Reddit discussion of unexplained model behavior (goblin preference) and speculative commentary on AI alignment risks.

u/EchoOfOppenheimer·2 months ago·51 pts / 16 comm

r/ClaudeAI· COMMUNITY

Curious, how many of you actually click on Thought process / Ran a command to see whats going on?

Reddit discussion about user engagement with Claude's thinking process and command execution UI elements.

u/No_Abbreviations_429·2 months ago·20 pts / 23 comm

r/singularity· COMMUNITY

AI Outperforms ER Doctors in Diagnostic Cases, Study Points to Collaborative Care

Study reports AI system outperforms emergency room physicians in diagnostic accuracy, suggests collaborative clinical deployment model.

u/PhoenixRising656·2 months ago·121 pts / 21 comm

MIT Tech Review· PRESS

A new T-Mobile network for Christians aims to block porn and gender-related content

A new US-wide cell phone network marketed to Christians is set to launch next week. It blocks porn, which experts in network security say marks the first time a US cell plan has used network-level blocking for such content that can’t be turned off even by adult account owners. It’s also rolling out a filter…

James O'Donnell·2 months ago

r/ClaudeAI· COMMUNITY

Claude is hilariously petty

u/arihantismm·2 months ago·38 pts / 5 comm

r/LocalLLaMA· COMMUNITY

I hate this group but not literally

User documents personal journey running open-weights models locally on increasingly expensive hardware (M3 Ultra to RTX Pro 6000), testing Qwen, DeepSeek, Gemma, MiniMax.

u/No_Run8812·2 months ago·41 pts / 80 comm

r/LocalLLaMA· COMMUNITY

What in tarnation is going on with the cost of compute

GPU rental costs on Vast.ai and Mithril spike above $1k/hour for H100/H200/B200, raising affordability concerns for academic and startup ML development.

u/Party-Special-5177·2 months ago·41 pts / 50 comm

r/Anthropic· COMMUNITY

Opus 4.7 ignores skills but thinks it's a lawyer - how to transfer skills to ChatGPT?

To start with, I've been using Claude for years, and it's been a roller coaster, especially with the usage policy. I'm a lawyer and I wrote a legal research skill, instructing the model exactly what to verify and where. When I asked it a tax-related question (which is also law, by the way), Opus 4.7 told me I should contact a tax expert because it's a lawyer (??) and not a tax expert. Then it answered my question anyway and basically made up even the basic stuff. Since I knew it was wrong, I asked whether it had verified this, and the model told me no, it just remembered the answer from i...

u/SeparateObligation81·2 months ago·15 pts / 9 comm

r/MachineLearning· COMMUNITY

Is it just me or is the Conference Lottery culture killing research? [D]

Reddit discussion on conference submission pressure and burnout in academic ML research culture.

u/SillyNeuron·2 months ago·34 pts / 11 comm

r/singularity· COMMUNITY

Grok 4.3 achieves higher overall intelligence over 4.20 with less of a cost, at the price of slightly higher hallucination rate.

Grok 4.3 scores higher than 4.20 on overall intelligence at lower cost, but with a slightly higher hallucination rate.

u/Profanion·2 months ago·102 pts / 41 comm

30 stories