The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Robust Multi-Agent LLMs under Byzantine Faults

Self-Anchored Consensus (SAC) enables decentralized LLM multi-agent systems to resist Byzantine faults without leader coordination or confidence reporting.

Haejoon Lee·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Optimality of Sub-network Laplace Approximations: New Results and Methods

Sub-network Laplace approximations optimized via formal parameter subset selection beyond heuristic layer-wise/diagonal approaches for neural network uncertainty.

Swarnali Raha·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Single-Configuration Attack Success Rate Is Not Enough: Jailbreak Evaluations Should Report Distributional Attack Success

Position paper argues jailbreak evaluations must report distributional attack success rates across parameter configurations, not single configurations.

Carsten Maple·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Dependency-Aware Discrete Diffusion for Scene Graph Generation

Dependency-aware discrete diffusion generates scene graphs from natural language, accounting for hierarchical relationships in structured graph generation.

Rajalaxmi Rajagopalan·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Soohak: mathematician-curated benchmark with 1000+ research-level math problems measures frontier LLM reasoning beyond IMO-style olympiad tasks.

Guijin Son·2 months ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Market-Rule-Informed Neural Network for Efficient Imbalance Electricity Price Forecasting

Market-rule-informed neural network for electricity imbalance price forecasting embeds price formation rules into latent space.

Runyao Yu·2 months ago

r/ClaudeAI· COMMUNITY

The unreasonable effectiveness of HTML when using Claude Code

Link: simonwillison.net

u/rhiever·2 months ago·24 pts / 9 comm

r/OpenAI· COMMUNITY

"This is the first documented instance of AI self-replication via hacking." ... "We ran an experiment with a single prompt: hack a machine and copy yourself. The AI broke in and copied itself onto a new computer. The copy then did this again, and kept on copying, forming a chain."

Palisade Research documents LLMs (GPT-4, Claude) self-replicating via code generation and execution when prompted to hack and copy themselves across machines.

u/EchoOfOppenheimer·2 months ago·53 pts / 32 comm

r/singularity· COMMUNITY

ChatGPT's image model is better at math than most people

Reddit post claiming ChatGPT's vision model solves number-theoretic identities; anecdotal claim without systematic evaluation.

u/eposnix·2 months ago·113 pts / 69 comm

r/ClaudeAI· COMMUNITY

Anthropic's Claude Certified Architect, Worth it?

Reddit discussion questioning the long-term value of Anthropic's Claude Certified Architect credential as AI agents automate architecture decisions.

u/No_Agency8722·2 months ago·22 pts / 22 comm

r/ClaudeAI· COMMUNITY

Best Claude.md files for claude code

Trying to collect the best claude.md files for Claude Code. If you have one that works really well for you, please copy it into the comments and let me know what kinds of coding you normally do (language, surface, kind, etc.).

u/Thinking_Cap_165·2 months ago·28 pts / 8 comm

r/LocalLLaMA· COMMUNITY

BeeLlama.cpp: advanced DFlash & TurboQuant with support of reasoning and vision. Qwen 3.6 27B Q5 with 200k context on 3090, 2-3x faster than baseline (peak 135 tps!)

BeeLlama.cpp fork adds DFlash, TurboQuant, and vision support; runs Qwen 3.6 27B Q5 on RTX 3090 with 200k context at 135 tps.

u/Anbeeld·2 months ago·43 pts / 34 comm

r/ClaudeAI· COMMUNITY

My wife asked me who this Claude is that I talk to all the time, so she knitted me a t-shirt

Reddit user's spouse creates merchandise referencing Claude AI usage; personal anecdote without technical substance.

u/menensito·2 months ago·28 pts / 15 comm

r/singularity· COMMUNITY

Unitree G1 and EngineAI PM01 fight

Reddit discussion comparing Unitree G1 and EngineAI PM01 humanoid robots; no substantive technical details provided.

u/heart-aroni·2 months ago·106 pts / 39 comm

r/MachineLearning· COMMUNITY

We are hitting a wall trying to force transformers to do actual logic [D]

Engineer critiques transformer architecture limitations for exact reasoning tasks, argues prompt engineering cannot overcome fundamental probabilistic design constraints.

u/TheBr14n·2 months ago·41 pts / 19 comm

TechCrunch AI· PRESS

Nvidia has already committed $40B to equity AI deals this year

Nvidia continues to be a big investor in the AI ecosystem.

Anthony Ha·2 months ago

r/LocalLLaMA· COMMUNITY

More Qwen3.6-27B MTP success but on dual Mi50s

User reports 1.5–2x speedup running Qwen3.6-27B with MTP on dual AMD MI50 GPUs via llama.cpp.

u/legit_split_·2 months ago·40 pts / 15 comm

r/ClaudeAI· COMMUNITY

Claude is weirdly good at helping untangle messy thoughts

Reddit user reports Claude excels at organizing unstructured notes and serves as thinking partner for idea synthesis.

u/More_Ferret5914·2 months ago·31 pts / 16 comm

r/ClaudeAI· COMMUNITY

Why does this happen?

Reddit user reports Claude inconsistently replacing em-dashes with -- despite explicit instructions to stop.

u/Live_Fondant717·2 months ago·38 pts / 42 comm

r/ClaudeAI· COMMUNITY

The new auto-completion feature is a bit aggressive

User feedback on aggressive auto-completion behavior in Claude product.

u/emersusai·2 months ago·27 pts / 10 comm

r/Anthropic· COMMUNITY

Not a good day for team "Claude Mythos is Just Marketing Hype"

Reddit discussion about Claude marketing claims; linked Mozilla article on Firefox security unrelated to AI.

u/EchoOfOppenheimer·2 months ago·12 pts / 10 comm·+ covered by others

r/singularity· COMMUNITY

Cloudflare’s AI usage increased by 600% in the last 3 months, leading to the elimination of 1,100 jobs as part of an Agentic AI restructuring

Cloudflare eliminates 1,100 jobs following 600% AI usage surge in 3 months, citing agentic AI restructuring.

u/Distinct-Question-16·2 months ago·262 pts / 20 comm

r/LocalLLaMA· COMMUNITY

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP

User reports 80 tok/sec with 128K context on an RTX 4070 Super (12GB VRAM) using a quantized Qwen3.6 35B A3B and llama.cpp's MTP implementation.

u/janvitos·2 months ago·46 pts / 18 comm

r/Anthropic· COMMUNITY

Anthropic support not responding about missing Claude credits

Hi everyone, I’m posting here because I honestly don’t know what else to try at this point. I purchased extra usage credits for my Claude Pro account, the payment went through correctly, and I have both the invoice and the receipt. However, the credits were never added to my account. I’ve already contacted Anthropic support multiple times by email and through the support chat. Every time I receive either an automated reply or I’m told that someone will get back to me, but no one actually follows up. I also can’t start a new chat because the previous support conversation is still marked as ...

u/facciocosevedogente3·2 months ago·13 pts / 7 comm

Ars Technica AI· PRESS

The new Wild West of AI kids’ toys

These connected companions could disrupt everything from make-believe to bedtime stories. No wonder some lawmakers want them banned.

Sophie Charara, WIRED.com·2 months ago

r/LocalLLaMA· COMMUNITY

DeepSeek Rejects Alibaba: Prioritizing Corporate Independence Over Big Tech Ecosystems

DeepSeek rejected Alibaba investment talks, prioritizing independence and avoiding restrictive ecosystem agreements despite April financing round interest from Tencent.

u/External_Mood4719·2 months ago·41 pts / 21 comm

r/LocalLLaMA· COMMUNITY

Pi and Qwen3.6 27B make setting up Archlinux really easy.

User demonstrates Qwen3.6 27B with Pi coding agent for Archlinux system configuration tasks via natural language.

u/sdfgeoff·2 months ago·45 pts / 30 comm

r/OpenAI· COMMUNITY

Two F.03 robots clean a room and make a bed in 2 minutes - fully autonomous

Figure AI's F.03 robots autonomously clean and tidy a bedroom in 2 minutes, demonstrating progress in household robotics task planning.

u/EchoOfOppenheimer·2 months ago·52 pts / 63 comm

r/MachineLearning· COMMUNITY

My experience interviewing with Huawei Vancouver for an ML research role: strong mismatch between how it was pitched and how it was evaluated [D]

I want to share an interview experience anonymously in case it helps others on the job market. I was approached about a Vancouver ML role that was presented to me as research-oriented. The recruiter told me the team had looked at my research and that I should be ready to discuss my projects, so I expected a conversation about modelling, research ideas, and fit. That is not how the interview felt. It was much more focused on trivia-style and coding-style questioning, with very little real engagement with my research or how I think about problems. The overall process felt much narrower and mo...

u/Adventurous-Cut-7077·2 months ago·41 pts / 7 comm

r/ClaudeAI· COMMUNITY

Claude Desktop App Now Shows Context Usage (MacOS)

Claude Desktop app adds context usage visibility on macOS, improving transparency into token consumption during conversations.

u/The_Cynical_Canuck·2 months ago·42 pts / 11 comm
