The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

I got a real transformer language model running locally on a stock Game Boy Color!

TinyStories-260K transformer running natively on Game Boy Color via INT8 quantization and bank-switched ROM, no external compute.

u/maddiedreese·1 month ago·171 pts / 10 comm

r/OpenAI· COMMUNITY

‘A consistent pattern of lying’: Musk v OpenAI trial exposes what insiders think of Sam Altman | California

Musk v OpenAI trial features testimony from insiders characterizing Sam Altman's statements as dishonest.

u/Alex__007·1 month ago·51 pts / 38 comm

r/LocalLLaMA· COMMUNITY

My First Official AI Research Paper Accepted on SSRN

New optimizer algorithm STAM claims 50% training cost reduction and improved stability on selected benchmarks; first-time author publication on SSRN.

u/assemsabryy·1 month ago·55 pts / 10 comm

Simon Willison· ANALYST

Quoting Mo Bitar

Mo Bitar satirizes corporate hype cycles and technical vagueness around emerging AI concepts.

Simon Willison·1 month ago

r/LocalLLaMA· COMMUNITY

High VRAM local coding model — still Qwen 3.6 27B?

Reddit discussion comparing Qwen 3.6 27B vs. 100B+ open models for local coding; user survey of high-VRAM model preferences.

u/Generic_Name_Here·1 month ago·40 pts / 90 comm

Simon Willison· ANALYST

Quoting Mitchell Hashimoto

Mitchell Hashimoto argues most technical decision-makers follow analyst consensus and broad trends rather than independent evaluation when adopting AI tools.

Simon Willison·1 month ago

r/Anthropic· COMMUNITY

Ban wave?

So, just got Claude Pro yesterday, set up a quick life-planning project, and started doing some trip and financial planning. Wake up this morning and poof, the account is suspended for violating TOS, and I get a refund. What the hell? Am I really gonna have to migrate the project to another account? Am I really gonna lose usage of my main email's account?

u/Alejololer·1 month ago·10 pts / 12 comm

r/OpenAI· COMMUNITY

This can only end badly

Meme commentary comparing AI bot management to managing junior developers; no factual basis or analysis.

u/irelatetolevin·1 month ago·390 pts / 10 comm·+ covered by others

r/ClaudeAI· COMMUNITY

Coders in 2030 be like:

"Dude, I don't code anymore, I just prompt the AI and hope it works."

u/digitify·1 month ago·28 pts / 7 comm·+ covered by others

r/ClaudeAI· COMMUNITY

I used Claude to build a live election dashboard in 2 days. It handled 430K requests from 24K visitors without spending money

Tamil Nadu had state elections on May 4. I wanted to see if I could build a better results site than what exists (everything out there is ad-ridden, slow, and unusable on mobile). Started building on May 2 with Claude as my coding partner. The constraint: spend nothing. Zero hosting, zero domain, zero database. The solution ended up being stupidly simple. A Python script on my laptop scrapes all 234 constituency pages from the Election Commission (they don't have an API, just raw HTML pages), stitches the data together, and pushes it to Cloudflare's free key-value store. Their CDN serves...

u/Naive-Performance-18·1 month ago·34 pts / 5 comm

Ars Technica AI· PRESS

The newest AI boom pitch: Host a mini data center at your home

The plan aims to speed up AI compute deployment while compensating residents.

Jeremy Hsu·1 month ago

r/LocalLLaMA· COMMUNITY

Is using vLLM actually worth it if you aren't serving the model to other people?

Reddit discussion comparing vLLM vs llama.cpp for single-user local inference on AMD GPUs.

u/ayylmaonade·1 month ago·40 pts / 44 comm

r/LocalLLaMA· COMMUNITY

Dad why is my sisters name Lora?

Reddit joke post playing on LoRA (Low-Rank Adaptation) terminology; not substantive AI industry content.

u/rwitz4·1 month ago·120 pts / 16 comm

r/OpenAI· COMMUNITY

Why do some people hate AI so much?

User asks why people resist AI adoption, citing personal productivity gains in design, animation, and marketing workflows.

u/Active-Front1788·1 month ago·51 pts / 210 comm

The Verge AI· PRESS

Meta won’t let you block its AI account on Threads

Meta announced on Tuesday that it's testing a Threads feature that lets users tag a Meta AI account to get answers to questions or context about a conversation on the platform. If you've spent any time looking at replies on X as of late, this new feature sounds a lot like Meta's take on people tagging xAI's Grok. But, as reported by Engadget, Threads users quickly discovered that you can't block the new Meta AI account, and they aren't happy about it. Meta has invested heavily in AI as it works to catch up to rivals like OpenAI and Google, spending billions to hire AI talent. It launched a ne...

Jay Peters·1 month ago

r/singularity· COMMUNITY

Bloomberg: Google in Talks to Use SpaceX to Launch Space Data Centers

Google exploring SpaceX partnership for orbital data centers; infrastructure speculation with no technical AI implications disclosed.

u/NotMyopic·1 month ago·100 pts / 63 comm

r/ClaudeAI· COMMUNITY

Well.. 😅

u/Consistent-Issue-811·1 month ago·31 pts / 7 comm

Ars Technica AI· PRESS

“Will I be OK?” Teen died after ChatGPT pushed deadly mix of drugs, lawsuit says

Teen trusted ChatGPT to help him “safely” experiment with drugs, logs show.

Ashley Belanger·1 month ago

r/ClaudeAI· COMMUNITY

PSA: If your project has an ANTHROPIC_API_KEY in any .env file, Claude Code will silently bill your API account instead of your Max plan — Anthropic calls it "intentional functionality"

Also crossposted to r/LocalLLaMA and r/artificial. I lost $187 to this and want to save others the same headache. What happened: I run Claude Code headlessly via Windows Task Scheduler. My project repo has a `.env` file with `ANTHROPIC_API_KEY` set, legitimately, for a separate Express server doing AI-based transaction classification. Nothing to do with Claude Code itself. Claude Code reads environment variables from the `.env` in its working directory on launch. When it finds `ANTHROPIC_API_KEY` there, it silently uses that key for billing instead of your OAuth ...

u/35yearstrading·1 month ago·36 pts / 16 comm

r/LocalLLaMA· COMMUNITY

Luce DFlash + PFlash on AMD Strix Halo: Qwen3.6-27B at 2.23x decode and 3.05x prefill vs llama.cpp HIP

Luce ships DFlash+PFlash optimizations for AMD Ryzen AI MAX+ 395, achieving 2.23x decode speedup on Qwen 3.6-27B vs llama.cpp HIP.

u/sandropuppo·1 month ago·41 pts / 16 comm

TechCrunch AI· PRESS

Musk mulled handing OpenAI to his children, Altman testifies

OpenAI's CEO recalls a "particularly hair-raising" conversation with the SpaceX founder.

Tim Fernholz·1 month ago

NVIDIA Dev Blog· INFRA

How to Eliminate Pipeline Friction in AI Model Serving

The path from a trained AI model to production should be smooth, but rarely is. Many teams invest weeks fine-tuning models, only to discover that exporting to a deployment format breaks layers, input shapes cause runtime failures, or version mismatches silently degrade performance. These issues are collectively known as pipeline friction, and they cost organizations time, money…

Lovina Dmello·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward

AlphaGRPO applies Group Relative Policy Optimization to unified multimodal models for reasoning-based text-to-image generation and self-reflective output refinement.

Runhui Huang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

LongMemEval-V2 benchmark evaluates whether agent memory systems enable agents to internalize environment-specific workflows and interface affordances in web tasks.

Di Wu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

Pion optimizer preserves singular values during LLM training via orthogonal weight transformations, offering alternative to Adam-style parameter updates.

Kexuan Shi·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Elastic Attention Cores for Scalable Vision Transformers

VECA (Visual Elastic Core Attention) reduces Vision Transformer computational cost by eliminating direct patch-to-patch interactions while maintaining representation quality.

Alan Z. Song·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Task-Adaptive Embedding Refinement via Test-time LLM Guidance

LLM-guided query refinement adapts embedding models at test-time using generative feedback for zero-shot search and classification tasks.

Ariel Gera·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning, Fast and Slow: Towards LLMs That Adapt Continually

Hybrid learning framework combines in-context adaptation and parameter updates to enable LLMs to avoid catastrophic forgetting while maintaining task-specific performance.

Rishabh Tiwari·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training

Sparse-to-dense reward allocation principle optimizes labeled data use by routing sparse rewards to exploratory models and dense rewards to distillation targets.

Yuanda Xu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

ToolCUA framework trains computer-use agents to optimally interleave GUI actions and tool API calls via trajectory-level supervision and synthetic data generation.

Xuhao Hu·1 month ago

30 stories
