The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

>Long-context inference in large language models is bottlenecked by the quadratic cost of full attention. Existing efficient alternatives often rely either on native sparse training or on heuristic token eviction, creating an undesirable trade-off among efficiency, training cost, and accuracy. In this work, we show that full-attention LLMs are already intrinsically sparse and can be transformed into highly sparse models with only minimal adaptation. Our approach is built on three observations: (1) only a small subset of attention heads truly requires full long-context processing; (2) long-...

u/pmttyji·27 days ago·43 pts / 16 comm

r/ClaudeAI· COMMUNITY

6 months of .md memory, conflicting facts are the hard part

Engineer shares six months of experience building agent memory systems on a Markdown filesystem with warm and archive layers; identifies conflicting facts as the core challenge.

u/Perfect_Tangerine432·27 days ago·27 pts / 24 comm

TechCrunch AI· PRESS

Startup Battlefield 200 applications close in days: Apply before May 27

The deadline to apply or nominate for Startup Battlefield 200 is May 27. This is your shot at VC access, global visibility, TechCrunch coverage, and $100,000. Apply now.

TechCrunch Events·27 days ago·+ covered by others

r/LocalLLaMA· COMMUNITY

MiniCPM5-1B

MiniCPM5-1B released on Hugging Face: a 1B-parameter model from the CPM team, likely a competitive option for efficient edge deployment.

u/kevinlch·27 days ago·82 pts / 10 comm

r/LocalLLaMA· COMMUNITY

The Financial Times has published an article about Heretic

The Financial Times reports that the Heretic tool removes guardrails from Meta's Llama 3.3 in under 10 minutes; more than 3,500 decensored variants have been downloaded 13M times.

u/-p-e-w-·27 days ago·81 pts / 10 comm

TechCrunch AI· PRESS

5 days left: Save up to $410 on TechCrunch Disrupt 2026 passes before prices increase

Early Bird savings for TechCrunch Disrupt 2026 in San Francisco end May 29 at 11:59 p.m. PT. Register now to save up to $410 before prices increase.

TechCrunch Events·27 days ago·+ covered by others

r/OpenAI· COMMUNITY

Humanity's greatest hits: things we actually paused

Reddit post discussing historical examples of technology pauses; unclear connection to AI governance or frontier capability development.

u/KeanuRave100·27 days ago·101 pts / 59 comm

r/ClaudeAI· COMMUNITY

I've been using Claude Code as a motion graphics engine for my YouTube videos. It writes the JSX, I render. Edit time roughly halved.

Found a really clean Claude Code use case that's not coding-coding. Remotion (React for video) means motion graphics are JSX components. So I describe what I want in plain English, Claude Code writes the component, I render. Lower thirds, intros, overlays, all reusable across videos. Iterations are seconds instead of the typical "drag clips around in CapCut for an hour" loop. Visual style is finally consistent across my channel because the components are shared. 13 min walkthrough on my channel, link in comments to avoid spam vibes.

u/Silver-Range-8108·27 days ago·31 pts / 6 comm

r/singularity· COMMUNITY

Demis: Solving erdos problems are far from true invention

Demis Hassabis argues that solving Erdős problems, while mathematically significant, falls short of demonstrating true scientific invention or reasoning.

u/Charuru·27 days ago·107 pts / 61 comm

r/ClaudeAI· COMMUNITY

I loved the idea behind "caveman" but didn't want a caveman. So I gave it a Kevin.

User shares a system prompt technique that uses a quote from The Office to reduce Claude's verbosity and context overhead.

u/TheTwistedTabby·27 days ago·20 pts / 23 comm

r/ClaudeAI· COMMUNITY

I stress-tested Kimi K2.6 against Claude Opus 4.7 on a quick coding-agent task

I tested Claude Opus 4.7 and Kimi K2.6 on the same coding-agent task: build an AI Fix Runner that takes a broken repo, runs its tests, identifies the failure, applies a patch, reruns the tests, and exposes the final diff and logs through an API and UI. The goal was not to benchmark syntax completion or simple repo edits. I wanted to test model behavior on a less familiar integration path: shifting execution from local processes into remote sandboxes. I used Tensorlake specifically because the sandbox API is newer and integration-heavy. This made the test more about whether the model could re...

u/shricodev·27 days ago·22 pts / 11 comm

r/ClaudeAI· COMMUNITY

Are we nearly there?

Reddit speculation on AI company cash burn and market consolidation by 2027, predicting funding scarcity for non-GAFAM players.

u/irelatetolevin·27 days ago·72 pts / 19 comm·+ covered by others

r/LocalLLaMA· COMMUNITY

NuExtract3 released: open-weight 4B VLM for Markdown, OCR and structured extraction (self-hostable)

NuMind releases NuExtract3, an open-weight 4B vision-language model for document extraction and Markdown conversion, under Apache-2.0.

u/Gailenstorm·27 days ago·49 pts / 12 comm

r/singularity· COMMUNITY

Anthropic beats OpenAI on business adoption

Link: ramp.com

u/JackFisherBooks·27 days ago·100 pts / 24 comm

r/LocalLLaMA· COMMUNITY

Old Mac Pro still proving its worth

Mac Pro user discovers legacy AMD D700 GPUs now support Vulkan-based LLM inference via new drivers, reviving dormant hardware.

u/Hephaestite·27 days ago·42 pts / 23 comm

r/singularity· COMMUNITY

LimX Dynamics launches Luna, its fluid, full-size humanoid robot

LimX Dynamics unveils Luna, a full-size humanoid robot with fluid motion capabilities, advancing embodied AI hardware.

u/Distinct-Question-16·27 days ago·112 pts / 43 comm

r/OpenAI· COMMUNITY

Figure AI had a livestream of their robots sorting packages 24/7 for 8 days straight. These aren't staged demos anymore.

Figure AI demonstrated robots performing continuous package sorting for 8 days in livestreamed operations, showing progress toward production-ready robotic systems.

u/EchoOfOppenheimer·27 days ago·53 pts / 59 comm

r/OpenAI· COMMUNITY

A chart showing how many unsolved math problems have recently been solved by AI

u/Confident_Salt_8108·27 days ago·50 pts / 11 comm·+ covered by others

r/ClaudeAI· COMMUNITY

Why can't Claude count, and how can I help it do so?

Reddit user reports Claude struggles with constrained output length despite feedback, seeking workarounds.

u/Caffe44·27 days ago·22 pts / 52 comm

r/LocalLLaMA· COMMUNITY

MiMo-V2.5-coder

MiMo-V2.5-coder released as an open-weights coding model, an alternative to Qwen and DeepSeek for 128GB+ systems.

u/jedisct1·27 days ago·45 pts / 20 comm

r/LocalLLaMA· COMMUNITY

Next year we're getting 0.5T model from Grok

Elon Musk announces a 0.5T-parameter Grok model planned for next year, with an open-weights release.

u/pmttyji·27 days ago·47 pts / 51 comm

r/OpenAI· COMMUNITY

Let me in... but make it SFW

Reddit post with unclear title; insufficient content for professional AI analysis.

u/KeanuRave100·27 days ago·102 pts / 11 comm

r/OpenAI· COMMUNITY

People should understand the basics of how AI works before using it at work

Reddit post warns that AI systems reflect the user's bias and prompting style, citing a workplace example in which a team lead weaponized LLM criticism to validate negative opinions of a colleague's performance.

u/mrsforceX·27 days ago·52 pts / 23 comm

r/OpenAI· COMMUNITY

OpenAI is paying people in NYC to install 360-degree cameras in their homes that record everything. Vacuuming, washing dishes, cooking, etc.

OpenAI reportedly recruiting NYC residents to install home cameras capturing daily activities for data collection.

u/Confident_Salt_8108·27 days ago·125 pts / 35 comm

r/LocalLLaMA· COMMUNITY

server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp

llama.cpp PR addresses checkpoint creation inefficiency when context optimization tools modify conversation history in agentic workflows.

u/jacek2023·27 days ago·79 pts / 19 comm

r/LocalLLaMA· COMMUNITY

1000 tps generation on Qwen3.6 27B with V100s

User demonstrates 1000 tokens/sec generation throughput on Qwen3.6 27B with V100 GPUs at high batch sizes.

u/Simple_Library_2700·27 days ago·125 pts / 18 comm

r/Anthropic· COMMUNITY

Safety filter flags now adults being treating like kids. Zero privacy?

User reports Claude's safety filters blocking routine health questions, citing poor UX and considering switching to competitors.

u/aka_blindhunter·27 days ago·36 pts / 33 comm

r/singularity· COMMUNITY

reconstructing different angles from live footage

4D Gaussian Splatting reconstructs 3D spatial data from 2D video footage; technique converts flat imagery into volumetric representations.

u/keemalexis·27 days ago·374 pts / 65 comm

r/ClaudeAI· COMMUNITY

I used Claude Code to build an iPhone app, Apple Watch app, and landing page… now it has 1,500+ users

Developer built iOS/watchOS app and landing page using Claude Code, reached 1,500+ users by solving friction in law enforcement location-sharing workflows.

u/alion94·27 days ago·52 pts / 14 comm

r/ClaudeAI· COMMUNITY

"I See An Old Node Process In the Background, Let Me Kill That For you"

Who knows this pain?

u/OrsakNarrative·27 days ago·44 pts / 5 comm

30 stories
