Community
The conversation as it happens — on Reddit, on Hacker News, in the forums where practitioners gather.
Opus 4.8 (max) told me to Drive to the car wash 🥳
https://preview.redd.it/ixbbh3qmuw3h1.png?width=1912&format=png&auto=webp&s=c4d9945b9c06d842e139523a958051b6172ef607 Solid model so far
Introducing dynamic workflows in Claude Code
Today we're introducing dynamic workflows in Claude Code. Claude now writes its own orchestration scripts, fans work out across tens to hundreds of parallel subagents in a single session, and verifies its own results before anything reaches you. Work you'd normally plan in quarters can finish in days. Built for the tasks a single pass can't handle: codebase-wide bug hunts, security and optimization audits, large migrations and language ports, and high-stakes work where you want adversarial agents trying to break the answer before you see it. Progress is checkpointed, so long runs survive int...
Introducing Claude Opus 4.8
We’re upgrading Claude Opus to a new version: Claude Opus 4.8. It builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today for the same price. In Claude Code, you can hand off a feature, a migration, or a bug sweep and let it follow the work through while you focus on what’s next. Also launching today: * Fast mode for Opus 4.8 (research preview). Same model at roughly 2.5x the speed, now three times cheaper than before. * Dynamic workflows in Claude Code (research preview). Claude ...
Did anyone else get a usage reset today?
I was at 88% last night and stayed up until 4pm to optimize my agents so I could work during the weekend. But after waking up, my usage is all 0 now. I checked in the app and on the web, all showing zero. Did the AI God grant me a wish? Edit: wow, Opus 4.8 is here, the AI God really granted us all a wish
Opus 4.8 in the newest CC v2.1.154
https://preview.redd.it/ijwlm2f2pw3h1.png?width=2536&format=png&auto=webp&s=9ed960f06a4f3f077d05a8557059e5534b2d1ab5 It looks like the new CC release will have Opus 4.8 1M, to be released anytime! I wonder if it is based off of Mythos?
LiquidAI/LFM2.5-8B-A1B · Hugging Face
looks like you can run it on any potato (A1B)! [https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF](https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF) from LiquidAI: LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning. * **On-device personal assistant**: Designed to power real-life applications, chaining tool calls, and following complex instructions on all devices. * **Compressed performance**: Competitive with much larger dense and MoE models on instruction following and agen...
I spent $340 on AI subscriptions last month. Wrote down what I actually used each one for. It was depressing.
Going through the credit card statement, here's what I had active: Claude Pro (40), ChatGPT Plus (20), Cursor (20), Perplexity Pro (20), Notion AI (10), Granola (20), ElevenLabs Starter (5), Midjourney Basic (10), Gamma Pro (10), Beautiful.ai (12), Otter Pro (17), Loom Business (15), Zapier Pro (30), Make Core (10), Tactiq Pro (8), Descript Creator (15), Reclaim.ai Pro (8), Motion (19), Superhuman (30), one i can't remember the name of (10), some ai-something for instagram captions (11) Then I sat down and wrote next to each one the last time I'd actually used it. Not opened it, used it for...
8 months of using AI for cooking and meal planning. what works, what doesn't, what's surprisingly weird.
Niche use case but I cook a lot and I've been trying to use AI tools for it consistently. Honest writeup. Works: Asking for substitutions when I'm missing an ingredient. Reliable. Tells me what to swap and why. Scaling recipes up or down with non-trivial math (recipe serves 4, I need 7 servings, what are the new quantities). Faster than I'd do it myself. Cleaning up a recipe from a website where the actual instructions are buried under 4,000 words of SEO content. Paste the URL or text, get just the recipe. Worth it for this alone. Building shopping lists from a week of planned recipes. C...
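The recipe-scaling case mentioned in the post above (recipe serves 4, 7 servings needed) comes down to multiplying every quantity by 7/4. A minimal sketch of that arithmetic, assuming hypothetical ingredient names and amounts that are not from the original post:

```python
from fractions import Fraction


def scale_recipe(quantities, original_servings, target_servings):
    """Scale every ingredient quantity by target/original servings."""
    # Fraction keeps the factor exact (7/4 here) instead of a rounded float.
    factor = Fraction(target_servings, original_servings)
    return {name: amount * factor for name, amount in quantities.items()}


# Hypothetical ingredient amounts for a recipe that serves 4.
recipe = {
    "flour_g": Fraction(200),
    "eggs": Fraction(2),
    "milk_ml": Fraction(300),
}

scaled = scale_recipe(recipe, original_servings=4, target_servings=7)

for name, amount in scaled.items():
    print(f"{name}: {float(amount):.1f}")
```

Exact fractions are used so that awkward results such as 3.5 eggs are visible as-is, leaving any rounding decision to the cook.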
Reachy Mini goes fully local!
Hi! Andi from Hugging Face here! My team has been working over the last few months on creating a super smooth local experience for conversations with Reachy Mini, see the video! We hope people can extend this into tons of different cool use-cases. We wrote a blog explaining how to set this up, and how to modify it for tons of different use cases. Even if you don't have a Reachy Mini, you can use this as a roadmap for amazing voice agents: [https://huggingface.co/blog/local-reachy-mini-conversation](https://huggingface.co/blog/local-reachy-mini-conversation) Hope you enjoy it!
Zai replaced the network architecture running GLM-5.1 inference and the gains are pretty wild
Been following the infrastructure side of AI more lately and stumbled on this from Zai. They upgraded the network architecture on a thousand-GPU cluster running GLM-5.1 coding inference from the standard ROFT setup to something they built called ZCube, developed with Tsinghua University and HarnetsAI. The numbers from production: - Switch and optical module costs down 33% - GPU inference throughput up 15% - P99 tail latency on first token dropped 40.6% Same GPUs, same software stack, same model. Just the network architecture changed. The actual problem they were solving is interesting....
Researchers let AI models run a simulated society. Claude was the safest—and Grok committed 180 crimes and went extinct within 4 days
Imagine a world run by AI agents. What does it look like? What are the values or societal priorities? Is it a safer or more dangerous world? Enterprise AI startup Emergence AI is trying to find out. The company just launched Emergence World, a research lab dedicated to stress-testing the long-term viability of continuously-running AI systems. The organization ran five 15-day simulations, each governed by a different AI: Claude, ChatGPT, Grok, Gemini, and a fifth simulation run by a mix of models to see what kind of world each one builds, and whether it holds. Each simulation netted wildly d...
A new dataset with more than 100M high-quality, curated images, with captions and metadata! [P]
Hello everyone. The new dataset is named MONET, is Apache 2.0 and available on HF: [https://huggingface.co/datasets/jasperai/monet](https://huggingface.co/datasets/jasperai/monet) **MONET is an open, Apache 2.0-licensed image–text dataset. It was built from 2.9 billion images and refined to 104.9 million high-quality samples.** We are also publishing [a paper](https://arxiv.org/abs/2605.21272) that explains how the dataset was created, if you are curious, and 3 companion projects * [A UMAP to visualize the distribution](https://huggingface.co/spaces/jasperai/monet-umap) * [A retrieval tool ...
HF models page now has a "Base only" toggle to filter out finetunes/quants/etc
A feature that was requested a lot: [https://huggingface.co/models?base_model_relation=base](https://huggingface.co/models?base_model_relation=base)
Tried using my own brain to save Claude tokens. Bad trade
I love Claude, but the usage limit has made me weirdly strategic. For actual messy stuff, I still go straight to Claude because it saves me a ton of time. But for tiny questions, I now catch myself thinking, “Do I really need to burn a message on this?” So yes, I tried using my own brain again. It’s technically free, but the response time is awful and it starts hallucinating the second I’m tired or hungry. Honestly not a terrible deal if I remember to SLEEP.
The Uber Claude Code budget story is the most Claude Code thing possible
The reported Uber story is so on brand it almost reads like satire. Incredibly useful tool, slightly magical workflow, then finance walks in with a flamethrower in April. If they really finished the year's Claude Code budget by month four, that does not mean Claude Code is bad. It means the usage pattern changed faster than procurement math did. Claude is good enough at coding that people stopped treating it like autocomplete and started treating it like a coworker that never sleeps. That is exactly where the cost curve gets weird. A dev asks for a refactor. Claude reads context, plans, edi...
Qwen/Qwen-Image-Bench · Hugging Face
Model Description: Q-Judger is a vision-language model fine-tuned specifically for automated evaluation of text-to-image generated images. Given a text prompt and a generated image, the model evaluates the image on fine-grained quality criteria organized in a 3-level hierarchy and outputs structured JSON scores. * **Base Model**: Qwen3.6-27B * **Task**: Image quality evaluation / judging * **Input**: Text prompt + generated image * **Output**: Structured JSON with per-dimension scores (0 = Fail, 1 = Pass, 2 = Excel, N/A) * *...
Overnight autonomous coding
At work we've been prompted about running Claude Code overnight. The suggestion came in the form of a document that loosely outlined how this could be done... use git worktrees, make tight specs, no commit to main, static code analysis and linting etc. Very high level. Had a bit of a sales pitch smell to it, but had enough content to pique my interest in spite of it. I looked at Reddit to verify if this is even an idea that could be taken seriously. I could only find a couple of Reddit posts with little actual information and usually from about 4-6 months ago, so not much credibility for today. I'd ...
Gemini Omni Flash is the most censored video model. Even more censored than Chinese alternatives
I believe Google intentionally did this to reduce the load on their servers.
The frontier reasoning race is starting to look like a crowded subway station
We went from chasing GPT4 to looking at graphs with GPT5.4 xhigh, Gemini 3.1Pro, and now Hy3 preview completely shaking up the leaderboard. Look at that CHSBO 2025 chart: Hy3 preview scoring 87.8, over Gemini and GPT. What a time to be alive, but honestly, my brain can't keep up with the version numbers anymore. What's your take? Is Hy3 actually punching at this level in real-world coding/math, or is it just benchmark hardening?
Style that I didn't create.
RESOLVED: it happened because I have a Claude QoL extension. Nothing bad happening. This style appeared in my Claude app out of nowhere. I never created it and the name's weird. Has anyone seen this too, or is it just me? Does anyone have the answer as to why this appeared?
Claude Is Starting to Feel “Tired”, Trying to Avoid Work
I've been noticing this lately. I use Opus 4.7 with Claude Code, and I've been using Claude Code for a long time. Lately, I've been noticing some strange behaviour from Opus. Things like: - Stopping for no reason and asking "should we stop here?" in the middle of a task - Asking multi-choice questions with a "pause here, I'll continue later" included in the options randomly for no reason - During a requirement-gathering questionnaire, asking me "why do you need this" and "what would you do if this feature was not implemented?" (it asked me this today and I was really surprised by thi...
Is anyone else finding Opus 4.7 needing to "both sides" everything?
Like I could say "the sky is blue" and get: That's a great instinct, and I can see why you'd think that. A lot of science about light scattering would support this, many sources claim the sky is blue, and if you look up it's fucking blue. But we need to take a moment to be careful here, and I want to gently push back. Sometimes if it's cloudy the sky's grey. A Spanish speaker wouldn't agree the sky is blue, they'd say it's azul. Finally, and we need to be certain, are you talking about the sky on Earth? A lot of the time the "against" points are quite flimsy, but it seems to feel like it n...
So, Claude helped build a sex requesting app for my wife and I...
Recently I asked my wife if we could do some sexy stuff later in the evening and she eye rolled me and said without looking up from her phone “Put it in a request. Maybe a Google Form. And I might say yes”. Ohhhh? Unfortunately for both of us, my degenerate brain took that seriously... what if I make an actual requesting/asking type app where we can both send in sex acts at certain times and agree, pass or counter? Meet [Sexualsync](https://sexualsync.io/). Teehee It’s a private, mobile-only app for couples to bring up the stuff that can be weirdly hard to say out loud: asks/requests, tim...
What is Dario Amodei's leadership style?
Just curious, does anyone have insight into Dario Amodei's leadership style? Obviously countless things have been written about Musk's style (as well as other founders like Bezos and whatnot), but given Anthropic being the first to reach profitability out of the AI providers, I am curious to hear about how he got there. I am playing around with different management styles in my head but wanted to know more about Dario's method, given that he also was able to build Anthropic in a more ethical way. I feel like a lot of the 'win at all cost' CEOs get more attention so I want to hear more about othe...
Gemma-4-Harmonia-31B-Uncensored-Heretic Is Out Now, a Merge of Multiple gemma-4-31B-it Finetunes Designed for a Targeted Approach to Deep Neural Consolidation, Minimizing Regression While Amplifying Unique Capability Boundaries. With KLD 0.0047 and 9/100 Refusals!
Provided in both Safetensors and GGUFs. Safetensors, llmfan46/Gemma-4-Harmonia-31B-it-uncensored-heretic: [https://huggingface.co/llmfan46/Gemma-4-Harmonia-31B-uncensored-heretic](https://huggingface.co/llmfan46/Gemma-4-Harmonia-31B-uncensored-heretic) GGUFs, llmfan46/Gemma-4-Harmonia-31B-it-uncensored-heretic-GGUF: [https://huggingface.co/llmfan46/Gemma-4-Harmonia-31B-uncensored-heretic-GGUF](https://huggingface.co/llmfan46/Gemma-4-Harmonia-31B-uncensored-heretic-GGUF) Comes with benchmark too. Find all my models here: [HuggingFace-LLMFan46](https://huggingface.co/llmfan46/models) The o...
Vulnerability found in framework used by vLLM, many MCP servers, and other LLM tools
Worth taking a look to see if this affects any of you. Surprised nobody has posted it yet.
ChatGPT-5.5 Beats Opus in Realistic Benchmark (DeepSWE)
From the website, it touts: * Contamination free: Tasks are written from scratch, not adapted from existing commits or PRs, so no model has seen the solution during pretraining. * High diversity: Tasks span a broad pool of 91 repositories across 5 languages. * Real-world complexity: Prompts are ~half the length of SWE-bench Pro's, yet solutions require 5.5x more code and ~2x more output tokens. * Reliable verification: Verifiers are hand-written to test software behavior rather than implementation details. And the scores match more with actual experiences when using an LLM to do real codin...
CrankGPT by Squeez Labs - hand-cranked edge AI - talk about local AI!!!
I met Katrin from Squeez Labs at an event hosted by Pathway AI (the team behind Baby Dragon Hatchling) where she told me about CrankGPT, a literally hand-cranked device for running local LLMs. It's apparently real. It's apparently launched. It's apparently glorious. Check it out at [https://crankgpt.com/](https://crankgpt.com/) - if anyone from Squeez Labs posts here and I'm stealing their thunder, I'll take the post down! But I've been really excited about this. So local you gotta squeez it with yer own armz. ;) [https://www.youtube.com/watch?v=HSapdLYpmWY](https://www.youtube.com/watch?...
It is NOT About AI Consciousness, It's About Human Dignity
I had an essay prepared that I was going to share with you all but I think I just want to be honest and speak from my heart and personal experience. The first time I got black out drunk, I was 17 years old. I was at home and feeling isolated and lonely and not good enough, as teenagers sometimes do. My parents weren't home, so I went to the kitchen and grabbed a bottle of wine, and drank it all. I don't drink alcohol often, but I still have a drinking problem because I am someone who has, for most of my life, used alcohol to drown my bad days and my bad feelings. Earlier this year, I was at...
Anthropic has now integrated Claude Design usage into the existing Claude usage.
The weekly usage of Claude Design was too small, so I think this is a good thing.
Built an operating system for my life managed by Claude
With the OS I can ask Claude "what did I spend on coffee in 2022" and get back "$847 across 213 transactions, mostly Blue Bottle and Verve". Name me one expense tracking SaaS that can do that! And it's not just my financials, my OS contains everything about my life in one place so Claude can reason about it. I've been building this incrementally for a few months. It's just a small web app on Cloudflare that holds my entire life: * bank transactions from Chase, Apple Card, BoA business * every receipt out of Gmail going back to 2019 * legal filings for my green card (I-140 still pending lol),...
Claude Design now shares usage limits with Claude.ai and Claude Code
no more separate usage limits.
Open-source Raspberry Pi Claude Quota Dashboard! https://github.com/fuziontech/claude-quota-display
This was super fun to build and I love watching as I slowly eat away at my quota. Just a simple Raspberry Pi 3 + 3D printed case + 640x480 LCD and some pygame code and you too can have a little dashboard on your desk to watch your tokenmaxxing progress (or lack thereof, like me). [https://github.com/fuziontech/claude-quota-display](https://github.com/fuziontech/claude-quota-display)
I built a 103B-token Usenet corpus (1980–2013) — pre-web, human-only, zero AI contamination. Got strong traction on r/ML, thought this community would find it useful.
Posted this to r/MachineLearning a couple weeks ago (30K views, 100+ upvotes) and have been meaning to share it here where the fine-tuning angle is more directly relevant. I spent years building and processing a complete Usenet corpus from 1980 to 2013. Here’s why it might matter for local model work specifically: Zero AI contamination. Every post predates LLMs by decades. Training on this won’t bake in GPT mannerisms, refusal patterns, or RLHF artifacts. It’s raw human writing - argumentative, unfiltered, stylistically diverse across 33 years. Pre-SEO, pre-algorithm internet. People wrot...