[AINews] New AI Infra decacorns: Fireworks, Baseten (with OpenRouter on the way)
it's funding news, but it's good news.
TL;DR Some AI behavior reminded me of ADHD/Trauma Response (thought loops, task paralysis...) and I laughed it off at first. Then I treated it like my neurodivergent friends: give em some slack. And just like that, the thought loops stopped, response was fast, the answers correct most of the time AND it actually said "I don't know, help me!" every time it wasn't sure. It's a small Dataset...but still impressive results! [https://github.com/OttoRenner/Gentle-Coding](https://github.com/OttoRenner/Gentle-Coding) Hey everyone, I’ve been testing a weird hypothesis over the last few days, a...
Times have been tough! I just wanted to make something to potentially cheer people up. Local and 100% free if anyone else wants their agents to be space dogs :) [Planet Maiko](https://github.com/bkawa-bot/planet-maiko/blob/main/README.md) Planet Maiko is honestly a huge system, I basically don't have to use any other tool at work anymore, for either agent orchestration or anything else that comes up. Maiko is my irl dog! the agents are space dogs with their own personalities! [They are having a popularity contest](https://bkawa-bot.github.io/planet-maiko/popularity.html)
* You've built a functional prototype with good UX instincts, but it's not ready for real users. * Likelihood of Success: 3/10. * This alone could kill your app within days of launch. * The market you chose is *especially* punishing. * Likes and visits from India are pure vanity metrics that won't convert, ever, and they're actively distorting your funnel data. * You may be conflating two different things. * The 'expense of feelings' framing might be doing too much work. * \[Your idea\] is an unbounded build with an unproven-core problem *and* a market problem *and* an eventual hardware probl...
It's possible that AI was used to write parts of Pope Leo XIV's latest encyclical about AI's impact on humanity. An analysis by Linch Zhang posted on the forum LessWrong found certain paragraphs of Magnifica Humanitas to be between 40 percent and 100 percent written by AI, according to the popular AI detector Pangram. The document includes known traits that appear in AI-generated writing, such as a higher use of the word "genuinely" - which crops up in writing by Anthropic's Claude - than previous encyclicals, Zhang says. Another person ran the text of the document section by section through ...
Cohere and Mila announced plans for a new academic research collaboration focused on improving AI evaluation across languages and cultures, starting with French-language cultural context in Quebec.
Ahead of global elections, we’re helping people access information, supporting cyber defenders, and increasing AI transparency
Warp uses GPT-5.5 and OpenAI models to coordinate coding agents across local, cloud, and open-source development workflows.
The pressure: Daniel Stenberg on the unprecedented level of pressure the curl team are facing right now thanks to the deluge of (credible) AI-assisted security issues being reported. The rate of incoming security reports is 4-5 times higher than it was in 2024 and double the speed of 2025 -- meaning that on average we now get more than one report per day. The quality is way higher than ever before. The reports are typically very detailed and long. [...] For the first time in my life, my wife voiced concerns about my work hours and my imbalanced work/life situation. I work more than I’ve done ...
I'm going to describe a person this post is for, if this is you, I think I can be of some assistance: * you are new to coding * you are blown away by how it unlocks this magical ability that was previously inaccessible without years of training and effort * you've daydreamed of business and app ideas but never knew where to start before or how to build them * you've been vibe coding non-stop and burning through tokens * you're unsure about what's secure, how to structure the systems, and how systems are supposed to interact with each other. So, essentially the plumbing separate from the code...
Google overhauled Search at I/O 2026, replacing blue links with AI agents. The backlash has been swift. DuckDuckGo app installs spiked 30% as users seek a way out.
[https://www.youtube.com/watch?v=ITCklufC25Y](https://www.youtube.com/watch?v=ITCklufC25Y)
[https://www.youtube.com/watch?v=tAPvN-tQpX0](https://www.youtube.com/watch?v=tAPvN-tQpX0)
Developers can now use NVIDIA CUDA Tile programming within large existing C++ GPU codebases to develop highly optimized GPU kernels using tile-based...
NVIDIA CUDA 13.3 brings new capabilities and performance optimizations to developers across the CUDA ecosystem. The launch of NVIDIA CUDA Tile programming in C++ enables high-level, tile-based kernel development that automatically manages complex low-level GPU details for optimal performance and portability. Additionally, CUDA Tile programming is now supported on Compute Capability 9.0…
NVIDIA CompileIQ tackles one of the hardest problems in performance engineering: finding the compiler options that unlock the best performance for a specific workload. Consider a team that has spent weeks optimizing an LLM inference pipeline on GPUs, tuning batch sizes, quantizing to FP8, adopting flash attention, fusing every kernel they can. The profiler says there’s nothing left to squeeze.
I picked up a 7900 XTX earlier which runs qwen3.6-27b fine, but not to my liking. Its compute performance is quite unstable for me. With MTP the decode speed can reach 40-60 t/s, but prefill is just too slow. Regardless of whether I used ROCm or Vulkan, the prefill speed varies between 300 t/s and 500 t/s, even with very long prompts. I've been itching to try out an ultra-budget 24GB setup using dual 3060s. I managed to snag a second 3060 at a reasonable price in the last few days. So I took out the 7900 XTX, installed the 3060s, and began testing. # Test Configuration * **Test Platform:** i7 477...
My company started tracking Claude Code usage - tokens and spend, that kind of thing. Now my manager wants me to stack-rank my engineers on "AI performance" using those numbers. I'm not comfortable with it (but I don't have a choice either). Token usage feels like exactly the wrong proxy - my strongest engineer uses Claude surgically while someone burning 10x the tokens isn't 10x more productive (often the opposite). Ranking on this just teaches people to game the metric. So, for folks here who use Claude daily and/or lead teams: * Has your company started measuring "AI performance"? How a...
I wasn't even on my computer or Claude on May 25 when they said there was suspicious activity. Any ideas about what I should do? I emailed Anthropic a response but am not sure what else I can or should do. I'm further worried that if someone who is trying to do something weird on my account has access to my account...it just makes me feel very concerned and like I am probably not the only case. https://preview.redd.it/k9rlklponj3h1.png?width=1028&format=png&auto=webp&s=de46e3c86a80b1c491d50fcda3a7e7ba7dfa7d76
https://preview.redd.it/yspiafvakj3h1.png?width=1460&format=png&auto=webp&s=9d7bd1777fad8b286a21e75df8ae593d39432a8a Got this message when I tried to continue my chat :/ anyone else?
For real tho, 9b, 27b, 122b, I don’t really care at this point, just show us that you still love us.
"BadHost" was found in Starlette, a package with 325 million weekly downloads.
The PrismML team really cooked with these models. They're only \~3GB in size (compared to FLUX.2 Klein 4B, which is \~16GB). Apache-2.0! Official collection on HF: [https://huggingface.co/collections/prism-ml/bonsai-image](https://huggingface.co/collections/prism-ml/bonsai-image) Link to demo: [https://huggingface.co/spaces/webml-community/bonsai-image-webgpu](https://huggingface.co/spaces/webml-community/bonsai-image-webgpu)
https://preview.redd.it/j0ymp70a2j3h1.png?width=746&format=png&auto=webp&s=4cdb70be13ccc99f5ea57556da96d6d81e61d702 I just realized they removed Sonnet 4.5. Does that mean Sonnet 4.8 (maybe Opus 4.8 too?) is coming soon? Maybe today or tomorrow. Excited to see a new Claude model, and I hope Anthropic actually ships a really good model this time. What are your assumptions?
About an hour ago, my desktop app began to crap out and I suddenly didn’t have access to my projects or chats anymore. (I’m on my own business plan.) My UI then refreshed with someone else’s chat history where I could click in and read all conversations end to end. Because I did not want to read personal information any further, I had to quit and restart the app before my own personal information populated back into the UI. What gives? If this happened to another person’s data, perhaps it is happening to yours, or mine. Has anyone else had this issue?