100 things we announced at I/O 2026
Google I/O 2026: Gemini Omni and 99 other announcements; focus on multimodal AI and platform expansions.
Reddit discussion on ML PhD admissions competitiveness and networking requirements across regions.
OpenAI model contributes to proof challenging 80-year-old mathematical conjecture, demonstrating general-purpose reasoning for novel knowledge production.
OpenAI model disproved a longstanding conjecture in discrete geometry, demonstrating AI capability in mathematical research and theorem discovery.
Midjourney reports that choosing TPU infrastructure delayed its research by ~1 year; regrets not committing exclusively to NVIDIA.
Reddit post with unclear title and no content; appears to be incomplete or user error.
CARV framework reduces variance in Monte Carlo gradient estimation for diffusion-model-based pipelines via hierarchical resampling.
Equilibrium Reasoners enable test-time compute scaling via learned task-conditioned attractors without external verifiers.
Framework quantifies hyperparameter transfer across model scales, revealing embedding layer learning rate criticality.
EvoStruct integrates protein language models with equivariant GNNs to fix vocabulary collapse in antibody CDR design.
Velocityformer applies equivariant graph transformers to cosmological kinematic Sunyaev-Zel'dovich velocity reconstruction.
DeepWeb-Bench introduces harder evaluation for frontier LLMs on deep research requiring massive cross-source evidence and reasoning.
AiraXiv proposes AI-era publishing platform enabling human and AI authors with continuous feedback-driven iteration.
WikiVQABench benchmark combines Wikipedia images with Wikidata for knowledge-grounded visual question answering evaluation.
Interactive tool visualizing LLM token generation speeds from 5 to 800 tokens/second for practical latency understanding.
FROG enables learnable graph structure for relational deep learning on RDBs without fixed schema constraints.
Agent JIT compilation compiles task descriptions into executable code for web agents, reducing latency vs. sequential fetch-execute loops.
RLVR training exhibits rank-1 weight trajectory structure; minimal training captures performance gains via linear parameter evolution.
DelTA interprets RLVR updates as linear discriminators over token gradients, explaining token-level probability changes in reasoning model training.
LLM-based grammar adaptation for metamodel evolution in domain-specific languages; evaluated on Xtext DSLs.
Mem-π framework generates context-specific agent guidance on-demand via dedicated model rather than static retrieval-based memory.
ML framework for GNSS positioning error mitigation in urban environments using activation function-based weighted least squares.
HITL-D combines diffusion policies with human control for shared autonomous manipulation, conditioning on scene point clouds.
Framework for optimal simulator-experiment allocation when deploying pre-trained simulators; decomposes value error into calibration drift and parametric residual.
Rubric embeddings mitigate label bias in high-stakes prediction (hiring, admissions) by replacing black-box embeddings with interpretable representations.
Empirical study of AI-generated Python refactoring PRs from AIDev dataset; assesses maintainability, code quality, and security impact.
Survey of approximation theory for neural networks covering universal approximation, quantitative rates, depth/width efficiency over four decades.
Coming to your homescreen soon: your own app. | Photo: Allison Johnson / The Verge
"There's an app for that" was the promise of the App Store from the very beginning. The app that will get your phone to do the thing you want it to? It's just a few taps away. The tagline wasn't strictly true - I'm still waiting for that one perfect grocery list app. Still, apps shaped the modern smartphone into what it is today. We spend all day, every day inside of apps - scrolling, listening, and tapping until we find what we want. But your next favorite app might just be one that you made yourself. If you w...
Study of Vision-Language-Action model robustness under sensor degradation in autonomous driving; Alpamayo R1 tested across 18K trials with noise, lighting, fog perturbations.
Benchmark distinguishing temporal vs. spatial glitch detection in VLMs for game quality assurance; finds temporal glitches substantially harder than frame-level anomalies.