News Summary for May 25, 2026

Summary

Today’s news is dominated by three interlocking themes: AI security, agentic AI maturation, and cost disruption. On the security front, Anthropic’s Project Glasswing (Claude Mythos) uncovered 23,000+ vulnerabilities across open-source projects, while a massive GitHub supply chain attack (‘Megalodon’) infected 5,500+ repositories and researchers warned of an AI-powered bug-hunting arms race. On the agentic AI front, Google unveiled Gemini for Science targeting 3 million researchers, Microsoft released MagenticLite for small models, and community discussion is accelerating around the shift from chatbots to autonomous agents. Cost disruption continues via DeepSeek — both through V4 Pro’s dramatically lower pricing (11–34x cheaper than OpenAI/Anthropic) and Reasonix, an open-source coding agent achieving 80% cost savings through cache-aware architecture. Across all topics, a clear signal emerges: AI is moving fast enough that even leading companies are navigating security, pricing, and deployment challenges in real time.

Top 3 Articles

1. DeepSeek Reasonix: DeepSeek Native Coding Agent with High Caching and Low Cost

Source: Hacker News
Date: May 24, 2026

Detailed Summary:

Reasonixi is an open-source, MIT-licensed AI coding agent for the terminal, purpose-built around DeepSeek’s prefix-cache architecture. Unlike general-purpose coding agents that treat caching as optional, Reasonix makes cache stability a core invariant — every decision in its execution loop is engineered to ensure DeepSeek’s KV prefix cache remains hit across long sessions.

The agent is built on three pillars: a cache-first loop that maintains byte-stable prefixes between turns, a tool-call repair layer that intercepts malformed outputs from reasoning models, and built-in cost controls including token budgeting and an /effort knob for adjusting reasoning depth. A real-world case study from May 1, 2026 demonstrates the value: 435 million input tokens processed in a single day at a 99.82% cache hit rate, costing ~$12 versus ~$61 without caching on deepseek-v4-flash — an 80% cost reduction through infrastructure-level optimization alone.

Beyond caching, Reasonix offers first-class MCP (Model Context Protocol) support across stdio, SSE, and Streamable HTTP transports; a plan mode with /todo tracking; a skills system compatible with Claude-format SKILL.md files; persistent typed memory pinned into the prefix cache; and shell lifecycle hooks enabling CI/CD-style gating. A SEARCH/REPLACE review workflow ensures no file writes occur until the user explicitly runs /apply. The project is cross-platform (macOS, Linux, Windows), requires Node.js ≥ 22, and a paid DeepSeek API key.

Strategically, Reasonix validates DeepSeek’s prefix-cache API design as a genuine ecosystem differentiator and signals that MCP — originally from Anthropic — is becoming a cross-ecosystem standard for AI agent tool protocols. Its deliberate support for Claude-format skills lowers migration friction for Claude Code users. The 80% cost reduction claim, if broadly reproducible, is one of the most compelling AI cost management data points published this week.

2. Everyone is navigating AI security in real time — even Google

Source: TechCrunch
Date: May 24, 2026

Detailed Summary:

TechCrunch Editor-in-Chief Connie Loizos interviews Google Cloud COO Francis de Souza at a Los Angeles event, producing a nuanced portrait of the AI security landscape — one that juxtaposes de Souza’s strategic advice against documented failures in Google’s own platform.

De Souza’s core argument is that security cannot be a retrofit: “There’s no such thing as an AI strategy without a data strategy and a security strategy. They need to go hand in hand.” He warned specifically about shadow AI — employees adopting consumer AI tools without organizational oversight — and called for multicloud-consistent security postures, noting that SaaS vendors and business partners inevitably introduce additional cloud surfaces. A critical data point: the average time between an initial breach and handoff to the next attack stage has dropped from 8 hours to 22 seconds, making human-speed defense architecturally untenable. His prescribed solution is fully agentic AI-native defense, where humans oversee machine-driven security rather than leading it directly. He also flagged a hidden risk specific to agentic AI: agents roaming enterprise systems will discover and surface forgotten legacy data repositories that were previously “safe” only through obscurity.

The article’s sharpest edges come from its counter-narrative. Google Cloud developers were hit with five-figure unauthorized Gemini API bills after Google quietly expanded API key scopes (originally for Google Maps) to include Gemini without clear disclosure — one CEO received a $10,138 bill in ~30 minutes, another a ~AUD $17,000 charge despite believing a $250 cap was in place. Security firm Aikido found that even after deleting a compromised API key, attackers can continue using it for up to 23 minutes due to slow propagation of revocation across Google’s infrastructure — and noted this is a prioritization choice, not a technical constraint, since newer credential formats revoke in seconds. Google refunded affected developers but stated no plans to change its automatic billing tier-upgrade policy.

The takeaway: de Souza’s advice is sound, but there is a meaningful gap between what hyperscalers prescribe and how quickly they implement it themselves — a candid observation with direct implications for any enterprise relying on Google Cloud AI tooling.

3. Google just dropped Gemini for Science — aiming at 3 million researchers

Source: Reddit r/ArtificialIntelligence
Date: May 25, 2026

Detailed Summary:

At Google I/O 2026, DeepMind CEO Demis Hassabis unveiled Gemini for Science — a strategic pivot from narrow specialized models toward general-purpose agentic LLM systems for scientific research, explicitly targeting the 3 million+ researchers currently using AlphaFold worldwide.

The initiative comprises three experimental tools available via Google Labs: Hypothesis Generation (built on Co-Scientist), which uses a multi-agent “idea tournament” to synthesize millions of papers and generate grounded, citation-linked research hypotheses; Computational Discovery (built on AlphaEvolve + ERA), an agentic engine that generates and scores thousands of code variations in parallel to test novel modeling approaches at previously impossible scales; and Literature Insights (built on NotebookLM), which searches scientific literature and structures results into interactive tables with chat interfaces, gap identification, and artifact generation. Research papers for both ERA and Co-Scientist were published simultaneously in Nature.

Enterprise deployment is live in private preview on Google Cloud, with named partners including BASF (supply chain optimization), Klarna (ML enhancement), Daiichi Sankyo, Bayer Crop Science, and the U.S. Department of Energy. Over 100 academic institutions are collaborating, and Google is piloting agentic peer review tools with ICML, NeurIPS, and STOC. A specialized Science Skills bundle integrates 30+ major life science databases including UniProt, AlphaFold Database, AlphaGenome API, and InterPro into agentic workflows.

The strategic core of the announcement is Google’s explicit architectural shift: away from siloed domain-specific models and toward general agentic systems, with Gemini’s LLM backbone as the unifying layer. Hassabis framed the moment as humanity standing at “the foothills of the singularity” with a goal to “solve all disease.” Critics noted the tension between these grand claims and Google’s choice to relegate Gemini for Science to the final minutes of a 100+ minute keynote dominated by consumer AI features — with CNET calling it “a Hail Mary to end on a positive note.” Nonetheless, the deep integration of biological databases, enterprise deployments, and academic partnerships represents a credible and significant competitive moat in scientific AI.

Other Articles

Over 5,500 GitHub Repositories Infected in ‘Megalodon’ Supply Chain Attack
- Source: SecurityWeek
- Date: May 25, 2026
- Summary: A large-scale supply chain attack dubbed ‘Megalodon’ infected over 5,500 GitHub repositories on May 18 via automated fake commits. Attackers injected malicious GitHub Actions workflows that exfiltrated AWS credentials, GCP tokens, Azure credentials, SSH keys, Docker/Kubernetes configs, API keys, and CI/CD secrets — all within a 6-hour window using two email addresses making 5,718 commits.
Constraint Decay: The Fragility of LLM Agents in Backend Code Generation
- Source: Hacker News (arXiv)
- Date: May 24, 2026
- Summary: New research reveals ‘constraint decay’: as structural requirements accumulate in backend code generation tasks, LLM agent assertion pass rates drop by 30 points on average. Studied across 80 greenfield and 20 feature-implementation tasks over 8 web frameworks, agents perform significantly worse in convention-heavy frameworks (FastAPI, Django) than minimal ones (Flask), with data-layer defects as the leading root cause.
Claude is not your architect. Stop letting it pretend
- Source: hollandtech.net
- Date: May 25, 2026
- Summary: A critical examination of how developers are misusing Claude and other LLMs as software architects. Argues that AI coding assistants should be treated as tools, not decision-makers, and that humans must retain architectural ownership to avoid accumulating technical debt and design failures.
Project Glasswing: An initial update
- Source: Anthropic
- Date: May 22, 2026
- Summary: Anthropic’s Project Glasswing initial update reveals Claude Mythos Preview found 10,000+ high/critical vulnerabilities across critical open-source software in one month. Cloudflare found 2,000 bugs with better-than-human false positive rates; Mozilla fixed 271 Firefox vulnerabilities (10x more than usual); Palo Alto Networks released 5x more patches than typical. Anthropic scanned 1,000+ OSS projects finding 6,202 high/critical vulnerabilities.
Anthropic: Mythos Detected 23,000 Potential Vulnerabilities Across 1,000 OSS Projects
- Source: SecurityWeek
- Date: May 25, 2026
- Summary: SecurityWeek’s coverage of Project Glasswing confirms 23,000+ potential vulnerabilities identified across 1,000+ open-source projects. Of 1,900 reviewed by external security firms, 1,726 were confirmed — over 1,000 rated high or critical severity. Mozilla and Palo Alto Networks are among the organizations that have already patched findings.
Are we moving past the “Chatbot” era faster than people realize?
- Source: Reddit r/ArtificialIntelligence
- Date: May 25, 2026
- Summary: Community discussion on the rapid shift from text-generating chatbots to AI agents that execute tasks autonomously. With major leaps in model reasoning and agentic workflows from Google, OpenAI, and Anthropic, the chatbot paradigm is starting to look primitive as autonomous agents become the new default.
MagenticLite: An agentic experience optimized for small models
- Source: Microsoft Research
- Date: May 21, 2026
- Summary: Microsoft Research released MagenticLite, redesigning Magentic-UI to work efficiently on small language models. Powered by MagenticBrain (14B-parameter orchestration model fine-tuned from Qwen 3 14B) and Fara1.5 (computer-use model in 4B, 9B, and 27B sizes), it works across browsers and local file systems and sets new SOTA results for small computer-use models.
The AI security gap nobody wants to admit is already here
- Source: The Next Web
- Date: May 24, 2026
- Summary: Analysis of the Anthropic Claude Code source code leak in March 2026 — which accidentally shipped 512,000 lines of TypeScript to the public npm registry — as a case study in the fundamental gap between AI companies’ safety claims and actual AI security practices. Highlights how attackers are moving faster than defenders with AI tools.
Genkit Middleware
- Source: DZone
- Date: May 18, 2026
- Summary: A deep dive into Google’s Genkit Middleware system for JavaScript/TypeScript, introducing a composable middleware layer for the generate() pipeline. Covers built-in middlewares including filesystem, skills, toolApproval, retry, and fallback, plus how to build custom middleware for production AI agent patterns.
How Retry Storms Crash API-Led Systems
- Source: DZone
- Date: May 22, 2026
- Summary: Explores how unbounded retries combined with autoscaling can escalate minor latency into cascading outages in API-led architectures. Covers why API reliability must be bounded and load-aware, and best practices for preventing retry storms in distributed systems.
DeepSeek just popped the American AI bubble
- Source: Reddit r/ArtificialIntelligence
- Date: May 24, 2026
- Summary: Analysis of DeepSeek V4 Pro’s pricing disruption: at $0.435/$0.87 per 1M tokens vs. OpenAI GPT-5.5’s $5/$30 and Claude Opus 4.7’s $5/$25, DeepSeek is 11.5x cheaper on input and 34.5x cheaper on output. Community discussion on how this pricing destroys the assumption of unlimited AI pricing power.
Hands-On Component-Based Development on Azure ML — From Component to Pipeline
- Source: Level Up / GitConnected
- Date: May 22, 2026
- Summary: A practical tutorial on building modular ML workflows using Azure Machine Learning’s component-based development model, covering how to define reusable ML components, chain them into pipelines, and leverage Azure ML’s managed infrastructure for streamlined model training and deployment at scale.
Memory has grown to nearly two-thirds of AI chip component costs
- Source: Hacker News (Epoch AI)
- Date: May 24, 2026
- Summary: Epoch AI analysis shows high-bandwidth memory (HBM) grew from 52% to 63% of total AI chip component spending between Q1 2024 and Q4 2025 across Nvidia, AMD, Google, and Amazon chips. In absolute terms, HBM spend grew from ~$12B to $32B in 2025. The trend is expected to accelerate in 2026 as memory supply remains constrained.
The AI Era Is Creating a Bug Hunting Arms Race
- Source: Wired
- Date: May 25, 2026
- Summary: AI-powered tools are enabling both attackers and defenders to find software vulnerabilities at unprecedented speed. Bad actors use AI to autonomously detect flaws and craft exploits, overwhelming bug bounty programs and reshaping security economics — creating asymmetric pressure particularly on smaller organizations and open-source projects.
PapersWithCode new features - week 1
- Source: Reddit r/MachineLearning
- Date: May 24, 2026
- Summary: Hugging Face shares the first-week update on the revival of PapersWithCode. New features include improved SOTA tracking across AI domains (agents, computer vision, time-series), better paper discovery, and broader ML benchmark coverage. The project is open-source and received strong community reception.
Anthropic to release Mythos-class models to the public
- Source: The Register
- Date: May 25, 2026
- Summary: Anthropic has signaled intent to release models matching Claude Mythos’s bug-finding performance to the general public once sufficient safeguards are developed. Currently limited to ~50 organizations through Project Glasswing, access will first expand to US and allied governments. Anthropic acknowledges ’no company — including Anthropic — has developed safeguards strong enough to prevent such models from being misused.’
OpenAPI From Code With Spring and Java: A Recipe for Your CI
- Source: DZone
- Date: May 19, 2026
- Summary: A guide to generating OpenAPI documentation directly from Spring and Java code in a way that integrates cleanly into CI pipelines. Covers automated spec generation, validation, and best practices for keeping API documentation in sync with code.
Perplexity Is Open-Sourcing Bumblebee
- Source: Perplexity AI
- Date: May 22, 2026
- Summary: Perplexity AI open-sourced Bumblebee, a read-only security scanner used internally to protect developer machines from supply-chain threats. Bumblebee scans npm, PyPI, Go modules, RubyGems, IDE/browser extensions, and AI agent configs including MCP JSON. It never invokes package managers or runs install scripts, preventing the scanner itself from triggering attacks. Available as an open-source Go project for macOS and Linux.
Migrating from Go to Rust
- Source: corrode.dev
- Date: May 21, 2026
- Summary: A comprehensive guide for Go developers considering migration to Rust, written by a Rust consultant with production Go experience. Covers how Go patterns map to Rust constructs, the borrow checker’s benefits, when to keep Go versus migrate to Rust, and incremental migration strategies for backend services.
Don’t Roll Your Own
- Source: Hacker News
- Date: May 23, 2026
- Summary: Argues that developers should resist building custom implementations of browser-native features — scrolling, link navigation, text selection, context menus, copy/paste, password fields, date pickers — because home-grown implementations degrade user experience and introduce subtle bugs. A thoughtful argument for restraint and using established platform primitives.
Building a Self-Evolving AI Agent with a Local Skill Database in Python
- Source: Level Up / GitConnected
- Date: May 22, 2026
- Summary: A step-by-step guide to building a Python-based AI agent that learns and grows its own skill set by storing acquired capabilities in a local database. Covers agent architecture, skill persistence patterns, and self-evolution mechanisms to make agents more capable over time without retraining.
From the Vatican stage, Anthropic’s Chris Olah says AI cannot be steered by AI labs alone
- Source: The Next Web
- Date: May 25, 2026
- Summary: Anthropic co-founder Christopher Olah spoke at the Vatican alongside Pope Leo XIV during the launch of the Magnifica Humanitas papal encyclical on AI and human dignity. Olah argued that AI development cannot be left solely to technology companies and requires broader oversight from religious institutions and societal stakeholders.

Summary#

Top 3 Articles#

1. DeepSeek Reasonix: DeepSeek Native Coding Agent with High Caching and Low Cost#

2. Everyone is navigating AI security in real time — even Google#

3. Google just dropped Gemini for Science — aiming at 3 million researchers#

Other Articles#

Summary

Top 3 Articles

1. DeepSeek Reasonix: DeepSeek Native Coding Agent with High Caching and Low Cost

2. Everyone is navigating AI security in real time — even Google

3. Google just dropped Gemini for Science — aiming at 3 million researchers

Other Articles