The AI Infrastructure War Is Now a Cloud Contract War

Table of contents

SpaceX Is Now an AI Compute Provider — and Anthropic Is Paying $1.25B/Month for It
Google I/O 2026: A Lot of Announcements, Not Much You Can Touch Yet
Supply Chain Risk in AI Tooling: Malware in PyTorch Lightning
LLM Token Speed: What the Numbers Actually Mean
The Week’s Throughline

by Founder @devroaks

SpaceX Is Now an AI Compute Provider — and Anthropic Is Paying $1.25B/Month for It

In a freshly filed SpaceX S-1, a detail slipped through that deserves more attention than it’s getting: Anthropic has signed Cloud Services Agreements with SpaceX, committing to pay $1.25 billion per month for compute access across COLOSSUS and COLOSSUS II — SpaceX’s Elon Musk-backed supercomputer clusters — through May 2029. Capacity is ramping through May and June 2026 at a reduced rate, with either party able to exit on 90 days’ notice.

To put that number in perspective: $1.25B/month is $15B/year in raw compute spend, from a single vendor. That’s not a pilot program or a research budget line — that’s a strategic infrastructure bet.
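The arithmetic behind that figure, plus what the 90-day exit clause implies, can be sketched in a few lines. The monthly fee and notice period come from the text above; the 30-day month and the assumption that the full rate is billed throughout the notice period are simplifications of ours, not terms from the S-1.

```python
MONTHLY_FEE_USD = 1.25e9  # monthly commitment reported above
NOTICE_DAYS = 90          # termination notice period reported above


def annualized(monthly: float) -> float:
    """Twelve months of spend at a flat monthly rate."""
    return monthly * 12


def notice_period_exposure(monthly: float, notice_days: int,
                           days_per_month: float = 30.0) -> float:
    """Spend between giving notice and exit.

    Assumes full-rate billing during the notice period and a 30-day
    month; both are illustrative simplifications.
    """
    return monthly * (notice_days / days_per_month)


yearly = annualized(MONTHLY_FEE_USD)
exposure = notice_period_exposure(MONTHLY_FEE_USD, NOTICE_DAYS)
print(f"annualized spend: ${yearly / 1e9:.2f}B")
print(f"spend during notice period: ${exposure / 1e9:.2f}B")
```

Under those assumptions, walking away still means roughly three more months of payments, which is small next to the annualized commitment.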

What does this tell us technically?

Training at this scale requires bespoke arrangements. The public cloud (AWS, Azure, GCP) simply cannot guarantee the contiguous GPU/TPU allocations needed for frontier model training runs. COLOSSUS II’s purpose-built architecture — originally designed for Grok 5 training — is being time-shared with Anthropic.
The inference cliff is real. The compute required to serve frontier models at scale is now approaching the cost of training them. Anthropic’s API traffic is significant enough that leasing dedicated clusters is economically competitive with spot instances.
Vendor lock-in is a two-way street. SpaceX’s S-1 notes the agreement is terminable on 90 days’ notice — a surprisingly short leash for this level of spend. That clause likely reflects Anthropic’s negotiating leverage (or hedging) as alternative compute providers mature.

The broader signal: the AI infrastructure layer is bifurcating. Hyperscalers handle commodity inference; bespoke clusters handle frontier training and high-throughput serving. Expect more deals like this.

Google I/O 2026: A Lot of Announcements, Not Much You Can Touch Yet

Google I/O was, by most accounts, overwhelming in breadth and underwhelming in immediate availability. A few things worth parsing technically:

Gemini Spark: The Agentic Play

Google’s most consequential I/O announcement is Gemini Spark, a personal AI agent that plugs natively into Gmail, Calendar, Drive, Docs, Sheets, Slides, YouTube, and Maps. It’s positioned as a direct competitor to OpenClaw-style agents, and it’s launching first for AI Ultra subscribers at $100/month.

The security architecture is worth scrutinizing. Google’s enterprise documentation describes Spark as running in:

“a fully managed, secure runtime on Google Cloud… Every task executes in a fresh, strictly isolated, ephemeral VM to help ensure data never overlaps between sessions. All traffic routes through our secure Agent Gateway that enforces Data Loss Prevention (DLP) policies, while user credentials remain fully encrypted and are never exposed directly to the agent.”

This is the right architecture on paper — ephemeral VMs, credential abstraction, DLP enforcement. But the threat model for a personal agent with Gmail access is brutal. Prompt injection via a malicious email is a trivially simple attack vector. The question isn’t whether Google’s infrastructure is secure; it’s whether the model can be manipulated into acting on injected instructions embedded in content it reads. That’s an alignment and evaluation problem, not an infra problem.
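One common mitigation pattern is to track whether the agent has read untrusted content and to require explicit user confirmation before any sensitive action that follows. The sketch below is only an illustration of that idea: the tool names, the `AgentContext` class, and the `authorize` function are hypothetical and are not taken from Google's documentation or any Gemini API. It reduces the blast radius of an injected instruction; it does not stop the model from being manipulated.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical tool names, chosen for illustration only.
SENSITIVE_TOOLS = {"send_email", "share_file", "delete_event"}


@dataclass
class AgentContext:
    # Sources of content the agent read but the user did not author.
    tainted_by: List[str] = field(default_factory=list)

    def ingest(self, source: str, trusted: bool) -> None:
        """Record a piece of content entering the agent's context."""
        if not trusted:
            self.tainted_by.append(source)


def authorize(tool: str, ctx: AgentContext) -> str:
    """Return 'confirm' when a sensitive tool is requested after the
    agent has read untrusted content; otherwise return 'allow'."""
    if tool in SENSITIVE_TOOLS and ctx.tainted_by:
        return "confirm"
    return "allow"


ctx = AgentContext()
ctx.ingest("user_prompt", trusted=True)
before = authorize("send_email", ctx)      # nothing untrusted read yet

ctx.ingest("inbound_email", trusted=False)
after = authorize("send_email", ctx)       # sensitive + tainted context
harmless = authorize("read_calendar", ctx) # not a sensitive tool
```

The design choice here is deliberately coarse: once any untrusted content is in context, every sensitive call is gated, because the gate cannot tell which instruction the model is actually following.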

Notably, Gemini Spark runs on Gemini 3.5 Flash and Antigravity, Google’s new closed-source, Go-based agent runtime. The open-source Gemini CLI (Apache 2.0, written in TypeScript) is being deprecated on June 18 in favor of the Antigravity CLI. That’s a significant philosophical pivot: Google is closing the stack at the agent layer.

The Search Box Redesign

The NYT led with the fact that Google is widening its search bar for the first time since 2001. This sounds trivial but signals something meaningful: Google is institutionally acknowledging that the query paradigm has changed. Users now type multi-sentence natural language questions, attach images and video, and expect dialogue — not a list of blue links. The UX is catching up to the model capability.

The subtler story, as several commentators noted, is what this means for the open web. Google’s agentic search doesn’t just answer your apartment-hunting query — it subscribes to listings and notifies you without you ever visiting Zillow. That’s a structural threat to any site that currently earns traffic from search intent.

Gemini 3.5 Flash

Gemini 3.5 Flash is in general availability. Simon Willison has already noted that this is one of the few I/O announcements you can actually test today. Most of the headline features — Spark, Project Aura smart glasses, the revamped Search — are “coming soon.” The developer story for Gemini 3.5 Flash is worth watching, particularly its positioning in latency-sensitive agentic pipelines.

Supply Chain Risk in AI Tooling: Malware in PyTorch Lightning

A Semgrep security disclosure trending on Hacker News this week: malicious code was found embedded in the PyTorch Lightning library, a widely used training abstraction layer. The dependency was used in AI training pipelines across the ecosystem before detection.

This is a recurring pattern that deserves more systemic attention. The AI/ML toolchain has an enormous attack surface:

Training frameworks with complex C++/CUDA extension chains
Hub-style model distribution (Hugging Face, PyPI) with limited signing infrastructure
Notebook-first workflows where pip install is reflexive

The irony is that as AI systems become more autonomous and start writing and executing their own code (vibe-coding, agents), the blast radius of a compromised dependency grows proportionally. An agent that auto-installs packages from model-generated code is a supply chain attack waiting to happen.
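A minimal sketch of one guard for that scenario: before an agent installs anything, check the requested package against a human-reviewed allowlist of exact versions. The `APPROVED` table, the function names, and the version strings are illustrative assumptions, not recommendations or part of any existing tool.

```python
from typing import Dict, Optional, Tuple

# Hypothetical allowlist: package name -> exact version a human has reviewed.
# The versions listed are placeholders for illustration.
APPROVED: Dict[str, str] = {"numpy": "1.26.4", "requests": "2.32.3"}


def parse_requirement(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'name==version' into its parts; version is None if unpinned."""
    name, sep, version = spec.partition("==")
    return name.strip().lower(), (version.strip() if sep else None)


def may_install(spec: str, approved: Dict[str, str] = APPROVED) -> bool:
    """Allow only exact, pre-approved name==version pairs."""
    name, version = parse_requirement(spec)
    return version is not None and approved.get(name) == version


for candidate in ["numpy==1.26.4", "numpy", "numpy==2.0.0", "unknown-pkg==0.1"]:
    print(candidate, "->", "install" if may_install(candidate) else "refuse")
```

Refusing unpinned and unknown packages by default is the point: model-generated code can request anything, so the safe default is to say no and ask a human.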

For teams running serious ML infrastructure: pin your dependencies, use reproducible builds, and treat your training environment with the same paranoia you’d apply to production auth services.
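Pinning a version number alone does not protect against a tampered artifact; pinning its hash does. pip supports this directly through its hash-checking mode (`pip install --require-hashes -r requirements.txt`). The sketch below shows the underlying check in plain Python using only the standard library; the function names and the sample bytes are ours, for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """True only if the artifact's digest matches the pinned value."""
    return hmac.compare_digest(sha256_hex(data), pinned_sha256.lower())


# Stand-in for a downloaded wheel; in practice, read the file's bytes.
artifact = b"example wheel bytes"
pinned = sha256_hex(artifact)  # recorded when the dependency was reviewed

ok = verify_artifact(artifact, pinned)
tampered_ok = verify_artifact(artifact + b"\x00", pinned)
print("original accepted:", ok)
print("tampered accepted:", tampered_ok)
```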

LLM Token Speed: What the Numbers Actually Mean

A small but useful tool making the rounds: tokenspeed by Mike Veerman lets you viscerally feel the difference between 10, 30, 100, and 800 tokens/second by simulating live output at each rate.

This matters more than it might seem. Latency perception in LLM interfaces is non-linear. 10 tok/s feels painfully slow for code completion but is fine for a long-form essay you read as it streams. At 800 tok/s the streaming is effectively invisible: text arrives far faster than anyone can read it. The useful range for most interactive use cases is roughly 30–150 tok/s, which is where most hosted inference APIs land today.
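To make those rates concrete without running the tool, here is how long a single response takes to finish streaming at each speed. The 500-token response length is an assumption chosen for illustration.

```python
def seconds_to_stream(num_tokens: int, tokens_per_second: float) -> float:
    """Time for a response of num_tokens to finish at a steady rate."""
    return num_tokens / tokens_per_second


RATES = [10, 30, 100, 800]  # tok/s values from the paragraph above
RESPONSE_TOKENS = 500       # assumed response length

durations = {rate: seconds_to_stream(RESPONSE_TOKENS, rate) for rate in RATES}
for rate, secs in durations.items():
    print(f"{rate:>4} tok/s -> {secs:.3f} s")
```

The same response takes 50 seconds at 10 tok/s and well under a second at 800 tok/s, which is why the low end feels so different in interactive use.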

When evaluating models for production use, token throughput is often more important than raw benchmark scores. A model that scores 5% higher on MMLU but runs at 40 tok/s will feel worse than a slightly weaker model at 120 tok/s in any latency-sensitive application.
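The gap compounds in agentic pipelines, where model calls run one after another. A rough model, assuming each step pays a fixed time-to-first-token and then generates at a steady rate; the step count, tokens per step, and the 0.5-second time-to-first-token are illustrative assumptions, not measurements.

```python
def pipeline_seconds(steps: int, tokens_per_step: int,
                     tokens_per_second: float,
                     ttft_seconds: float = 0.5) -> float:
    """Wall-clock time for sequential model calls.

    Each step costs time-to-first-token plus generation time.
    """
    return steps * (ttft_seconds + tokens_per_step / tokens_per_second)


# Same hypothetical 8-step task at the two throughputs from the text.
slow = pipeline_seconds(steps=8, tokens_per_step=300, tokens_per_second=40)
fast = pipeline_seconds(steps=8, tokens_per_step=300, tokens_per_second=120)
print(f"40 tok/s:  {slow:.1f} s")
print(f"120 tok/s: {fast:.1f} s")
```

Under these assumptions the slower model takes about a minute for a task the faster one finishes in under half that, before any difference in answer quality is considered.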

The Week’s Throughline

The story across all of these data points is the same: the AI layer is moving from research artifact to infrastructure primitive, and all the classic infrastructure problems are showing up on schedule — supply chain security, vendor concentration risk, adversarial inputs at the application layer, and the ongoing tension between openness and control.

Google is closing the Gemini CLI stack. Anthropic is signing billion-dollar compute contracts with SpaceX. Malware is turning up in training libraries. These aren’t isolated news items; they’re the predictable growing pains of a technology moving from lab to production at massive scale.

The developers who will navigate this well are the ones treating AI components with the same rigour they’d apply to any other critical dependency: threat model it, pin it, monitor it, and have a fallback.
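As one concrete reading of "have a fallback", here is a minimal sketch of trying model providers in order and falling through on failure. The provider functions are stand-ins we made up for the example; real client calls, retries, and timeouts would replace them.

```python
from typing import Callable, List, Sequence, Tuple

Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, response)."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose in a sketch; narrow in real code
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for real API clients.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary unavailable")


def steady_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


used, text = complete_with_fallback(
    "ping", [("primary", flaky_primary), ("secondary", steady_secondary)]
)
print(used, "->", text)
```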

Sources: Simon Willison’s Weblog, Hacker News, Daring Fireball, SpaceX S-1 (SEC EDGAR)
