The AI Landscape, April 2026: What You Missed While You Were Away
Three vendors now tie at the frontier. Agents have replaced chatbots. Here's the field as it stands in mid-April 2026.
You stepped away for spring break. Maybe a week. Maybe ten days. You come back to three different vendors claiming the top benchmark spot, a new Anthropic model, Google shipping open models that run on a laptop, and an open-source personal AI assistant with 359,000 GitHub stars. The field didn’t wait.
This is the AI landscape as it stands on April 17, 2026—a catalog of what’s available, who makes it, and what it costs. Consider it a reference point. The next time you look away, at least you’ll know what you’re looking away from.
The Three-Way Tie at the Frontier
For eighteen months, someone always held the lead. GPT-4 gave way to Claude 3. Claude 3 gave way to Gemini 2. Gemini 2 gave way to GPT-5. The benchmarks moved, but there was always a leader.
Not anymore. According to Artificial Analysis—the independent benchmarking platform that tracks 476 models across intelligence, speed, and price—three models now share the top position on the Intelligence Index v4.0:
- Claude Opus 4.7 (Anthropic): Score 57
- Gemini 3.1 Pro Preview (Google): Score 57
- GPT-5.4 (OpenAI): Score 57
The Index incorporates ten evaluations including GDPval-AA, Terminal-Bench Hard, SciCode, Humanity’s Last Exam, and GPQA Diamond. It’s the closest thing to a neutral yardstick the industry has. And for the first time, three companies are standing on the same rung.
Below the tie, the spread is instructive. Meta’s Muse Spark and Claude Sonnet 4.6 sit at 52. Zhipu’s GLM-5.1 reaches 51. xAI’s Grok 4.20 hits 49. Gemini 3 Flash lands at 46. DeepSeek V3.2 scores 42. The gap between the frontier and the rest has narrowed, but it hasn’t closed.
Anthropic: From Models to Ecosystem
Anthropic shipped Claude Opus 4.7 yesterday—April 16, 2026. The pricing held at $5 input / $25 output per million tokens, unchanged from Opus 4.6. The capability did not hold; it moved.
Claude Opus 4.7 is Anthropic’s most capable generally available model. The company describes it as a “notable improvement on Opus 4.6 in advanced software engineering,” with particular gains on the hardest tasks. On CursorBench, Opus 4.7 scored 70% versus Opus 4.6 at 58%—a 12-point jump, according to Cursor CEO Michael Truell. On Rakuten-SWE-Bench, Opus 4.7 resolves three times more production tasks than its predecessor.
“Claude Opus 4.7 takes long-horizon autonomy to a new level in Devin. It works coherently for hours, pushes through hard problems rather than giving up, and unlocks a class of deep investigation work we couldn’t reliably run before.” — Scott Wu, CEO, Cognition
The model introduces a new “xhigh” effort level for extended reasoning and substantially better vision processing with higher resolution. Context window: 1 million tokens. Maximum output: 128,000 tokens. Available via Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.
Claude Mythos Preview sits above Opus 4.7 in capability but below it in availability. It’s offered only through Project Glasswing—Anthropic’s defensive cybersecurity initiative announced April 7, 2026. Partners include AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. There is no self-serve access. Mythos is for finding vulnerabilities before attackers do; Anthropic doesn’t want it finding vulnerabilities for attackers.
Claude Sonnet 4.6 remains the workhorse tier: $3/$15 per million tokens, 1M context, 64K max output. It supports extended thinking—the explicit chain-of-thought mode introduced in late 2025. For tasks that need reasoning but not Opus-level capability, Sonnet remains the default recommendation.
Claude Haiku 4.5 is the speed tier: $1/$5 per million tokens, 200K context, 64K max output. It’s the model for high-volume, lower-complexity work—classification, routing, summarization at scale.
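Those per-token prices compound quickly at scale. Here is a quick sketch, using only the list prices quoted above, of what a month of traffic costs at each tier; the token volumes are illustrative, not drawn from any vendor data:

```python
# Monthly cost for a hypothetical workload across Claude tiers,
# using the list prices quoted above (USD per million tokens).
PRICES = {
    "opus-4.7":   {"input": 5.0, "output": 25.0},
    "sonnet-4.6": {"input": 3.0, "output": 15.0},
    "haiku-4.5":  {"input": 1.0, "output": 5.0},
}

def monthly_cost(tier: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a month of traffic, given millions of tokens in and out."""
    p = PRICES[tier]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Illustrative workload: 200M input tokens, 40M output tokens per month.
for tier in PRICES:
    print(f"{tier}: ${monthly_cost(tier, 200, 40):,.0f}")
```

At that volume the spread is 5x end to end: $2,000/month on Opus, $1,200 on Sonnet, $400 on Haiku, which is why routing low-complexity traffic to the cheaper tiers matters.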
Beyond models, Anthropic now ships a product layer:
- Claude Code: Coding assistant integrated into terminal and IDE workflows
- Claude Code Enterprise: Organizational version with admin controls
- Claude Code Security: Vulnerability detection, presumably connected to Glasswing
- Claude Cowork: The agent that runs alongside you—persistent, context-aware, task-oriented
- Claude for Chrome / Slack / Excel / PowerPoint / Word: Integrations into existing workflows
- Claude Design: Announced today, April 17, 2026; an Anthropic Labs product for visual work

- Skills and Marketplace: User-created capabilities that extend Claude’s reach
The Pro tier runs $17/month (billed annually) or $20/month month-to-month and includes Claude Code, Cowork, Research mode, and the Office integrations in beta. The Max tier starts at $100/month, offering 5x or 20x more usage and early access to advanced features.
Google: Gemini, Gemma, and Antigravity
Google’s frontier entry is Gemini 3.1 Pro Preview, tied for the top benchmark spot at 57 on the Intelligence Index. It generates 130 tokens per second, well ahead of Opus 4.7’s 52. The Preview designation suggests general availability is imminent.
Gemini 3 Flash scores 46 on the Intelligence Index at $1.10 per million tokens—approximately one-ninth the cost of Opus 4.7. For applications where the top benchmark point isn’t necessary, Flash offers a compelling trade-off.
On the open-weight side, Google announced Gemma 4 in April 2026 with the tagline “Byte for byte, the most capable open models.” The family is purpose-built for advanced reasoning and agentic workflows:
- Gemma 4 E2B / E4B: Tiny variants for mobile and IoT
- Gemma 4 26B / 31B: Desktop-class variants that run on personal computers with 18+ GB RAM
The Gemma ecosystem includes specialized variants: T5Gemma, MedGemma (1.5 4B launched January 2026), ShieldGemma 2, TranslateGemma (55 languages), FunctionGemma, VaultGemma, and EmbeddingGemma. The family is differentiated by function as much as by size, much as Claude’s model tiers are differentiated by capability.
Antigravity is Google’s agentic development platform, the counterpart to Anthropic’s Claude Code ecosystem. Documentation calls it “our agentic development platform,” but detailed feature lists weren’t available at press time.
Other Google AI products worth noting: Nano Banana (image editing), Veo (video generation), Imagen (image generation), Lyria (music), Genie 3 (world models), and Gemini Robotics. The Gemini app now runs on Mac and can generate “interactive simulations and models”—moving beyond chat toward creation.
NotebookLM continues to ship updates. Elsewhere, Learn Mode launched as a “personal coding tutor” in Google Colab, and a prepay option for the Gemini API arrived for teams that want predictable costs.
OpenAI: The View from 403
OpenAI’s primary web properties returned 403 errors during research for this piece, limiting what we could independently verify. The pitch document provides the following, which we note with lower confidence:
GPT-5.4 (xhigh) ties for the top benchmark position at 57 on the Intelligence Index. The “xhigh” designation appears to reference an effort or compute tier similar to Anthropic’s new xhigh option for Opus 4.7. Beyond the benchmark tie, specifics on pricing, context window, and availability were not independently confirmed.
The pitch document references Codex (coding), Atlas (agent orchestration), and a Superapp (unified interface) as OpenAI products, but these could not be verified against primary sources.
Meta: Muse Spark and Open Models
Meta announced Muse Spark on April 8, 2026, with the headline “Scaling Towards Personal Superintelligence.” The model scores 52 on the Intelligence Index—below the three-way frontier tie but above the mid-tier.
Meta’s strategy remains open-weight distribution. While commercial providers charge per token, Meta releases models that others can run—and train on. SAM 3.1 launched March 27, 2026, with multiplexing and global reasoning capabilities. SAM Audio, released December 2025, was the first unified multimodal model for audio separation.
On the infrastructure side, Meta announced on March 11, 2026, a roadmap of four MTIA (Meta Training and Inference Accelerator) chips over two years, a vertical integration play to reduce dependence on NVIDIA.
xAI: Grok at the Frontier’s Edge
Grok 4.20 0309 v2 scores 49 on the Intelligence Index and runs at 166 tokens per second, faster than either of the tied-for-first models with published speeds. Grok 4.1 Fast offers a 2 million token context window, the second-largest tracked by Artificial Analysis.
The pitch document describes multi-agent tiers (4-agent vs. 16-agent configurations) and deep X/Twitter integration, but x.ai returned 403 errors during research. We cannot independently verify the architecture claims.
DeepSeek: Frontier-Adjacent at a Fraction of the Price
DeepSeek V3.2 officially released with strengthened agent capabilities and integrated reasoning. The model scores 42 on the Intelligence Index—fifteen points below the frontier but at approximately $0.30 per million tokens (blended), or roughly one-thirty-third the cost of Claude Opus 4.7.
For organizations where the last fifteen benchmark points aren’t worth thirty-three times the cost, DeepSeek represents a serious alternative. The company’s research lineup includes DeepSeek R1 (reasoning), DeepSeek Coder V2, DeepSeek VL (vision-language), and DeepSeek Math.
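The cost multiple is easy to verify from the figures cited: $10 per million tokens blended for Opus 4.7 against roughly $0.30 for DeepSeek V3.2. A small sketch, using only those published numbers:

```python
# Cost ratio and "price per intelligence point" from the figures cited above.
opus = {"index": 57, "blended_price": 10.00}     # Claude Opus 4.7
deepseek = {"index": 42, "blended_price": 0.30}  # DeepSeek V3.2

ratio = opus["blended_price"] / deepseek["blended_price"]
print(f"Opus costs {ratio:.1f}x more per token")  # ~33.3x

# Price per index point: what each benchmark point costs you per million tokens.
for name, m in [("Opus 4.7", opus), ("DeepSeek V3.2", deepseek)]:
    print(f"{name}: ${m['blended_price'] / m['index']:.4f} per point")
```

Per benchmark point, Opus runs about $0.175 per million tokens against DeepSeek’s $0.007, which is the arithmetic behind “frontier-adjacent at a fraction of the price.”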
Open Source: OpenClaw at 359K Stars
OpenClaw crossed 359,000 GitHub stars—up from 346,000 cited in earlier reports. The repository shows 73,200 forks and 31,984 commits. It describes itself as “a personal AI assistant you run on your own devices.”
The project supports 23+ messaging channels: WhatsApp, Telegram, Slack, Discord, Google Chat, Signal, iMessage, BlueBubbles, IRC, Microsoft Teams, Matrix, Feishu, LINE, Mattermost, Nextcloud Talk, Nostr, Synology Chat, Tlon, Twitch, Zalo, WeChat, QQ, and WebChat. Voice listening works on macOS, iOS, and Android. A Canvas feature enables live collaborative work.
OpenClaw runs locally. You bring your own model—whether API-connected (Claude, GPT, Gemini) or local (Gemma, Llama). An AgentSkills directory provides preconfigured capabilities. The project requires Node 24 (recommended) or Node 22.16+ and offers a Gateway daemon via launchd/systemd for persistent operation. MIT licensed.
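For the persistent Gateway mentioned above, a systemd user unit is the usual shape on Linux. The sketch below is hypothetical: the unit name, binary path, and `gateway` subcommand are assumptions for illustration, not taken from OpenClaw’s documentation.

```ini
# ~/.config/systemd/user/openclaw-gateway.service
# Hypothetical unit; the binary path and subcommand are assumptions.
[Unit]
Description=OpenClaw Gateway daemon
After=network-online.target

[Service]
# Requires Node 24 (recommended) or Node 22.16+ on PATH.
ExecStart=/usr/local/bin/openclaw gateway
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
```

Enable it with `systemctl --user enable --now openclaw-gateway`, again assuming the unit name above; on macOS the equivalent is a launchd plist, as the project’s launchd support suggests.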
At 359K stars, OpenClaw is now one of the most-starred AI projects on GitHub. For users who want control over their data and infrastructure, it’s the default starting point.
The Price-Performance Landscape
Artificial Analysis provides the clearest price comparison across the field:
| Model | Intelligence Index | Price (per MTok) | Speed (tokens/sec) |
|---|---|---|---|
| Claude Opus 4.7 | 57 | $10 (blended) | 52 |
| Gemini 3.1 Pro | 57 | — | 130 |
| GPT-5.4 | 57 | — | — |
| Muse Spark | 52 | — | — |
| Grok 4.20 | 49 | — | 166 |
| Gemini 3 Flash | 46 | $1.10 | — |
| DeepSeek V3.2 | 42 | $0.30 | — |
| gpt-oss-120B | 33 | $0.30 | 201 |
The fastest model tracked is Mercury 2 at 664 tokens per second. The largest context window is Llama 4 Scout at 10 million tokens. The cheapest is Qwen3.5 0.8B at $0.02 per million tokens.
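Where prices are published, the table supports a crude value metric: Intelligence Index points per dollar of blended price. A sketch using only the rows with listed prices:

```python
# Index points per blended dollar, for the table rows that list a price.
models = [
    ("Claude Opus 4.7", 57, 10.00),
    ("Gemini 3 Flash",  46, 1.10),
    ("DeepSeek V3.2",   42, 0.30),
    ("gpt-oss-120B",    33, 0.30),
]

# Sort by points per dollar, best value first.
ranked = sorted(models, key=lambda m: m[1] / m[2], reverse=True)
for name, index, price in ranked:
    print(f"{name}: {index / price:.1f} points per $/MTok")
```

By this metric DeepSeek V3.2 leads by a wide margin and Opus 4.7 trails badly. The metric is crude, of course: benchmark points aren’t linearly valuable, and the frontier point you can’t get elsewhere is worth more than the forty you can.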
What to Watch
The three-way tie at the top won’t last. Someone will ship a point release or a new model family and pull ahead—for a month, maybe two. The interesting question isn’t who wins the next benchmark cycle; it’s whether the benchmarks still measure what matters.
The shift from “talk to a chatbot” to “work with an agent” is complete. The products that matter now aren’t the ones with the highest benchmark scores; they’re the ones that remember what you did yesterday, integrate into your workflow, and run the task without you watching. The model is becoming infrastructure. The agent is becoming the interface.
If you looked away for spring break, you’re looking back at a field that has moved. This landscape is the map. Next time, the map will be different.