Newsletter Archive

Thursday, April 16, 2026

AI Pulse Daily
Accelerating decode-heavy LLM inference with speculative decoding on AWS Trainium and vLLM

AWS demonstrates that speculative decoding on Trainium chips with vLLM can accelerate LLM token generation by 3x for decode-heavy workloads, cutting the inference cost per token. The post includes benchmarks with Qwen3, tuning guidance, and reproducible steps for engineers building generative AI apps.

Amazon AWS ML · 9 min read · Tools

Meta and Broadcom Agree to Mega-Deal to Co-Develop AI Chips

Meta and Broadcom are co-developing custom AI chips in a bid to break Nvidia's grip on AI compute. The deal joins a wave of major AI labs racing to build in-house silicon alternatives.

AI Business · 1 min read · Hardware

Gemini 3.1 Flash TTS

Google released Gemini 3.1 Flash TTS, a text-to-speech model controllable via detailed prompts that specify voice characteristics, accent, delivery style, and emotional tone. The prompting system is surprisingly theatrical: examples include full character briefs and director's notes that shape everything from vocal brightness to speech cadence.

Simon Willison · 3 min read · Tools

OpenAI updates its Agents SDK to help enterprises build safer, more capable agents

OpenAI expanded its Agents SDK with new safety and capability features as enterprise adoption of agentic AI accelerates. The update helps developers build more reliable autonomous systems at scale.

TechCrunch · 3 min read · Tools

microsoft/markitdown: Python tool for converting files and office documents to Markdown.

Microsoft's MarkItDown converts files and Office documents to Markdown for LLM consumption, and now ships a Model Context Protocol server for direct Claude Desktop integration. Recent updates switch to stream-based processing and optional dependency groups.

GitHub Trending · GitHub repo · Repos

Anthropic Draws Investor Offers at Over $800 Billion Value | Bloomberg Tech 4/15/2026

Anthropic rebuffed $800B+ valuation offers from investors, signaling confidence in independent growth; separately, Meta expanded its Broadcom chip partnership while ASML raised 2026 guidance on surging AI datacenter demand.

Bloomberg Tech · 1 min read · Market

The US-China AI gap closed. The responsible AI gap didn’t

Stanford's 2026 AI Index Report finds the US-China performance gap has closed: Anthropic leads by just 2.7% as of March 2026, and Chinese models have traded top positions since early 2025. The more urgent problem is that AI safety evaluation lags far behind model capabilities.

AI News · 6 min read · Research

[AINews] RIP Pull Requests (2005-2026)

GitHub now lets users disable pull requests for the first time, reflecting a deeper shift: AI-powered code generation is making traditional PR-based workflows obsolete. Developers are moving toward prompt-based contributions and reputation systems instead, raising the question of whether Git itself survives in an agent-driven future.

Latent Space · 7 min read · Community

Quick Hits
Anthropic's AI downgrade stings power users · Axios · Industry
OpenAI Agents SDK improves governance with sandbox execution · AI News · Tools
Google Told to Share Search Data With AI Rivals in EU Proposal · Bloomberg Tech · Policy
Why having “humans in the loop” in an AI war is an illusion · MIT Tech Review · Policy
TSMC Profit Surges 58% on AI Chip Boom, Beating Estimates · TechBuzz · Market
