CNCF 10 Years: CPU Native vs GPU Native
MAKE CLOUD NATIVE UBIQUITOUS
CNCF (Cloud Native Computing Foundation) turns 10. I published a post when Kubernetes turned 10 in 2024, and the sentiment here is the same. It’s been a magical journey to be part of these programs for half of their ten-year history. I am honored and humbled for the opportunity to serve in various roles, such as Local Community Lead, Ambassador, Program Committee Member, and Speaker.
Motivated by Janet Kuo's presentation, 'Kubernetes at 10: A Decade of Community-Powered Innovation,' from the KuberTENes Birthday Bash at Google's Mountain View Bay View office, along with the event’s T-shirt design, here is a list of my ten KubeCons:
Named a CNCF Ambassador

(KubeCon EU 2023 Amsterdam, Netherlands. Keukenhof, known as the Garden of Europe, one of the world's largest flower gardens)
I was first named a CNCF Ambassador in late 2022, with the announcement made public during KubeCon Europe 2023 in Amsterdam. I remember that at the time, only three Ambassadors were selected from Amazon; one of them was my mentor and colleague, a Principal Engineer in the same VP organization, and I was the only one based in Seattle. After completing my one-year term, I was reappointed as a CNCF Ambassador for another two-year term in 2024. It’s a privilege to be recognized and to continue being part of the global Cloud Native community, alongside 154 fellow Ambassadors from 37 countries and 124 companies. This journey has been both unforgettable and deeply meaningful to me.
CPU vs GPU
LLMs have ushered in a new era for GPU-based computing, powering both model training and inference, and paving the way for agentic AI. It feels like the CPU era faded almost overnight. I’ve worked on both sides, as a founding engineer of such products at Amazon: on the GPU side, Bedrock; and on the CPU side, quite a few, including App Runner, ECS/Fargate, Lambda, and Elastic Beanstalk.
A Few Differences Between GPUs/Other Accelerators and CPUs
Peter DeSantis gave an excellent keynote at re:Invent 2024 that highlighted the diverse challenges of AI workloads. "One of the cool things about AI workloads is that they present a new opportunity for our teams to invent in entirely different ways," Peter said.
Kubernetes Community Movement
I had expected Kubernetes to move faster in this space. GPT-3.5 was introduced in late 2022, yet it wasn’t until mid-2024 that the community launched two relevant working groups, WG Serving and WG Accelerator Management, to address and enhance serving workloads on Kubernetes, specifically hardware-accelerated AI/ML inference.
Google Cloud Run on GPU
I have to admit, Google Cloud Run made the right move. As a previous builder of AWS App Runner, a product positioned similarly to Cloud Run, I'm excited to see Cloud Run now running on GPUs, as announced at Google Cloud Next 2025. Serverless GPU support is a big deal: it enables Cloud Run to handle large models and opens the door to emerging opportunities in agentic AI, another big deal.
Key Primitives (or Building Blocks)
The fundamentals of serverless remain unchanged. CNCF turns 10 now; Kubernetes, Lambda, ECS, and Alexa turned 10 last year. Bedrock and Claude turn 2. Some say Bedrock is the "Lambda of LLMs." I say it is more than that. As I put it in my post for Serverless's 10-year anniversary, serverless continues to play a key role in the LLM world, handling the heavy lifting and delivering real AI/ML value to customers. This principle has held true since before the 'Attention is All You Need' era.
What Comes Next
Firecracker seems to be back on the stage. Inspired by Jeff Barr’s post, it’s clear that Firecracker lightweight VMs are becoming an enabling option for AI coding assistants and more: agentic AI, allowing users to speed up development and deployment while running code in protected sandboxes. Companies like E2B are also embracing this approach, providing safe environments for running AI-generated code. As a previous builder of Fargate on Firecracker, I’m excited to see this happening.
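To make the sandbox idea concrete: Firecracker can boot a microVM from a single JSON spec passed via its `--config-file` flag, which is part of what makes it attractive for spinning up short-lived, isolated environments for untrusted (e.g., AI-generated) code. A minimal sketch follows; the kernel and rootfs paths are hypothetical placeholders you would replace with your own guest kernel and filesystem image.

```json
{
  "boot-source": {
    "kernel_image_path": "vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "rootfs.ext4",
      "is_root_device": true,
      "is_read_only": false
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 128
  }
}
```

With a config like this, `firecracker --no-api --config-file vm_config.json` boots the sandbox directly; the tiny vCPU/memory footprint is what lets platforms launch (and throw away) many such microVMs per host in milliseconds-to-seconds rather than minutes.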