Curiosity-Driven Builder's Journey
"Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do."
— Steve Jobs
In the era of Artificial Intelligence, what remains uniquely valuable about being human? My answer is our lived experiences and our curiosity.
MTS is changing the org chart. OpenAI's Greg Brockman gave the origin story: "when starting OpenAI, we didn't want to bucket people into researchers & engineers. Alan Kay advised that they used 'Member of Technical Staff' at Xerox PARC, we loved & adopted it."
AI is reshaping the industry at an unprecedented pace. Many of the vectors — level, job family, tenure, knowledge, and skill sets — no longer matter the way they used to. Before that fully arrives, I want to set down what this decade on the corporate ladder actually taught me.
I recently reflected on my decade at Amazon. Whenever people learn I've stayed at one company for ten years, two questions always follow. The first: "What's it actually like?" The second, usually prefaced with a polite cough: "No offense — but why have you changed so many domains and teams?"
My half-joking answer is that my work density has simply run higher than my cohort's: I've spent the whole decade in startup mode, continuously shipping 0→1 products as an intrapreneur (the body of work tells that story better than I can). Five teams, five domains — but one throughline.
So this is an honest retrace of each team change — not a highlight reel, but a look at how the decisions were really made: what I weighed, what I got right, and what I only understood much later. The lens I keep coming back to, and the one I most want to share, is this:
| Dimension | The question I ask |
|---|---|
| Domain | Is this a domain I'm genuinely interested in — is the interest strong enough? |
| Motivation / Productivity | Does this work motivate me, and will I be fully productive doing it? |
| Learning curve | How much of this is genuinely new to me? Will I be a beginner again? |
| Scope | How large is the surface area I'd own — a component, a system, a business? |
| Leverage | Does my work compound? Who else benefits when I ship? |
| Impact | Does the work move a real needle — for customers, the business, or the field? |
| Environment | Who would I be surrounded by — the managers and colleagues? |
| Trust | How do I build trust with senior leaders, colleagues, direct reports, and cross-functional partners? |
| Research vs engineering | Am I discovering what's possible, or hardening what's known? |
| IC vs manager | Do I want to scale through my own hands, or through other people? |
| Short vs long term | Is this an optimization for now, or a bet on who I want to be in ten years? |
The compass underneath all of them: curiosity and interest.
No single move scores well on every axis. The interesting part is which ones you're willing to trade — and that changes as you grow. This first chapter is where I learned the lens existed at all.
A Snapshot of My Work at Amazon
The journey so far, at a glance — each row is a chapter in this series:
| Years | Domain | Team & Org | Role | Signature work |
|---|---|---|---|---|
| 2016–2019 | Fintech | AWS Payments · Commerce Platform | SDE I → SDE II | 10x scaling, GDPR, EMEA; zero-downtime relational→NoSQL migration (~$8.5M/yr saved) |
| 2019–2020 | Conversational AI | Alexa AI · Health & Wellness | L5 ML Engineer | Founding engineer; voice medication management (Nov 2019) + HIPAA skills (2020) |
| 2020–2023 | Virtualization & Cloud Native | AWS App Runner / Fargate · Containers | SDE II → SDE III | Founding engineer & Uber Tech Lead (UTL); App Runner 0→1 launch (2021); observability/tracing launch (2022) |
| 2023 | Foundational Data | Amazon S3 · AWS Storage | L6 / SDE III | Learning to operate inside a hyperscale, mature org |
| 2023–present | GenAI | Amazon Bedrock & SageMaker AI | L6 / ML Research Engineer | Founding engineer; Claude Day-0 launches, Trainium; model optimization (speculative decoding), customization, evaluation & benchmark |
Fintech — AWS Payments
My journey timeline ('16–'19): each chapter adds a domain; the hatched stretch is the future, not yet lived.
Technology cycles I've ridden — Chapter 1 sits on the maturing cloud-computing wave.
Day 1: AWS Payments
I joined Amazon on March 14, 2016 — Pi Day, and, fittingly, the day AWS turned ten. Fresh out of school, I was assigned to my first team: AWS Payments, part of the broader Commerce Platform organization.
If you've never thought about how AWS actually charges its customers, that machinery is exactly what Commerce Platform builds. Put simply: it is the platform that bills AWS customers, anchored by a monthly bill run that has to be correct, on time, and at enormous scale. It was a very typical internal platform team — unglamorous from the outside, deeply load-bearing from the inside. Mistakes here are measured in real dollars on real invoices. I came in as a software engineer (SDE I), and I stayed for three years.
If you're curious what the org is really about, VP James Greenfield gave a great public overview of the AWS Commerce Platform in a 2022 Screaming in the Cloud interview.
On the lens above, Payments scored high on scope and leverage in a way I didn't appreciate at first. The learning curve wasn't a new domain so much as a new bar: correctness, idempotency, reconciliation, the discipline of systems where "mostly right" is simply wrong. That bar shaped how I build to this day.
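To make that bar concrete, here is a minimal sketch of what idempotency and reconciliation can look like in an invoicing path. It is an illustration only, not the Commerce Platform design: `InvoiceStore`, `create_invoice`, `reconcile`, and the key format are hypothetical names, and the in-memory dictionary stands in for a durable store.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class InvoiceStore:
    """Toy in-memory store (hypothetical); a real system needs durable storage."""
    _by_key: dict = field(default_factory=dict)

    def create_invoice(self, idempotency_key: str, account: str, amount: Decimal) -> dict:
        # A retry with the same key returns the original invoice instead of
        # creating (and billing) a second one.
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            if (existing["account"], existing["amount"]) != (account, amount):
                raise ValueError("idempotency key reused with different parameters")
            return existing
        invoice = {"id": len(self._by_key) + 1, "account": account, "amount": amount}
        self._by_key[idempotency_key] = invoice
        return invoice

    def total_billed(self) -> Decimal:
        # Decimal, not float: money arithmetic has to be exact.
        return sum((inv["amount"] for inv in self._by_key.values()), Decimal("0"))


def reconcile(store: InvoiceStore, expected_total: Decimal) -> bool:
    # Reconciliation: an independently computed total must match exactly.
    return store.total_billed() == expected_total


store = InvoiceStore()
first = store.create_invoice("bill-run-2018-06:acct-1", "acct-1", Decimal("12.50"))
retry = store.create_invoice("bill-run-2018-06:acct-1", "acct-1", Decimal("12.50"))
```

The retried call returns the same invoice object, so the billed total stays at 12.50 and reconciliation against that figure passes. "Mostly right" here would mean a duplicate charge.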
Years later — after I'd left the team — I finally read the popular book Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems (DDIA). I recognized, almost page by page, that the problems it describes were exactly the ones I'd been wrestling with day to day in the Payments invoicing domain. I'd been living the textbook before I read it.
In Q2 2018 I was promoted from SDE I to SDE II — the first real validation that I could own ambiguity, not just close tickets.
The Project That Closed the Chapter
The last project I led in Payments was the one I'm still proudest of: deprecating the legacy relational database that sat under all of Commerce Platform's business, and migrating its data — with zero downtime — onto a non-relational store, Amazon DynamoDB.
This was not a tidy project. It was a multi-year program spanning many teams across the organization, with a history of failed attempts before us. Migrating a system of record that money flows through, while it keeps serving live traffic, is the kind of work where the interesting engineering and the terrifying engineering are the same thing.
Our team became the first in the organization to move off that database. The outcomes were concrete:
- Estimated ~$8.5M/year in cost savings from tearing down the legacy relational database (MDB).
- Reduced the variance in P99 latency of invoice creation by ~10%.
Being first mattered beyond the metrics — it turned a stuck, "everyone-knows-we-should-but-no-one-has" program into something the rest of the org could follow. That is leverage: the work outlived my involvement in it.
At the end of Q1, once the launch announcement email went out, I did something that surprised even me. I told my manager I'd decided to change teams.
Why Leave on a High Note?
It's a fair question — why walk away right after the biggest win? But that's exactly the logic of the curiosity compass. I had just spent three years going from beginner to someone who could lead a flagship migration. The learning curve had flattened. I could see the shape of the next three years in Payments, and while it would be valuable, it would not make me a beginner again. I wanted to be a beginner again.
This was my first time changing teams, and I didn't fully know how to do it. So I did it the way an engineer does: I ran the experiment. As an SDE II, I interviewed with several internal teams and was fortunate to receive three offers: an IoT team, ECS, and Alexa AI Health & Wellness.
Three offers, narrowing to one real fork — fundamentals vs. frontier.
The Real Choice: ECS vs. Alexa
IoT was interesting, but the decision quickly narrowed to two very different futures.
ECS pulled hard on the part of me that loves true fundamentals. It offered a path into low-level virtualization and the chance to contribute to open-source projects in the container ecosystem. On the lens, it scored beautifully on learning curve (bottom-of-the-stack systems I'd never built) and on community (open source as leverage that compounds far beyond one company). It was the "deepen the foundations" bet.
Alexa AI Health & Wellness pulled on something else entirely. It was a greenfield domain and team. When I talked to the hiring managers and engineers, what struck me was how genuinely excited they were about what they were going to build. It sat outside AWS — a chance to explore a completely new space — and it was riding the most exciting technology wave of the moment: NLU, applied AI, and the kinds of consumer applications that felt like science fiction becoming product. It was the "explore a new frontier" bet.
Fundamentals vs. frontier. Depth vs. breadth. Hardening what's known vs. discovering what's possible. The lens didn't hand me an answer — it sharpened the question.
So I did what I'd recommend to anyone facing a fork like this: I talked to tenured people who had seen many such forks, and I listened.
In the end, I chose Alexa AI — Health & Wellness. I wanted to step outside AWS and explore a brand-new space, and I was genuinely energized by NLU, AI applications, and the products that wave was about to make possible.
But here's the part I want to be honest about: choosing one path didn't extinguish the other. I kept my curiosity for open-source contribution and for the low-level virtualization fundamentals alive. I didn't know it yet, but that flame would shape later chapters of this very story. A choice is rarely a door closing — more often it's a door you leave ajar.
Lessons From the First Fork
- Leave when the curve flattens, not when the work runs out. There is always more work; there isn't always more learning.
- Finish before you leave. Ship the thing, send the email, then change teams. Endings earn you your next beginning.
- Talk to people who've walked the fork. Tenure is a cheap way to borrow hindsight.
- A decision isn't a deletion. The interests you don't choose can wait for you.
From Fintech's iron discipline to the open frontier of conversational AI — the largest learning curve I could find, on purpose.
Conversational AI — Alexa Health & Wellness
Journey timeline ('16–'19, '19–'20), plus the NLP / conversational-AI wave, cresting around 2019–2020, right as I joined Alexa.
Zero to One, Inside a Giant
Alexa Health & Wellness was my first time building a brand-new product and carrying it from zero to one. It was a small, neat, startup-like team living inside one of the largest companies on Earth — which is its own peculiar gift: the energy of a garage with the reach of a global platform.
The team had formed quietly in 2018 within the Alexa organization (under a group called Alexa Domains), with a mandate to make the voice assistant genuinely useful in healthcare (diabetes management, care for new mothers and infants, and aging), all of it under the constraints of HIPAA (CNBC, May 2018). Rachel Jiang led the org. Its founding cast included Missy Krasner, who had built the healthcare business at Box and earlier worked on Google's first foray into health records, and a sharp product manager, Yvonne Chou, who would become a close colleague.
I joined as an L5 machine learning engineer, working shoulder to shoulder with L7 product leaders and closely with the applied scientist team. This was my first real taste of the research-meets-engineering seam: scientists probing what natural language understanding could do, engineers like me hardening those ideas into something a regulated, real-world product could ship. I'd later write a whole separate post on that tension — here, it was simply daily life.
I stayed about 1.5 years, and in that window we shipped two public releases: voice medication management (November 2019) and HIPAA skills (2020).
My piece was the multi-turn speechlet for medication management, built on NLU — the back-and-forth logic that lets a person actually converse with Alexa about their medications rather than bark single commands. We partnered with a pharmacy and a hospital. The work was confidential enough that our office floor required special badge access — you knew the project mattered when the building itself treated it like a secret.
When Your Code Talks Back
There is a specific kind of joy in this work that I haven't quite found anywhere else: you connect an Echo device, run your build, and then your own code talks to you — answers, asks, clarifies — while you test and debug. Debugging a conversation is a strange and wonderful thing. The system stops feeling like software and starts feeling like a presence in the room.
In November 2019 we launched to the public — the same month Alexa itself celebrated its fifth birthday (November 6, 2019). The work was highlighted in Amazon's earnings results and picked up by media like Forbes and CNBC. It was a genuinely fulfilling stretch.
The most touching proof that any of it mattered came much later. In December 2020, long after I'd moved on, I was messaging with my former colleague — PM Yvonne Chou, by then a Director of Product and later Chief of Staff at AI2. She wrote (original, unedited):
"I am in Los Angeles for the holidays and my brother got my parents a new Echo Show earlier this month. When I got to my parents house, I noticed my brother was logging dirty diaper changes and my dad had medication reminders set up all on his own :) they obviously wasn't our MM ones, but it was fun to see them use the features the team built and I didn't even have to tell them about it.
Happy new year!
-Yvonne"
Strangers, in their own homes, quietly using the category of thing you helped invent — without anyone telling them to. That is the whole job, really.
Riding the Technology Cycle
Looking back, the real lesson of this chapter isn't about healthcare at all — it's about where you stand on the technology cycle. Every transformative capability arrives in waves. The dream of a machine you can simply talk to is decades old; what keeps changing is the engine underneath it. In 2019, that engine was still intent-and-slot NLU, and we were doing the heavy scaffolding the next wave would quietly abstract away.
The conversational-AI cycle. I shipped on the second rung; LLMs made the same experience feel effortless two rungs later.
Being early on the curve feels uncomfortably like being wrong: the product is harder to build, the experience is rougher, adoption lags. But early work is rarely wasted — it teaches you the problem so deeply that when the next wave lands, you already understand exactly what it's good for. The conversational-AI dream didn't fail. It was simply early, waiting for the LLM wave to make it effortless.
Why the Frontier Cooled
And yet. This was the pre-LLM world, and the gap between the dream and the technology was real. NLU back then still leaned on slots and rule-based design — the scaffolding of traditional conversational AI. Supporting genuinely context-heavy, multi-turn conversation took enormous engineering effort, and even then the experience wasn't as smooth or as forgiving as what today's LLM assistants make look effortless. Product adoption and revenue were, honestly, not yet ideal.
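For readers who never built on slot-based NLU, the following toy sketch shows the general shape of rule-based, multi-turn slot filling. It is not Alexa's API or our production speechlet: `ReminderDialog`, `extract_slots`, the medication list, and the prompts are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical intent definition: the slots a reminder needs before it can be set.
REQUIRED_SLOTS = ("medication", "time")
PROMPTS = {
    "medication": "Which medication?",
    "time": "What time should I remind you?",
}
KNOWN_MEDICATIONS = ("aspirin", "metformin")
TIME_PATTERN = re.compile(r"\b(\d{1,2})\s*(am|pm)\b")


def extract_slots(utterance: str) -> dict:
    """Rule-based extraction: a keyword list plus one regex, nothing learned."""
    text = utterance.lower()
    slots = {}
    for med in KNOWN_MEDICATIONS:
        if med in text:
            slots["medication"] = med
    match = TIME_PATTERN.search(text)
    if match:
        slots["time"] = f"{match.group(1)} {match.group(2)}"
    return slots


class ReminderDialog:
    """Carries slot state across turns and asks for whatever is still missing."""

    def __init__(self) -> None:
        self.slots: dict = {}

    def handle(self, utterance: str) -> str:
        self.slots.update(extract_slots(utterance))
        for name in REQUIRED_SLOTS:
            if name not in self.slots:
                return PROMPTS[name]
        return (
            f"OK, I'll remind you to take {self.slots['medication']} "
            f"at {self.slots['time']}."
        )


dialog = ReminderDialog()
turn1 = dialog.handle("Remind me to take my medication")
turn2 = dialog.handle("Metformin")
turn3 = dialog.handle("8 am")
```

Every phrasing the rules do not anticipate ("after breakfast", a brand name missing from the list) falls through and triggers another prompt. That brittleness is where much of the engineering effort in this style of system goes.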
The wider weather was turning too. Big Tech's grand healthcare bets were cooling: Haven — the much-watched alliance of Amazon, Berkshire Hathaway, and JPMorgan, given its name in 2019 — disbanded in 2021, the same year Google dismantled its own health division. Closer to home, leaders began to depart — Missy Krasner left Amazon in October 2020 for a venture firm. I read the signal: my time in this particular space was coming to an end.
So I reached back for the interest I'd deliberately left ajar a year and a half earlier — virtualization and the fundamentals — and made it my next stop. The door I hadn't closed turned out to be exactly the one I walked through next.
Two Tickets That Redirected a Career
Worth noting — in 2019 Alexa AI was generous enough to sponsor me to attend two conferences, and neither of us could have known how much those tickets would bend the next few years of my career:
- KubeCon + CloudNativeCon 2019, San Diego — my first in-person immersion in containers and the cloud-native world. I felt the sheer impact and the raw vitality of the cloud-native community up close, and something in me recognized it as home.
- NeurIPS 2019, Vancouver (YVR) — near the end of the year, a chance to stand among academic researchers and feel, directly, where AI was heading next.
I was in Vancouver partly for a more mundane reason — H1B visa stamping — and I spent the trip happily planning my 2020 travels. Such is life: about two months later, the global pandemic began, and every one of those plans dissolved. But the two seeds those conferences planted — cloud native, and the pulse of AI research — would germinate anyway.
Lessons From the Second Fork
- Zero-to-one teaches you what no mature system can. Building the first version of something forces every assumption into the open.
- Internal platform → public product. Shifting from an internal platform team to a public product team rewired how I think about users — from SLAs and invoices to delight, trust, and a name customers actually know.
- Be early, but read the weather. Being right too soon — before the technology catches up — feels a lot like being wrong. Knowing when to stay and when to move is its own skill.
- Say yes to the conference, the talk, the detour. The highest-leverage moments rarely announce themselves; two sponsored trips reset my trajectory.
- The interest you kept alive becomes the path you take next. Curiosity compounds quietly, then pays out all at once.
Two chapters in, the pattern was clear: each move traded comfort for a steeper learning curve, and each time, a curiosity I'd refused to abandon lit the way to the next door.
Virtualization & Cloud Native — AWS App Runner & Fargate
Journey timeline ('16–'19, '19–'20, '20–'23), plus the cloud-native wave: Kubernetes-era infrastructure, my containers years.
Choosing Essential Infrastructure
With the conversational-AI chapter closing, I finally followed the interest I'd kept alive since the very first fork: virtualization. But I was also deliberate about where. I wanted to land somewhere that was essential, must-have infrastructure — with stable profitability and strong cash flow. In any climate, the unglamorous-but-indispensable layer is the safest place to do ambitious work.
That pointed me straight at containers. I reached out to a manager I'd known, Mats Lanner, who led AWS Fargate — a flagship product of AWS, launched at re:Invent on November 29, 2017. Fargate was on a strong growth trajectory as a serverless container compute platform.
When I reconnected with Mats around July 2020, his reply was more exciting than I'd expected — it was the first time I heard about project "Fusion," the code name for what would become AWS App Runner:
"It's all goodness for us, as Elastic Beanstalk has joined our org and we are doing an exciting project with them this year. This is a S-team goal for us, targeting a re:Invent launch and we think it will change how customers think about running applications on AWS. The project is code named Fusion and builds on top of Fargate..
Sound interesting?"
He also recommended I talk to Archana Srikanta, a senior engineering leader and one of Fargate's founding engineers — someone I already admired (she'd be promoted to Principal Engineer in 2021). I'd watched her YouTube talks on Fargate before I ever joined. We spoke over Chime; she sketched the project at a high level, and we ended up geeking out about Rust. All of it sounded cool. So I joined.
Joining the Away Team: Project Fusion
I came on as a direct report to Mats (my L7 manager) on the cross-organization project "Fusion" — later launched publicly as AWS App Runner.
A lot of people worry about being assigned as an "away team" member embedded in another org — especially on day one. As a junior engineer, I felt that same uncertainty. But that worry was slowly replaced by new knowledge, new technology, new discussions. Doing the task in front of you well is the one certain path to dissolving uncertainty. In the end, the away team turned out to be one of the most rewarding bets of my career.
I mainly designed and authored the dataplane, partnering with Amit Gupta, a strong senior engineer from Elastic Beanstalk (he'd go on to make Principal Engineer in Q2 2022). Together we worked through a stack of hard problems:
- Single-digit-second auto scaling
- Multi-tenant request routing
- Cellular architecture
- Capacity management
- Load balancing
- Networking
App Runner also put me in close, daily orbit of three Principal Engineers — Archana Srikanta, Onur Filiz (later a Principal Software Engineer at Microsoft), and Amit — a technical bar higher than any I'd worked under before, and one that pulled my own engineering up fast.
The Containers org was also relatively small compared to giants like Devices or Commerce Platform. For a long time I heard it simply didn't hire at L4 — a deliberate way to keep the bar high, since we were genuinely working on low-level systems.
Along the way I got to explore a lot of new techniques and open-source projects — Kafka, Envoy Proxy, Protobuf, gRPC — tech stacks that weren't yet widely used across Amazon; the Envoy work also had me writing C++. (I'd go on to start the first Envoy Proxy community inside Amazon.)
We launched AWS App Runner on May 18, 2021 (AWS News). It was my first time incubating and launching an AWS public product from zero to one — something worldwide users could discover and use right inside the AWS Console.
Our VP, Deepak Singh, also spoke publicly about App Runner around the launch. Deepak was a leader I followed closely during my containers years — every Friday, end of day, he sent out weekly notes with his thoughts and observations, and I genuinely enjoyed reading them.
A Fork: Stay With the Product, or Return?
As an away-team contributor, I eventually hit the obvious question: where do I belong? Mats initially assigned me to help with AWS Fargate's dataplane and Firecracker microVM work, while I kept closing out post-launch items on App Runner.
Then, in a regular 1:1, App Runner's General Manager — L7 leader Prashant Prahlad (now a VP at Datadog) — extended an offer: stay in App Runner as a founding member and core builder.
The same shape of fork as before: the new product I'd built, or the team I'd originally joined.
I chose to stay: I formally joined App Runner as a core member of the team and moved to report to Prashant.
The Path to L6: Into Observability
The unavoidable next topic was my own growth. I'd been SDE II since 2018 — nearly three years. Prashant offered me a few projects from the 2021 OP1 roadmap, and I picked a public feature release in Observability. I chose it because I wanted to ship public features; I had no idea how far that single choice would carry me into industry exposure.
The task: give App Runner real observability. We already had metrics and logging; as a one-stop container compute platform, we wanted tracing too. It was a relatively new domain for me, and I spent a lot of time more or less single-handedly on the research and initial build — OpenTelemetry, AWS X-Ray, Grafana.
It pulled me into a remarkable orbit of people: Principal Engineer Jaana Dogan (who later returned to Google as a Principal Engineer), Michael Hausenblas, and Alolita Sharma — genuine experts and leaders in the space — and deeper into the CNCF community. My mentor at the time, Phil Estes (Principal Engineer on the containerd team), knew them all — and it was Phil who later recommended me for the CNCF Ambassador community. Late at night, I worked through the PRFAQ and the customer experience with our PM Akshay Ram.
I led the feature launch in Q1 2022 — and was promoted to SDE III (L6) at the same time.
Stepping Into the Spotlight
Then something I didn't expect. Adam Keller invited me onto the AWS Containers YouTube show "Containers from the Couch" to talk about the feature, and the work was highlighted in AWS Open Source News and Updates, #113 that May, and championed publicly by our L7 product leader Shubha Rao (who later joined Google as Director of Product Management, AI/ML). It was the first time I was visible to a public audience because of a product I'd built.
A small side episode: on the day of the podcast, my laptop died. I was anxious and messaged Adam; he messaged back that he'd open the show solo for the first few minutes and could wait for me to restart. To my relief, the laptop came back, I jumped on, rescued Adam from a solo show, and started sharing what we'd built.
From there it snowballed. Public exposure brought visibility — and responsibility. I worked with our GTM, Sales, and Product teams to bring the product to customers and users, co-presented at DockerCon with Inbal Shani (now CPO at Twilio; former CPO at GitHub; our former GM at AWS), and wrote AWS blog posts. I took an internal course and earned a public-speaking certification to represent and support Amazon at top-tier external conferences and events. It felt a little like being on a press tour. I even began getting book-authoring invitations from publishers like Packt, more than once. I summarized that year's work in a Medium post, in list form. It was, without question, a major boost to my career. I gained external connections through it, too.
That community recognition culminated in something I'm still proud of: in late 2022 the Linux Foundation appointed me a CNCF Ambassador. I was invited to the New Ambassador announcement at KubeCon Amsterdam 2023, where I finally met — in person — many of the high-profile builders I'd only ever watched on YouTube and podcasts in the open-source and cloud-native world.
Overwhelmed — and a Cooling-Off Period
Visibility has a gravity of its own, and for a while I let it pull me. After my L6 promotion I became a core engineer and a cross-cutting “uber” tech lead — a force multiplier across the App Runner organization (~70 people). At the peak I was overseeing the execution of more than ten project lines as tech lead, and mid-year I also picked up direct reports — both interns and full-time engineers. I literally kept a weekly breakdown of where my hours went. The role stretched well beyond code into deep cross-functional partnership: I authored PRFAQs, worked with sales and GTM, and ran a WBR / MBR (Weekly / Monthly Business Review) with the product team. Big scope, big visibility, the energy of an early-stage startup — and, honestly, overwhelming.
Where my weeks actually went, at the peak.
My calendar was “quadruple-booked” more often than not, and bandwidth, not effort, was the real constraint. So when my new manager, Jason Woodlee, later asked whether I'd take on the Bar Raiser program to support more interviews, I said no. Protecting a reasonable time distribution had quietly become more important than adding another hat.
The org around me was shifting too. Several leaders I'd grown with departed — Inbal, Shubha, Onur, Akshay, Prashant — and a reorg landed. By then Compute Services had grown into the larger DECS org (Developers, Events, Containers, and Serverless), and App Runner was folded into AWS Lambda. We got a new L7 manager, Jason Woodlee, who'd spent years as a CTO at an East Coast startup. He brought a different culture, and it felt refreshing.
My interns accepted the return offers we'd extended, and I'd hired a new full-timer — by every external measure things were going well. But I sensed it might be time to stop. I raised the decision to leave with Jason. He wanted me to stay, and generously offered to work on my Principal Engineer path together, documentation and all. I declined.
The honest reason: the business value no longer felt proportional to the hours I was pouring in. I needed a balance, or a break. (That App Runner later settled into KTLO — keep-the-lights-on — mode only confirmed the instinct.)
A Word on Product Positioning
There's a running joke that AWS has "17 ways to run containers." It's funny because it's true — and it points at something real: finding the right position for a product matters as much as building it. App Runner, for all the engineering we were proud of, never found the adoption curve that its closest analog, Google Cloud Run, did.
That stings, because App Runner was the first AWS product I helped build from zero to one. But failure is the mother of success — you only collect the lesson if you actually believe that. Still, when I left, I was genuinely lost about where to go next.
Lessons From the Third Fork
- Bet on indispensable infrastructure. Essential, profitable, cash-generating layers are where ambitious work is safest and compounds longest.
- Away-team anxiety dissolves in the work. Uncertainty shrinks every time you ship the task in front of you.
- Adopt the tech the rest of the org hasn't yet. Envoy, gRPC, OpenTelemetry — being early inside a big company is its own kind of edge, and its own community.
- Build the product, then tell its story. Visibility followed the work, not the other way around — and the storytelling became a second skill.
- Know when to stop. Scope and visibility can outgrow the value they create; walking away at the right time is a skill of its own.
- … but don't stop too early. Honestly, I think I left App Runner a little too soon. Even as a founding member, there was still room to grow — and staying through my L6 promotion would have made building a solid L6 profile smoother. My early exit, plus an end-of-year team change, quietly dented my rating, and I didn't appreciate at the time how much that mattered. If I could choose again, I'd lean on the leverage I'd already built and let good momentum run until a real hard stop forced the move.
- Positioning is half the product. The best engineering still needs the right place in the lineup to win adoption.
Three chapters in, my career had stopped looking like a ladder and started looking like a series of deliberate forks — each a steeper climb, chosen on purpose.
Foundational Data — Amazon S3
Journey timeline ('16–'19, '19–'20, '20–'23, '23). No new curve here: S3 sits at the very foundation of the cloud-computing wave, the deepest layer of an old one.
A Mistake I Had to Make
Lost and a little depleted, I made what I now consider a career mistake — and one I'd probably make again, for what it taught me. I joined Amazon S3.
If you use AWS at all, you know S3. It was the very first AWS product, launched on Pi Day in 2006 — the same calendar date I would later join Amazon — and it is still the foundational data service of the cloud, the largest-scale data service on Earth (500T+ objects, 200M+ requests/sec, exabytes of data, 123 AZs, 39 Regions — at its 20th birthday). At the time — and still today — it is led by VP Mai-Lan Tomsen Bukovec, and long before I joined I'd read a great deal of her writing, internal and external. She is a passionate and deeply capable senior leader — and one of the most prominent female and Asian leaders at AWS.
Learning From Mai-Lan
Mai-Lan has a famously well-formed body of theory and practice for running a mature organization — including a crisp, widely-circulated definition of the roles of Principal and senior IC leaders. Where Deepak Singh's much-loved Friday notes were a stream of informal thoughts and observations, Mai-Lan's was a fully shaped system.
She hosted monthly leadership learning sessions across the S3 org and encouraged every L6+ leader to attend and discuss. Her TA would prepare each session's topic (usually one of the Leadership Principles) along with supporting docs shared ahead of a genuinely formal discussion. I attended most of them, and spoke once myself; the topic that session was “Recognizing Critical Moments.” By coincidence, one of the other speakers that day was Bryan Liles, a Senior Principal Engineer (L8) at AWS and previously a high-profile figure in the CNCF community, who had joined S3 the very same week I did. Watching a mature, VP-level leader articulate her point of view and deliberately grow younger leaders was enormously valuable to me.
The Mistake Underneath
For all that organizational polish, you still can't escape the fact that line managers vary in caliber. When I was choosing my new team and talking with the hiring manager, I failed to do the expectation-setting I should have done. I went from reporting to an L7 manager as a tech lead to a new team reporting to a junior line manager — a serious downgrade in both scope and position. At the same time, I overestimated the career maturity of that junior manager, who had just transitioned from an IC into a management role.
And I couldn't blame anyone but myself. The leverage and standing I'd had in App Runner were things I had earned there, and the move was my own decision. From the new manager's vantage point I was simply a newly promoted L6 IC — he had no reason to account for the unusual density of work and delivery I was used to. The combination was never going to fit.
A Quick Course Correction
I recognized the mismatch almost as soon as I arrived, and I moved fast. My former App Runner manager had, as it happened, moved into a different S3 division around the same time — so I moved to report to her again. That was the real starting point from which I began to understand how a hyperscale product organization actually works, in contrast to the smaller, startup-shaped world of App Runner.
Landing in S3 Index
The team I landed in was S3 Index. Our Director, Amy Therrien, laid the business out clearly in a re:Invent talk; in short, we work on reducing 503s — the slow-down and throttling errors customers can hit at scale. Worth noting: viewed as a standalone product, S3 Apex (API + Index) would rank among the top-20 revenue businesses at Amazon — and it is, of course, a tier-1 business for S3.
RocksDB, Meta, and a 10-Year-Old Database
Because of my background — and a recommendation from my manager, who knew it — I got a genuinely unique assignment: a cross-company collaboration with Meta, deep-diving the open-source LSM-tree database RocksDB (which, fittingly, turned ten that same year). We held weekly syncs with the Meta / RocksDB engineers, working in C++ — the same language as my Envoy days. Database engines weren't my home turf, but that was rather the point: it was a great thing to learn.
Our sister team in London had handed this part of the business over to Seattle, so we also met regularly with colleagues across London and Europe — a genuinely interesting cross-timezone collaboration.
One small marker of how alive this corner of the industry was: the RocksDB team's company, Rockset, was acquired by OpenAI on June 21, 2024.
2023: Reading the AI Wave
2023 turned out to be a special year. ChatGPT had landed on November 30, 2022, and the whole world began to stir. Having already lived through the conversational-AI hype once, I chose a deliberately observant, conservative stance — it might, I thought, be just another cycle peaking early, the way voice AI had.
What changed my mind was a signal from inside, not the headlines. Deepak Singh — by then VP of the broad DECS org (Developers, Events, Containers, and Serverless) — moved to report into Swami Sivasubramanian, VP of AI & Data, to take charge of AWS CodeWhisperer. (Swami did his PhD at Vrije Universiteit Amsterdam, mentored by AWS CTO Werner Vogels; one of Amazon's youngest VPs, he joined the company's S-team in September 2023.) When a leader of that scope re-points his own career at generative AI, it stops looking like hype. That was when I realized this wave was real — and that it was time to pay serious attention.
The Goal: Find a GenAI Team
From there the goal was clear: find a GenAI team to join. As always, it was a bet — and this one didn't go smoothly. Because the rise of GenAI was unplanned for everyone, the company included, Swami's org was in continuous reorg and resource-shifting mode. Most teams had frozen hiring until the dust settled, and when a team was short on people, the usual move was simply to shift engineers over from another product — early Bedrock, for instance, was staffed largely with engineers from Lex. And there was a long queue from outside the org, too: everyone wanted into this frontier domain.
For a while it felt like there was simply no way in for someone from outside that org, and I nearly gave up. But then a turning point appeared. (A rule I keep: when you're about to give up on something, give yourself three more months first.)
Three Offers
The turning point came as three offers, all in GenAI: SageMaker AI Optimization, Bedrock Agent, and Bedrock Inference.
The route there was a little unusual. I'd interviewed for Bedrock Fine-tuning and finished the informal loop, but the hiring manager came back to say the headcount had been filled. Rather than leave it there, he recommended me to two sister teams that did have open headcount — Bedrock Agent and Bedrock Inference — and because I'd already completed the informal loop, I didn't have to run it again for either.
And so the tradeoffs lined up again — three different GenAI bets, each with its own emphasis and its own hiring manager (an L6 (M2) at SageMaker AI Optimization, an L7 at Bedrock Agent, and an L6 (M1) at Bedrock Inference). After the scope lesson S3 had just taught me, reporting level was no longer an abstraction — it was one of the things I now weighed carefully.
I chose Bedrock Inference. My thinking was to enter at the bottom of the stack: start from the most foundational layer and build up. Looking back today, I'll be honest: I lacked prescience. I walked right past Bedrock Agent, which in 2025 was reorganized into Swami's new Agentic AI org, and three years on, agentic AI has become one of the hottest practices in the entire LLM field. The fundamentals were the right instinct; I just didn't see how fast the frontier above them would move.
GenAI — Amazon Bedrock & SageMaker AI
[Figure: career timeline ('16–'19, '19–'20, '20–'23, '23, '23–) against the technology cycles, now adding the LLM wave: near-zero until late 2022, then a vertical takeoff that's still climbing in 2026.]
Engineer No. 2
The Inference team was newly funded when I joined. I was the first engineer hired, alongside one L6 engineer who'd come over from Lex — which made me, in effect, engineer No. 2. It was a privileged and confidential project: building the inference engine for Anthropic's models on Bedrock.
So, once again, I was a core founding engineer building from scratch — clean code, high visibility, fast pace. Another startup, this time inside the most important business at the company. And it grew like one: from 2 people to 50+ in a single year. (I wrote up that first year in more detail here.)
Building With Anthropic
We partnered directly with Anthropic — including co-founder Ben Mann — with 2+ working sessions a week and a monthly happy hour for each new model release. That work touched a number of ideas that have since become well-known standards across today's open-source inference community — disaggregated inference, multi-node inference, context-aware routing, and prompt caching.
Day-0 Launches & Trainium
Through 2024 I led almost all of Claude's Day-0 public releases on Bedrock — including Computer Use, Anthropic's first agentic-AI feature, on Claude 3.5 v2. I also delivered the first Claude model release on AWS Trainium/Inferentia — partnering with James Bradbury (Anthropic's Head of Compute) and the Annapurna teams.
Explosive Business Growth
"SemiAnalysis believes Bedrock is a $5.5B run rate business today with the vast majority of customers (80-90%+) using Anthropic models."
— SemiAnalysis
There's a strange and wonderful feeling — after all the hard work — in watching a business you're responsible for, one of the company's core businesses, grow at breakneck speed. And with that growth, the pressure multiplies.
I also watched Anthropic itself grow up close — from a team of maybe fewer than 200 people to what it is today, its valuation climbing from roughly $18B in 2024 to $965B by mid-2026 — more than a 50× jump that made it the world's most valuable startup. When I told people at CVPR 2024 that I was working with the Claude model, I'd often get a blank look — they didn't know what it was. Today I'm pretty sure everyone knows and uses Claude, and Anthropic is a household name worldwide.
Senior Leadership, Up Close
This was also the first time I truly felt how much Amazon's most senior leadership cares about a business. We received tickets directly from the CEOs of customer companies. Every Sev-2 review pulled in at least two L10-level leaders — our VP and our Distinguished Engineer — and I worked directly with L8 Senior Principal Engineers. It was my first time collaborating that closely with leaders at that altitude. High visibility, high pressure.
Because the work was so early and so rare, very few people anywhere had real, hands-on experience with frontier-model inference. Anthropic wasn't yet a household name, and the company guarded the program tightly — for P&C and IP-protection reasons (on Anthropic's behalf), even other Bedrock teammates couldn't access our codebase or documents.
I also got the chance to know the folks at Anthropic I worked with day to day, and to build connections with others in the same domain. Over time, a number of those connections turned into genuine relationships — investors, founders, and researchers I now count as part of my network. Some of the most valuable parts of this chapter weren't in the codebase at all; they were the people the work put me next to.
Choosing to Step Sideways
After shipping my last task in Q1 2025, I knew I wanted to stop. Back from a vacation, I booked a meeting and sat down with the two Senior Principal Engineers to tell them I wanted to try a different direction. They were understanding, and offered me a few options.
At the time, Bedrock Inference was organized into three divisions: 3P (Claude and other closed-source models), 2P (open-weights models like Llama, DeepSeek, and Qwen), and 1P (Amazon's own Nova and Titan models). I moved from 3P to 2P, deep-diving the open-weights world and starting my research in model optimization (speculative decoding) and customization — the line I've been working on ever since. I would do literature reviews, paper readings, and research into advanced methodologies, and apply them on open-weights models. Around the same time I also moved my office from Seattle, WA to Bellevue, WA — a smaller, quieter town compared to the city.
Across these years I also led the benchmark methodology setup across two companies and through multiple rounds of new-model public releases. Through that work, I gained deep experience optimizing and benchmarking both open-weight models and closed-source foundation models.
Speculative Decoding and a Loop in the Timeline
I was lucky enough to be appointed co-chair of NeurIPS 2025 Mexico City — going from attendee (2019) to organizer (2025).
Another loop in this story-timeline is on speculative decoding itself. The first time I ever heard the concept of speculative decoding was at PyTorch Conference 2024 in San Francisco, in a talk by Xiaoxuan Liu. I had no idea, sitting in that audience, that a year later I would be working on this very domain deeply — that it would become the line of research I lead today.
Just two weeks ago, at MLSys 2026 in Bellevue, while browsing the posters, I came across an oral paper titled “ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems.” Because it was directly related to the speculative-decoding RL work I lead at the company, I was fascinated — I spoke with one of the authors on-site, Qiaoling Chen, to learn more about the paper. I was then surprised to find a very familiar name among the authors: Tianwei Zhang (Nanyang Technological University) — Qiaoling's advisor. Remember the Alexa AI Health & Wellness team I mentioned back in Chapter 2? Tianwei was my officemate there. This may be the most wonderful coincidence I've encountered yet. When you discover, at a conference, that the advisor of an author whose paper touches your own line of work was once your former officemate — especially after you've moved across so many different domains — everything you find suddenly feels reasonable and natural. I also met Xiaoxuan there — she received the MLSys Best Research Paper Honorable Mention for her new paper “Speculative Decoding: Performance or Illusion?”
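For readers who haven't met the idea, here is a minimal sketch of the greedy form of speculative decoding. It is an illustration under stated assumptions, not the system I work on: `target_next` and `draft_next` are hypothetical toy stand-ins for a large and a small model, and tokens are plain integers. A cheap draft model proposes a few tokens, the expensive target model checks them, and the longest agreeing prefix is kept along with one corrected (or bonus) token. In this greedy form the output matches what the target model would have produced alone. Real systems typically score all drafted positions in one batched forward pass and often use a sampling-based acceptance rule instead of exact matching.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def target_next(ctx: List[int]) -> int:
    # Hypothetical "large" model: a deterministic toy rule.
    return sum(ctx) % 7


def draft_next(ctx: List[int]) -> int:
    # Hypothetical "small" model: agrees with the target most of the time,
    # but is deliberately wrong whenever the context length is a multiple of 4.
    guess = sum(ctx) % 7
    return guess if len(ctx) % 4 else (guess + 1) % 7


def greedy_decode(model: NextToken, prompt: List[int], n: int) -> List[int]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n: int, k: int = 4) -> Tuple[List[int], int]:
    out = list(prompt)
    goal = len(prompt) + n
    passes = 0  # number of verification rounds by the target model
    while len(out) < goal:
        # 1) The draft model proposes up to k tokens, building on its own guesses.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2) The target model verifies the proposal position by position.
        passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # replace the first mismatch, drop the rest
                break
        else:
            # Every drafted token matched: take one bonus token if there is room.
            if len(out) + len(accepted) < goal:
                accepted.append(target(out + accepted))
        out.extend(accepted)
    return out, passes


tokens, passes = speculative_decode(target_next, draft_next, [1, 2, 3], 12)
```

The payoff is in `passes`: when the draft model is usually right, the target model runs far fewer verification rounds than the number of tokens generated, while the final sequence stays identical to plain greedy decoding.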
Where I Am Now
A decade, five chapters, one compass. From Fintech to conversational AI, from containers to the foundation of the cloud, and now to the frontier of generative AI — every move traded comfort for a steeper learning curve, and every time, a curiosity I refused to abandon lit the way to the next door. The decision lens got sharper; the bet never changed.
Career Priority
If I compress a decade of these decisions into one picture, a few forces explain almost every move. They aren't a single ladder — they form a flywheel, plus two independent axes (role and domain), all riding on top of the technology cycles above. And in the AI era the ground under all of them is shifting, so it's worth asking what still holds value and what doesn't.
The flywheel is the core unit: more leverage wins scope and impact, which earns visibility, rating, job security, and a higher level and role, and a bigger role hands you more leverage to spend. But notice the axle at the center: the wheel only turns on trust. Impact converts to visibility, and visibility to scope, only because people trust what you say and trust you with more. Unlike most skills, trust transfers across domains and doesn't get commoditized by AI. What gets the wheel spinning in the first place is your own drive: how much the work motivates you, and how productive you are once you're in it. Trust is the axle; motivation and productivity are the torque.
One connected engine: technology cycles (x‑axis) decide which domain is hot, and a hotter domain sits higher — more leverage, opportunities, and possibilities. Inside every domain spins the flywheel above. You climb by jumping to the next hot domain, and those jumps are powered by transferable knowledge (fundamentals, systems thinking, taste, learning velocity) — while AI‑commoditized, domain‑locked skills (rote syntax, memorized APIs, single tools, gatekeeping titles) stay behind. At any rung, your role can move three independent ways: up a level, across a stance (IC ↔ Manager), or into a different job family (increasingly one “Builder / MTS” in the AI era).
The honest tension: optimizing purely for the flywheel keeps you safe but narrow; following interest across domains widens the map but resets the loop a little each time you jump. Changing domains carries a real cost and risk — and because everything is in motion, the tradeoff never sits still; striking that dynamic balance is an art of its own. My whole career has been a bet on the domain axis and the left column above — and, as the next note shows, the flywheel mostly caught up anyway.
A Note on Promotions — and Ratings
For anyone keeping score, neither promotion came on the first try:
- SDE I → SDE II — failed once in Q1 2017, then succeeded in Q2 2018 (AWS Payments).
- SDE II → SDE III — failed once in Q4 2021, then succeeded in Q1 2022 (AWS App Runner).
And here's the part I'd tell anyone earlier on: even with two promotions, I never once received an "Exceeds" or top-tier annual rating internally. At some point you realize the rating doesn't mean all that much — the work, and what you become doing it, does.
One more wrinkle worth recording: during my promotion cycle, the company hard-required a Tech Promotion Assessment (TPA) as a supplement to the package. The norm was one TPA for an L5→L6 promo, and two TPAs for an L6→L7 promo. A TPA is essentially an independent third party who interviews the peers and managers around you and runs a qualifying review of your work. The interesting part: the assessor is normally someone at the target level. In my case, the L6 engineer who ran my TPA was later promoted to Principal Engineer — so, by luck, I effectively got my work assessed through a PE's lens.
And the funny epilogue: shortly after my promo quarter, the company dropped the TPA requirement for L6 entirely — though it's still required for L7 promos. The complaint was that it cost too much time and too many resources — a TPA typically spent months collecting data and artifacts and writing up a comprehensive report. It simply didn't scale as the company grew fast and added more people.
On the IC vs. Manager Question
Over these years, on every team I stayed with, a leader offered or invited me to move from IC to manager, at least three times in all. I even operated as a people manager for a short stretch. And yet I've still chosen the IC role, at least for now.
Early on, I was deeply impressed by Carlos Arguelles (Senior Principal SDET at Amazon) and his blog “Belonging to Amazon's Principal Engineering Community”, and by Tanya Reilly's book “The Staff Engineer's Path.” I was lucky enough to connect with both Carlos and Tanya in person. For a long time, I loved the picture they painted of the IC path. And at Amazon, there are far fewer Principal Engineers than people managers at the same level, which made the senior-IC track feel both rarer and more compelling. So I turned down most of the offers to become a manager.
That said, I did take the internal I2M (IC-to-Manager) courses. They gave me a lot of perspective — perspective that's genuinely helped me operate as a senior IC across the company. I still keep an open mind about both the IC and manager paths.
And with the current wave of technical innovation, I'm watching organizational structures change dramatically. I don't have an answer about whether IC or manager is the better path right now — but I'll keep observing.
Farewell Notes
A small personal archive — the goodbye note I sent each team as I moved on.
AWS Payments
Subject: Farewell CP & AWS!
Dear Bill Runners,
What a long time! It have been 3 years. As part of you already know, I am going to explore an unknown journey outside of aws in 2019. I joined CP right after my master graduation. And it is time for me to graduate from CP. I would like to thank you for your help and support along the way. You will be missed!
Grateful for all of lovely peers I worked with, you gave me a plenitude and meaningful three years. We have accomplished a lot together. Lots of incredible things happen: MDB Disaggregation, Consolidated Invoicing, EMEA CSOR, 10x, CICD.. It is an unforgettable experience working with such a talented group of people. I do enjoy it and learnt a lot.
I want to take the opportunity to say thank you to Martin Larricart, Jeff Zhang, Richard Rothstein and Weiyan Zhong for their continue support and coaching. You like lighthouse oversea that guide me correct direction on my career growth.
I will still be in Amazon. I wish you all the best in your future endeavors and hope to stay in touch.
Best Regards,
Yiming
Alexa Health & Wellness
Subject: Farewell Alexa health
Hi Team,
Today is my last day on Alexa health team. As part of you already know, I am going to back to aws for a new track in container and serverless area. Time flies, I joined Alexa health on 2019-3. And I have been in Amazon 4yr3mons. I always ask myself what things will look like if having chance to be a 5-year Amazonian one day. It is the hardest decision this tough year. Hope it will work out in the end.
Begin with the date I joined amazon, I start hearing alexa news. 2016, might be the fastest growth period of alexa. Voice assistant is such a cool product and unknown domain for me at that moment. I felt grateful meeting such an opportunity in 2019 unboxing it, joining it and participating in building it. I learnt a lot.
I would like to thank all of lovely peers I worked with. I still have a fresh memory how we launched medication management system from scratch in 2019. We have accomplished a lot together in this one half year: medication management, refill, hipaa self-service, a4hc, ehr auth session.. Team's energy and cohesion impressed me. I might never ever be able to find a team with such good atmosphere. I am so proud of being part of it. I will miss all of you guys.
I want to take the opportunity to say thank you to Litao, Haiyang. Appreciate for trust and all of opportunities. I did feel being valued. I also want to thanks all of leaderships in Alexa health org. Appreciate for constructing such a transparent, connected work space.
Again, thanks for having me on the team. I wish you all the best in your future endeavors.
I will still be in Amazon. My new office is couple blocks away from blueshift. Don't be a stranger, it is a small industry and let's keep in touch.
Linkedin: linkedin.com/in/pengyiming
Best regards,
Yiming
AWS App Runner / Containers
Dear friends,
Appreciate your time. Reach out here to share some recent updates. (Excuse me, I know, this is the “season”..🙁)
I will join another product team under Storage org in next week. It is the hardest decision to make. Especially leaving a product babysitting from 0 to 1 since its stealth mode. I love our teams, products and people here.
Back to Day1 when I was at Fargate, Mats reached out to me and we first-time talked about project “Fusion” (App Runner's code name):
“It's all goodness for us, as Elastic Beanstalk has joined our org and we are doing an exciting project with them this year. This is a S-team goal for us, targeting a re:Invent launch and we think it will change how customers think about running applications on AWS. The project is code named Fusion and builds on top of Fargate.. Sound interesting?”
It has been almost 3 years for me being in Elastic Containers org. Fargate on Firecracker, RoadRunner (now it is “Seekable OCI”), Fusion→App Runner GA, VPC, X-Ray, Route53, Private Service, WAF etc. It has been such a long non-stopped running. Now it is the time I plan to slow down a little bit and revisit next step in life as a new mid-age young man.
I am gratitude on all the journey and lovely people I have met and worked with in past few years.
Sincerely best wishes to all of my dear teammates and friends here. See you around, I will be around.. “Thanks” to Pandemic, I still haven't got chance yet meet with part of you in person. New office location is across the street away. Happy to grab a coffee together if there is an opportunity. Let's keep connection.
I am still passionate in Cloud-Native, Containers, Serverless and Open-Source. No plan to stop invest energy in these technical fields. Part of you might know, I am hosting an open-source community “CloudNative-Serverless-Meetup” on GitHub. Happy to keep in touch from there if you like it. Seats offered on the couch.😉
Never say never, it is a small world. Containers “Days” is enjoyable. GO ECS! GO App Runner! GO Beanstalk!
Best and sincerely,
Yiming
linkedin.com/in/pengyiming
Amazon S3
Follow the curiosity. The career follows.
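“Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do.”
– Steve Jobs
In the era of 𝑨𝒓𝒕𝒊𝒇𝒊𝒄𝒊𝒂𝒍 𝑰𝒏𝒕𝒆𝒍𝒍𝒊𝒈𝒆𝒏𝒄𝒆, what remains uniquely valuable about being human? My answer is our lived 𝐞𝐱𝐩𝐞𝐫𝐢𝐞𝐧𝐜𝐞𝐬 and our 𝐜𝐮𝐫𝐢𝐨𝐬𝐢𝐭𝐲.
MTS is changing the org chart. OpenAI's Greg Brockman gave the origin story: "when starting OpenAI, we didn't want to bucket people into researchers & engineers. Alan Kay advised that they used 'Member of Technical Staff' at Xerox PARC, we loved & adopted it."
AI is reshaping the industry at an unprecedented pace. Many of the vectors (level, job family, tenure, knowledge, and skill sets) no longer matter the way they used to. Before that fully arrives, I want to set down what this decade on the corporate ladder actually taught me.
I recently reflected on my decade at Amazon. Whenever people learn I've stayed at one company for ten years, two questions always follow. The first: "What's it actually like?" The second, usually prefaced with a polite cough: "No offense, but why have you changed so many domains and teams?"
My half-joking answer is that my work density has simply run higher than my cohort's: I've spent the whole decade in startup mode, continuously shipping 0→1 products as an intrapreneur (the body of work tells that story better than I can). Five teams, five domains, but one throughline.
So this is an honest retrace of each team change. It is not a highlight reel, but a look at how the decisions were really made: what I weighed, what I got right, and what I only understood much later. The lens I keep coming back to, and the one I most want to share, is this: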
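| Dimension | The question I ask |
|---|---|
| Domain | Is this a domain I'm genuinely interested in, and is the interest strong enough? |
| Motivated / Productivity | Does the work motivate me most, and will I be fully productive when working on it? |
| Learning curve | How much here is brand new to me? Will I be a beginner again? |
| Scope | How large a surface can I own: a component, a system, or a business? |
| Leverage | Will my work compound? Who else benefits when I ship? |
| Impact | Can this work really move something, for customers, the business, or the field? |
| Environment | What kind of people will surround me, managers and peers alike? |
| Trust | How do I build trust with senior leaders, peers, reports, and cross-functional partners? |
| Research vs. Engineering | Am I exploring what's possible, or making the known solid? |
| IC vs. Manager | Do I want to scale my impact with my own hands, or through others? |
| Short-term vs. Long-term | Is this optimizing for now, or a bet on who I want to be in ten years? |
The compass under every dimension: curiosity and interest.
No choice scores high on every dimension. The interesting part is what you're willing to trade, and that changes as you grow. It was Chapter 1 that made me realize this lens had been there all along.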
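My Amazon Career at a Glance
The journey so far, in one view. Each row is a chapter in this series:
| Years | Domain | Team & Org | Role | Signature work |
|---|---|---|---|---|
| 2016–2019 | Fintech | AWS Payments · Commerce Platform | SDE I → SDE II | 10x scaling, GDPR, EMEA; zero-downtime relational → NoSQL migration (about $8.5M saved per year) |
| 2019–2020 | Conversational AI | Alexa AI · Health & Wellness | L5 Machine Learning Engineer | Founding engineer; voice medication management (Nov 2019) + HIPAA skills (2020) |
| 2020–2023 | Virtualization & Cloud Native | AWS App Runner / Fargate · Containers | SDE II → SDE III | Founding engineer & Uber Tech Lead (UTL); App Runner 0→1 launch (2021); observability/tracing launch (2022) |
| 2023 | Foundational data services | Amazon S3 · AWS Storage | L6 / SDE III | Learning how to operate inside a hyperscale, mature organization |
| 2023–present | Generative AI | Amazon Bedrock & SageMaker AI | L6 / ML Research Engineer | Founding engineer; Claude Day-0 launches, Trainium; model optimization (speculative decoding), customization, evaluation and benchmarking |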
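Fintech: AWS Payments
[Figure: my career timeline ('16–'19). Each chapter adds a new domain; the hatched portion is the future not yet traveled.]
[Figure: the technology cycles I lived through. Chapter 1 sits on the maturing cloud-computing wave.]
Day One: AWS Payments
I joined Amazon on March 14, 2016, which was Pi Day and also happened to be AWS's tenth birthday. Fresh out of school, I was assigned to my first team: AWS Payments, part of the larger Commerce Platform organization.
If you've never wondered how AWS actually charges its customers, that machinery is what Commerce Platform builds. Put simply, it is the platform that bills AWS customers, and at its core is the monthly "bill run," which has to be accurate, on time, and run at enormous scale. It's a very typical internal platform team: plain on the outside, heavily load-bearing on the inside. Mistakes here are measured in real money on real invoices. I joined as a software engineer (SDE I) and stayed three years.
If you're curious what this org actually does, VP James Greenfield gave a good public overview of AWS Commerce Platform on a 2022 episode of the Screaming in the Cloud podcast.
Through the lens above, Payments scored high on scope and leverage; I just didn't realize it at the time. Its learning curve was less a new domain than a new bar: correctness, idempotency, reconciliation, and the systems discipline in which "roughly right" is simply "wrong." That bar shaped how I build to this day.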
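To make that bar concrete, here is a minimal sketch of idempotency and reconciliation, assuming a toy in-memory ledger. The `Ledger` class, the `reconcile` helper, and the invoice keys are hypothetical names for illustration, not how AWS Payments is built. The idea is that a retried request carrying the same key must never charge twice, and that totals are compared exactly (using `Decimal` rather than floats) against the source records.

```python
from decimal import Decimal
from typing import Dict, Iterable, Tuple


class Ledger:
    """Toy in-memory ledger; a real billing system would use a durable store."""

    def __init__(self) -> None:
        self._charges: Dict[str, Decimal] = {}  # idempotency key -> amount

    def charge(self, key: str, amount: Decimal) -> bool:
        """Apply a charge at most once per key. Returns True if newly applied."""
        if key in self._charges:
            if self._charges[key] != amount:
                raise ValueError(f"key {key!r} reused with a different amount")
            return False  # safe retry: nothing is charged twice
        self._charges[key] = amount
        return True

    def total(self) -> Decimal:
        return sum(self._charges.values(), Decimal("0"))


def reconcile(ledger: Ledger, usage: Iterable[Tuple[str, Decimal]]) -> bool:
    """Check that the ledger total equals the source-of-truth total, exactly."""
    expected = sum((amount for _, amount in usage), Decimal("0"))
    return ledger.total() == expected


usage = [("inv-001", Decimal("0.10")), ("inv-002", Decimal("0.20"))]
ledger = Ledger()
for key, amount in usage:
    ledger.charge(key, amount)
ledger.charge("inv-001", Decimal("0.10"))  # a retried request is a no-op
```

The choice of `Decimal` is the "roughly right is wrong" rule in miniature: binary floats cannot represent 0.10 and 0.20 exactly, so an exact-equality reconciliation on floats can fail even when every charge is correct.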
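After I left the team, I finally read the much-loved book Designing Data-Intensive Applications (DDIA). Almost page by page, I recognized the problems it describes as the very things I had wrestled with day after day in Payments invoicing. I had been living inside the "textbook" long before I read it.
In Q2 2018 I was promoted from SDE I to SDE II, the first real proof that I could handle ambiguity rather than just close tickets.
The Project That Closed the Chapter
The last project I led in Payments is also the one I'm still proudest of: decommissioning the legacy relational database that underpinned all of Commerce Platform's business, and migrating its data, with zero downtime, to a non-relational store, Amazon DynamoDB.
It was not a clean, tidy project. It was a large, multi-year program spanning many teams across the org, with several failed attempts before ours. We were migrating the system of record that money flows through while it was still serving live traffic. In this kind of work, the most interesting engineering and the most frightening engineering tend to be the same thing.
Our team became the first in the org to migrate off that database. The results were concrete:
- Tearing down the legacy relational database (MDB) was projected to save about $8.5M per year.
- Reduced the P99 latency variance of invoice creation by about 10%.
Being "first" mattered beyond the metrics. It turned a stuck program, one that everyone knew should be done and nobody was doing, into a path the rest of the org could follow. That is leverage: the work's impact outlasts my own time on it.
At the end of Q1, after the launch announcement email went out, I did something that surprised even me: I told my manager I had decided to change teams.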
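Why Leave at the High Point?
It's a fair question: why walk away right after the biggest win? But that is exactly the logic of the curiosity compass. I had just spent three years going from beginner to someone who could lead a flagship migration. The learning curve had flattened. I could see the rough shape of the next three years in Payments: still valuable, but it would not make me a beginner again. And I wanted to be a beginner again.
This was my first team change, and I didn't fully know how to do it. So I did it the engineer's way: I ran an experiment. As an SDE II, I interviewed with several internal teams and was lucky enough to land a few offers.
Three offers, converging into one real fork: infrastructure vs. AI.
The Real Choice: ECS vs. Alexa
IoT was interesting, but the choice quickly converged on two very different futures.
ECS pulled at the side of me that loves real infrastructure primitives. It offered a path into low-level virtualization and a chance to contribute to open-source projects in the container ecosystem. On the lens, it was outstanding on learning curve (bottom-of-the-stack systems I had never built) and on environment/community (open source is a kind of leverage that compounds far beyond any single company). This was the bet on "deeper roots."
Alexa AI Health & Wellness pulled at an entirely different side. It was a greenfield domain and team. When I talked with the hiring manager and the engineers, what moved me most was their genuine excitement about what they were going to build. It sat outside AWS, a chance to explore a brand-new space, and it was riding the most exciting technology wave of the moment: NLU, applied AI, and the consumer applications that turn science fiction into products. This was the bet on "exploring a new frontier."
Infrastructure vs. AI. Depth vs. breadth. Making the known solid vs. exploring the possible. The lens didn't hand me an answer; it made the question sharper.
So I did what I'd recommend to anyone facing this kind of fork: I sought out senior people who had seen many such forks, and listened carefully.
In the end, I chose Alexa AI Health & Wellness. I wanted to step outside AWS and explore an entirely new space, and I was genuinely fired up by NLU, applied AI, and the products that wave was about to create.
But I want to be honest about one thing: choosing one path did not extinguish the other. I kept my curiosity about open-source contribution and low-level virtualization infrastructure alive. I didn't know it then, but that spark would light up again in a later chapter of this story. A choice is rarely a door closing; more often, it's a door you leave ajar.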
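Takeaways From the First Fork
- Leave when the curve flattens, not when the work is done. The work is never done; the learning doesn't always last.
- Finish first, then leave. Ship it, send the email, then change teams. A clean finish earns you your next beginning.
- Talk to people who have walked this fork. Experience is an efficient way to borrow hindsight.
- A decision is not a deletion. The interests you didn't choose can wait for you.
From the iron discipline of Fintech to the open frontier of conversational AI: this was the steepest learning curve I could find, and I chose it on purpose.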
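Conversational AI: Alexa Health & Wellness
[Figure: career timeline ('16–'19, '19–'20), now adding the NLP / conversational AI wave, which peaked around 2019–2020, right when I joined Alexa.]
From 0 to 1, Inside a Giant
Alexa Health & Wellness was the first time I built a brand-new product from 0 to 1. It was a small, sharp, startup-like team growing inside one of the largest companies on Earth, which is an odd gift in itself: garage energy with the reach of a global platform.
The team was quietly formed inside the Alexa org in 2018 (under a group called Alexa Domains), with a mission to make the voice assistant truly useful in healthcare (diabetes management, care for new mothers and infants, and aging), all under HIPAA constraints (CNBC, May 2018). Our division was led by Rachel Jiang. Its founding lineup included Missy Krasner, who built the healthcare business at Box and earlier took part in Google's first health-records effort, and a sharp product manager, Yvonne Chou, who later became a close colleague of mine.
I joined as an L5 Machine Learning Engineer, working side by side with an L7 product lead and closely with the applied science team. It was my first real taste of the seam where research and engineering meet: scientists probing what NLU could do, and engineers like me shaping those ideas into a regulated product that could actually ship. I later wrote a whole piece on that tension; here, it was just daily life.
I stayed about a year and a half, and in that time we shipped two public launches: voice medication management (November 2019) and HIPAA skills (2020).
My part was the multi-turn medication management speechlet, built on top of NLU: the back-and-forth conversational logic that lets a person actually talk with Alexa about their medications rather than only issue one-shot commands. We partnered with a pharmacy and a hospital. The project was confidential enough that our office floor required special badge access; when the building itself treats a project as a secret, you know it matters.
When Your Code Talks Back
There was a particular joy in this work that I've found almost nowhere else: you plug in an Echo device, run your build, and your own code starts talking to you (answering, asking, clarifying) right there as you test and debug. Debugging a conversation is strange and wonderful. The system stops feeling like software and starts feeling like a presence in the room.
In November 2019 we launched to the public, the same month Alexa itself turned five (November 6, 2019). The work showed up in Amazon's earnings report and was covered by Forbes, CNBC, and others. It was a genuinely rewarding time.
The most moving evidence that any of it mattered came much later. In December 2020, long after I had left, I was messaging with my former colleague, product manager Yvonne Chou (by then a Director of Product, later Chief of Staff at AI2), and she wrote (original text, unedited):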
"I am in Los Angeles for the holidays and my brother got my parents a new Echo Show earlier this month. When I got to my parents house, I noticed my brother was logging dirty diaper changes and my dad had medication reminders set up all on his own :) they obviously wasn't our MM ones, but it was fun to see them use the features the team built and I didn't even have to tell them about it.
Happy new year!
-Yvonne"
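Strangers, in their own homes, quietly using the kind of thing you helped invent, with no one telling them to. That, really, is the whole point of the work.
Riding the Technology Cycle
Looking back, the real lesson of this chapter wasn't about healthcare at all; it was about where you stand on the technology cycle. Every transformative capability arrives in waves. The dream of talking directly to a machine is decades old; what keeps changing is the engine underneath it. In 2019 that engine was still intent-plus-slot NLU, and what we were building was the heavy lifting, the scaffolding, that the next wave would quietly abstract away.
The conversational AI cycle. I shipped on the second step; two steps later, LLMs made the same experience effortless.
Being too early on the curve feels a lot like being wrong: the product is harder to build, the experience rougher, adoption slower. But early work is rarely wasted. It makes you understand the problem deeply enough that when the next wave arrives, you already know exactly what it's good at. The dream of conversational AI didn't fail. It was just early, waiting for the LLM wave to make it easy.
Why the Frontier Cooled
And yet. This was still the pre-LLM world, and the gap between the dream and the technology was real. NLU at the time still depended on slots and rule-based design, the scaffolding of traditional conversational AI. Supporting truly context-heavy multi-turn conversation took massive engineering investment, and even then the experience was nowhere near as smooth or forgiving as today's LLM assistants. Honestly, product adoption and revenue were still far from ideal.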
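A minimal sketch of what "intent plus slot" looks like in rule-based form, assuming hand-written utterance patterns. The intent names and patterns here are hypothetical, and this is not Alexa's actual NLU or API; it only shows why every new phrasing meant more engineering.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical intents, each with one hand-written utterance pattern.
# Named groups play the role of slots.
INTENT_PATTERNS = {
    "AddMedicationReminder": re.compile(
        r"remind me to take (?P<medication>[a-z ]+?) "
        r"at (?P<time>\d{1,2}(?::\d{2})? ?(?:am|pm))"
    ),
    "RefillPrescription": re.compile(
        r"refill my (?P<medication>[a-z ]+?) prescription"
    ),
}


def parse(utterance: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (intent, slots) for the first matching pattern, else None."""
    text = utterance.lower().strip()
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.fullmatch(text)
        if match:
            return intent, match.groupdict()
    return None  # anything outside the patterns is simply not understood


print(parse("Remind me to take vitamin D at 8 am"))
print(parse("Could you nudge me about my pills tonight?"))
```

The first utterance resolves to an intent with filled slots; the second, which a person would read as the same request, falls through to `None`. Covering that gap pattern by pattern, and then carrying state across turns, is the scaffolding the paragraph above describes.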
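The broader climate was shifting too. Big Tech's grand bets on healthcare were cooling: Haven, the high-profile alliance of Amazon, Berkshire Hathaway, and JPMorgan that was only formally named in 2019, disbanded in 2021, the same year Google took apart its own health division. Closer to home, leaders began to leave: Missy Krasner left Amazon in October 2020 to join a venture firm. I read the signal: my time in this particular space was drawing to a close.
So I reached for the interest I had deliberately left ajar a year and a half earlier, virtualization and infrastructure, as my next stop. The door I hadn't closed turned out to be exactly the one I walked through next.
Two Tickets That Rewrote a Career
Worth mentioning: in 2019, Alexa AI generously funded me to attend two conferences, and nobody at the time imagined how far those two tickets would bend the next few years of my career:
- KubeCon + CloudNativeCon 2019, San Diego: my first in-person immersion in the world of containers and cloud native. I felt the raw force and vitality of the cloud-native community up close, and something in me recognized it: this is home.
- NeurIPS 2019, Vancouver (YVR): near year's end, a chance to sit among academic researchers and feel directly where AI was heading next.
I was in Vancouver for a more mundane reason too, my H1B visa, and on that trip I was happily planning my 2020 travel. Life is like that: about two months later the global pandemic began, and none of those plans survived. But the two seeds those conferences planted, cloud native and the pulse of AI research, sprouted all the same.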
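Virtualization & Cloud Native: AWS App Runner & Fargate
[Figure: career timeline ('16–'19, '19–'20, '20–'23), now adding the cloud-native wave: Kubernetes-era infrastructure, my container years.]
Choosing "Must-Have" Infrastructure
With the conversational AI chapter closing, I finally caught up with the interest I had kept alive since the first fork: virtualization. But I was deliberate about where to go. I wanted to land somewhere that was must-have, indispensable infrastructure, with steady profit and strong cash flow. Whether the industry runs hot or cold, that unglamorous but indispensable layer is the safest place to do ambitious work.
That pointed me straight at containers. I reached out to a manager I knew, Mats Lanner, who led AWS Fargate, one of AWS's star products, launched at re:Invent on November 29, 2017. At the time Fargate, as a serverless container compute platform, was on a strong growth trajectory.
When I reconnected with Mats around July 2020, his reply was more exciting than I expected. It was the first time I heard of a project code-named "Fusion," which later became AWS App Runner: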
"It's all goodness for us, as Elastic Beanstalk has joined our org and we are doing an exciting project with them this year. This is a S-team goal for us, targeting a re:Invent launch and we think it will change how customers think about running applications on AWS. The project is code named Fusion and builds on top of Fargate..
Sound interesting?"
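He also suggested I talk to Archana Srikanta, a senior engineering leader, one of Fargate's founding engineers, and someone I already admired (she would be promoted to Principal Engineer in 2021). Before joining, I had watched her YouTube talk on Fargate. We spoke over Chime; she sketched the project at a high level, and we ended up chatting about Rust. It all sounded very cool. So I joined.
Joining the "Away Team": Project Fusion
I joined the cross-org project "Fusion," later launched publicly as AWS App Runner, as a direct report to Mats (my L7 manager).
Many people worry about being assigned as an "away team" member embedded in another org, especially on day one. As a junior engineer I felt the same uncertainty. But that worry was gradually replaced by new knowledge, new technology, and new discussions. Doing the task in front of you well is the surest way to dissolve uncertainty. In the end, this away team became one of the most rewarding bets of my career.
I was mainly responsible for designing and writing the dataplane, paired with Amit Gupta, a senior engineer from Elastic Beanstalk (he was later promoted to Principal Engineer in Q2 2022). Together we worked through a stack of hard problems:
- Single-digit-second autoscaling
- Multi-tenant request routing
- Cellular architecture
- Capacity management
- Load balancing
- Networking
App Runner also put me in the daily, close orbit of three Principal Engineers: Archana Srikanta, Onur Filiz (later a Principal Software Engineer at Microsoft), and Amit. That was a higher technical bar than anything I had worked with before, and it pulled my own engineering up fast.
Compared with giant orgs like Devices or Commerce Platform, the Containers org was also relatively small. For a long time I heard that it simply didn't hire L4s, a deliberate way of keeping the bar high, because what we built really was low-level systems.
Along the way I got to explore a lot of new technology and open-source projects (Kafka, Envoy Proxy, Protobuf, gRPC), stacks that weren't yet widely used at Amazon at the time; the Envoy work also got me writing C++. (I later started the first Envoy Proxy community inside Amazon.)
We launched AWS App Runner on May 18, 2021 (AWS News). It was the first time I incubated a public AWS product from 0 to 1 and launched it; users all over the world could find it and use it in the AWS console.
Our VP, Deepak Singh, also spoke publicly about App Runner around the launch. In my container years Deepak was a leader I always followed: every Friday at the end of the day he sent out a weekly note sharing his thoughts and observations, and I genuinely enjoyed reading them.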
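A Fork in the Road: Stay with the Product, or Go Back?
As an away-team contributor, I eventually ran into the unavoidable question: where do I actually belong? Mats initially had me help with the AWS Fargate dataplane and Firecracker microVM work while I kept wrapping up App Runner's post-launch items.
Then, in a routine 1:1, App Runner's general manager, L7 leader Prashant Prahlad (now a VP at Datadog), extended an offer: stay with App Runner as a founding member and core builder.
A fork of the same shape as before: the new product I had built with my own hands, or the team I had originally joined.
I chose to stay: I officially joined App Runner as a core member of the team and moved to report to Prashant.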
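The Road to L6: Into Observability
The next unavoidable topic was my own growth. I had been an SDE II since 2018, almost three years. Prashant gave me a few projects to choose from on the 2021 OP1 roadmap, and I picked a public feature launch around observability. I chose it because I wanted to work on public-facing features; at the time I had no idea how much industry exposure that one choice would bring.
The task: give App Runner real observability. We already had metrics and logs; as a one-stop container compute platform, we also wanted tracing. It was a relatively new area for me, and I spent a lot of time, almost single-handedly, on the research and the initial build: OpenTelemetry, AWS X-Ray, Grafana.
It brought me into the orbit of a remarkable group of people: Principal Engineer Jaana Dogan (who later returned to Google as a Principal Engineer), Michael Hausenblas, and Alolita Sharma, true experts and leaders in the field. It also drew me deeper into the CNCF community. My mentor at the time, Phil Estes (a Principal Engineer on the containerd team), knew them all well, and it was Phil who later recommended me for the CNCF Ambassador community. Late at night, I polished the PRFAQ and the customer experience with our product manager, Akshay Ram.
I led the launch of this feature in Q1 2022, and was promoted to SDE III (L6) at the same time.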
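Stepping into the Spotlight
Then something happened that I had not expected. Adam Keller invited me onto the AWS Containers YouTube show "Containers from the Couch" to talk about the feature; the work was also featured that May in issue #113 of AWS Open Source News and Updates, and was publicly championed by our L7 product lead, Shubha Rao (who later joined Google as Director of Product Management, AI/ML). It was the first time I had been seen publicly for a product I built.
A small episode: on the day we recorded the show, my laptop gave out. I panicked and messaged Adam; he replied that he would open solo for a few minutes and could wait for me to reboot. Luckily, the laptop recovered, I made it online, rescued Adam from his one-man show, and started sharing what we had built.
From then on there was no stopping it. Public exposure brought visibility, and responsibility with it. I worked with our GTM, sales, and product teams to bring the product to customers and users, spoke at DockerCon alongside Inbal Shani (now CPO at Twilio; formerly CPO at GitHub; once our general manager at AWS), and wrote AWS blog posts. I took an internal course and earned a public speaking certification so I could represent and support Amazon at top external conferences and events. It felt a bit like being on a press tour. I even started receiving book-writing invitations from publishers such as Packt, more than once. I summed up that year's work in a Medium post, written as a checklist. Without question, it was a huge boost to my career, and it brought me many external connections as well.
That community recognition eventually turned into something I am still proud of: in late 2022, the Linux Foundation named me a CNCF Ambassador. I was invited to the new Ambassador announcement event at KubeCon Amsterdam 2023, where I met in person, for the first time, many of the high-profile builders of the open-source and cloud-native world whom I had previously only seen on YouTube and podcasts.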
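Overwhelmed, and a Cooling-Off Period
Visibility has a gravity of its own, and for a while I let it pull me along. After the L6 promotion I became a core engineer and an "uber" tech lead across the entire App Runner org (about 70 people): a force multiplier. At the peak, I was tracking more than ten project workstreams as tech lead, and mid-year I also began managing direct reports, both interns and full-time employees. I really did log, week by week, where my time went. The role reached far beyond writing code and deep into cross-functional work: I wrote PRFAQs, partnered with sales and GTM, and joined the product team for WBRs and MBRs (weekly and monthly business reviews). Large scope, high visibility, early-startup energy, and, honestly, overwhelming.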
[Figure: where each of my weeks actually went at the peak.]
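My calendar was often quadruple-booked; the real constraint was bandwidth, not effort. So when Jason, who later became our manager, asked whether I wanted to join the Bar Raiser program to support more interviews, I said no. Holding on to a sane allocation of time had quietly become more important than wearing one more hat.
The organization around me was shifting as well. Several leaders I had grown up alongside left one after another (Inbal, Shubha, Onur, Akshay, Prashant), and a reorg landed. By then Compute Services had expanded into the larger DECS organization (Developers, Events, Containers, Serverless), and App Runner was folded into AWS Lambda. We got a new L7 manager, Jason Woodlee, who had spent years as CTO of an East Coast startup. He brought a different culture, which I found refreshing.
The interns accepted the return offers we extended, and I hired a new full-time employee; by any outside measure, things were going well. But I had a vague sense that it might be time to stop. I told Jason about my decision to leave. He wanted me to stay and generously offered to plan my path to Principal Engineer with me, even to help prepare the document. I declined.
The honest reason: the business value was no longer proportional to the time I was putting in. I needed balance, or a break. (App Runner later went into KTLO, "keep the lights on", mode, which only confirmed the instinct.)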
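A Word on Product Positioning
There is a widely shared joke that AWS has "17 ways to run containers". It is funny because it is true, and it points at something real: finding the right positioning for a product matters as much as building it. App Runner, for all our pride in its engineering, never achieved the adoption curve of its closest counterpart, Google Cloud Run.
That stings a little, because App Runner was the first AWS product I helped build from 0 to 1. But failure is the mother of success, and you only collect the lesson if you truly believe that. Still, when I left, I was genuinely lost about where to go next.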
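Takeaways from the Third Fork
- Bet on indispensable infrastructure. The must-have, profitable, cash-generating layers are the safest place for ambitious work, and where it compounds the longest.
- Away-team anxiety dissolves in the work. Every time you ship the task in front of you, the uncertainty shrinks a little.
- Adopt technology others in the org are not using yet. Envoy, gRPC, OpenTelemetry: being an early mover inside a large company is an advantage in itself, and a community too.
- Build the product first, then tell its story. Visibility follows the work, not the other way around, and storytelling became my second skill.
- Know when to stop. Scope and visibility can outrun the value they create; stepping away at the right time is a skill in itself.
- ...but don't stop too early either. Honestly, I think I left App Runner a bit too soon. Even as a founding member there was still room to grow, and had I stayed on longer after the L6 promotion, building a solid L6 track record would have gone more smoothly. My early exit, combined with a year-end team change, quietly pulled down my rating, and I did not realize at the time how much that mattered. If I could do it over, I would lean more on the leverage I had already built and let the momentum keep running until a true "hard stop" forced me to move.
- Positioning is half the product. Even great engineering still needs the right place in the product portfolio to win adoption.
Three chapters in, my career no longer looked like a ladder but like a series of deliberately chosen forks, each one a steeper climb, and each one on purpose.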
Foundational Data Services: Amazon S3
[Timeline figure: '16–'19, '19–'20, '20–'23, '23. No new curve here: S3 sits on the bottom foundation of the cloud computing wave, the deepest layer of an older wave.]
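A Mistake I Had to Make
Lost and somewhat burned out, I did something I now regard as a career mistake, though if I could do it over, I would handle it more carefully. I joined Amazon S3.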
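If you have used AWS, you know S3. It was AWS's first product, launched on Pi Day in 2006 (the same date on which I would later join Amazon), and it remains the cloud's foundational data service and the largest data service on the planet (500 trillion+ objects, 200 million+ requests per second, exabytes of data, 123 Availability Zones, 39 Regions, as of its 20th birthday). Then, and to this day, it has been led by VP Mai-Lan Tomsen Bukovec; long before I joined, I had read a great deal of her internal and external writing. She is a passionate, highly capable senior leader, and one of AWS's most distinguished female and Asian leaders.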
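Learning from Mai-Lan
Mai-Lan has a famously well-formed body of theory and practice on how to run a mature organization, including a clear, widely circulated definition of the role of Principal and senior IC leaders. If Deepak Singh's much-loved Friday notes were an informal stream of thoughts and observations, Mai-Lan's was a fully formed system.
She hosted a monthly leadership learning session in the S3 org and encouraged every L6+ leader to attend and join the discussion. Her TA prepared a topic for each session, usually a Leadership Principle, along with a companion document distributed in advance, followed by a genuinely formal discussion. I attended almost every session and presented once myself; the topic was "Recognizing Critical Moments". By coincidence, the other presenter that day was Bryan Liles, an AWS Senior Principal Engineer (L8) and previously a high-profile figure in the CNCF community, who had joined S3 the same week I did. Watching a seasoned VP-level leader articulate her views and deliberately develop younger leaders was enormously valuable to me.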
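The Mistake Underneath
However well run the organization, you cannot escape one fact: frontline managers vary widely in quality. When I was choosing a new team and talking with the hiring manager, I did not do the expectation management I should have. I went from reporting to an L7 manager as a tech lead to reporting to a junior frontline manager on the new team, a serious step down in both scope and position. On top of that, I overestimated that junior manager's professional maturity; he had only just moved from IC into management.
And I had no one to blame but myself. The leverage and standing I had at App Runner were earned there, and the transfer was my own decision. From the new manager's point of view, I was just a recently promoted L6 IC; he had no reason to factor in the density of work and delivery I was used to. The pairing was mismatched from the start.
A Quick Course Correction
Almost as soon as I arrived, I recognized the mismatch and moved quickly. A former manager of mine from App Runner happened to have moved to another part of S3 around the same time, so I transferred again to report to her. That is where I truly came to understand how a hyperscale product organization runs, in sharp contrast to App Runner's smaller, more startup-like world.
Landing in S3 Index
The team I finally settled in was S3 Index. Our director, Amy Therrien, laid out the business clearly in a re:Invent talk; in short, our work was to reduce 503s, the rate-limiting and throttling errors customers hit at scale. Worth noting: viewed as a standalone product, S3 Apex (API + Index) would rank among Amazon's top 20 businesses by revenue, and it is of course a tier-1 business within S3.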
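RocksDB, Meta, and a Ten-Year-Old Database
Because of my background, and the recommendation of a manager who knew it, I picked up a truly unique assignment: a cross-company collaboration with Meta, going deep on RocksDB, the open-source LSM-tree database (which happened to turn ten that year). We synced weekly with the Meta / RocksDB engineers and worked in C++, the same language as in my Envoy days. Database engines were not my home turf, but that was exactly the point: it was well worth learning.
Our sister team in London handed this part of the business over to Seattle, so we also met regularly with colleagues in London and across Europe, a rather interesting kind of cross-time-zone collaboration.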
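A small footnote that shows how alive this corner of the industry is: Rockset, a company started by engineers who had built RocksDB at Facebook, was acquired by OpenAI on June 21, 2024.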
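2023: Reading the AI Wave
2023 was a special year. ChatGPT arrived on November 30, 2022, and the whole world grew restless. Having already lived through one conversational AI hype cycle, I deliberately took a wait-and-see, conservative stance: I thought this might be just another cycle that peaks too early, like voice AI before it.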
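What really changed my mind was a signal from inside, not from the headlines. Deepak Singh, then VP of the sprawling DECS organization (Developers, Events, Containers, Serverless), moved to report to Swami Sivasubramanian, VP of AI & Data, and took over AWS CodeWhisperer. (Swami earned his PhD at Vrije Universiteit Amsterdam, where his advisor was Amazon CTO Werner Vogels; he is one of Amazon's youngest VPs and joined the company's S-team in September 2023.) When a leader of that caliber re-points his career at generative AI, it stops looking like hype. That was the moment I realized: this wave is real, and it is time to take it seriously.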
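The Goal: Find a GenAI Team
From then on the goal was clear: find a GenAI team to join. As always, it was a bet, and this time it did not go smoothly. Because the rise of GenAI was unplanned for everyone, the company included, Swami's org was in a constant state of reorganization and resource reshuffling. Most teams had frozen hiring until the dust settled; and when a team was short-staffed, the usual move was to pull engineers directly from another product. Early Bedrock, for example, was staffed largely by engineers from Lex. Meanwhile, a long line had formed outside the org: everyone wanted into this frontier.
For a while it seemed there was simply no way in for someone outside the org, and I nearly gave up. But then the break came. (I have always held to one rule: when you are about to give up on something, give yourself three more months first.)
Three Offers
The break arrived in the form of three offers, all in GenAI: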
[Figure: three offers, all in GenAI.]
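The path there was a little unusual. I interviewed for Bedrock Fine-tuning and completed an informal interview loop, but the hiring manager came back to say the headcount had already been filled. Rather than leaving it there, he referred me to two sister teams that still had open headcount, Bedrock Agent and Bedrock Inference, and because I had already completed the informal loop, I did not have to interview again for either.
So the trade-off was on the table once again: three different GenAI bets, each with its own emphasis and its own hiring manager (L6 (M2) for SageMaker AI Optimization, L7 for Bedrock Agent, L6 (M1) for Bedrock Inference). After the lesson in scope that S3 had just taught me, reporting level was no longer an abstraction; it had become one of the factors I now weigh seriously.
I chose Bedrock Inference. My thinking was to enter at the very bottom of the stack, start from the most fundamental layer, and build upward. Looking back today, I have to be honest: I lacked prescience. I passed right by Bedrock Agent, which was reorganized into Swami's brand-new Agentic AI org in 2025, and three years on, agentic AI has become one of the hottest practices in the entire LLM field. The instinct to bet on infrastructure was right; I just did not see how fast the frontier above it would move.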
Generative AI: Amazon Bedrock & SageMaker AI
[Timeline figure: '16–'19, '19–'20, '20–'23, '23, '23–. The LLM wave: near zero until late 2022, then a vertical takeoff, still climbing in 2026.]
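Engineer No. 2
When I joined, the Inference team had only just been chartered. I was the first engineer hired in, alongside one L6 engineer transferred from Lex, which effectively made me engineer No. 2. It was a privileged and highly confidential project: building the inference engine for Anthropic's models on Bedrock.
So once again I was a core founding engineer building from scratch: clean code, high visibility, fast pace. Another startup, only this time inside the company's most important business. And it grew like a startup too: from 2 people to 50+ within a year. (I wrote about the first year in more detail here.)
Building with Anthropic
We worked directly with Anthropic, including co-founder Ben Mann: two or more working sessions a week, and a monthly happy hour that marked each new model launch. The work touched ideas that have since become accepted standards in the open-source inference community: disaggregated inference, multi-node inference, context-aware routing, and prompt caching.
Day-0 Launches & Trainium
Throughout 2024, I led nearly all of Claude's Day-0 public launches on Bedrock, including Computer Use, Anthropic's first agentic AI feature, which shipped on Claude 3.5 v2. I also delivered Claude's first launch on the AWS Trainium/Inferentia platform, working with James Bradbury (Anthropic's head of compute) and the Annapurna team.
A Fast-Growing Business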
“SemiAnalysis believes Bedrock is a $5.5B run rate business today with the vast majority of customers (80-90%+) using Anthropic models.”
— SemiAnalysis
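It is a wonderful feeling that only comes after hard work: watching something you are responsible for, one of the company's core businesses, grow at speed. And with that growth, the pressure multiplied too.
I also watched Anthropic's own growth up close, from a team of perhaps fewer than 200 people to what it is today, with a valuation that climbed from about $18 billion in 2024 to $965 billion by mid-2026, a jump of more than 50x that made it the world's most valuable startup. At CVPR in 2024, when I told people I worked on Claude models, I often got blank looks; they had no idea what that was. Today I am fairly sure everyone knows and uses Claude, and Anthropic is a household name worldwide.
Top Leadership, Up Close
It was also the first time I truly felt how much Amazon's most senior leaders care about a business. We received tickets directly from the CEOs of customer companies. Every Sev-2 review was attended by at least two L10 leaders, our VP and our Distinguished Engineer, and I worked directly with L8 Senior Principal Engineers. It was my first time collaborating so closely with leaders at that altitude. High visibility, high pressure.
Because the work was so early and so scarce, very few people in the world had actually done hands-on frontier-model inference. Anthropic was not yet a household name, and the company guarded the project closely: for P&C and intellectual-property protection reasons (on Anthropic's behalf), even other Bedrock colleagues could not access our codebase or documents.
I also got to know the Anthropic colleagues I worked with day to day and to connect with others in the same field. Over time, some of those connections became real relationships: investors, founders, and researchers who are now part of my network. Some of the most valuable parts of this chapter were never in the codebase; they were the people this job put me next to.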
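Choosing a Lateral Step
After delivering my last task in Q1 2025, I knew I wanted to pause. Back from vacation, I set up a meeting, sat down with two Senior Principal Engineers, and told them I wanted to try a different direction. They understood and gave me a few options.
At the time, Bedrock Inference was split into three divisions: 3P (Claude and other closed-source models), 2P (open-weight models such as Llama, DeepSeek, and Qwen), and 1P (Amazon's own Nova and Titan models). I moved from 3P to 2P, went deep into the open-weight world, and began my research on model optimization (speculative decoding) and customization, a line of work I continue to this day. I do literature reviews, read papers, study frontier methodologies, and apply them to open-weight models. Around the same time, I also moved my office from Seattle, WA to Bellevue, WA, a smaller, quieter town compared with the city.
Over these years, I also led the build-out of benchmarking methodology across two companies and multiple rounds of new model public launches. Through that work I gained deep experience optimizing and evaluating both open-weight models and closed-source foundation models.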
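Speculative Decoding, and Closing Loops in the Timeline
I was fortunate to be named a co-chair of NeurIPS 2025 Mexico City: from attendee (2019) to organizer (2025).
Another loop in the timeline concerns speculative decoding itself. I first heard of speculative decoding in a talk by Xiaoxuan Liu at PyTorch Conference 2024 in San Francisco. Sitting in the audience, I had no idea that a year later I would be deep in this field, or that it would become the research line I now lead.
Just two weeks ago at MLSys 2026 in Bellevue, while browsing posters at the venue, I came across an oral paper titled "ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems". Because it relates to the speculative decoding RL work I own at the company, I was very interested, so I talked with one of the authors, Qiaoling Chen, on the spot to learn more about the paper. I was later surprised to find a very familiar name among the authors: Tianwei Zhang (张天威, Nanyang Technological University), Qiaoling's advisor. Remember how I mentioned working in the Alexa AI Health & Wellness division in chapter two? Tianwei was the colleague who shared my office back then. This is probably the most remarkable coincidence I have run into so far. When you read a paper at a conference that relates to the work you own, and discover that the author's advisor is your former colleague, especially after you have crossed several different domains, every discovery starts to feel natural and to make sense. I also met Xiaoxuan there; she received an MLSys Best Research Paper Honorable Mention for her new paper, "Speculative Decoding: Performance or Illusion?".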
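Where I Am Now
Ten years, five chapters, one compass. From fintech to conversational AI, from containers to the foundations of the cloud, and now to the frontier of generative AI: every turn traded comfort for a steeper learning curve, and every time, the curiosity I refused to give up lit the way to the next door. The decision lens keeps getting sharper; the bet has never changed.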
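Career Priorities
If you compress a decade of these decisions into a single picture, a handful of forces explain almost every turn. They are not a single ladder; they form a flywheel plus two independent axes (role and domain), all riding on the technology cycles above. And in the age of AI the ground beneath them is shifting, so it is worth asking: what still has value, and what no longer does.
The basic unit of it all: more leverage earns scope and impact, which in turn buy visibility, ratings, job security, and higher levels and roles, and a bigger role hands you more leverage to spend. But note the axle at the center: the wheel turns only on trust. Impact converts into visibility, and visibility into scope, only because people trust what you say and are willing to entrust you with more; and unlike most skills, trust transfers across domains and is not commoditized by AI. It is what keeps the wheel turning. And what sets it in motion in the first place is your own drive: how much the work motivates you, and how productive you are when you put yourself into it. Trust is the axle; motivation and productivity are the torque that turns it.
One connected engine: the technology cycle (the horizontal axis) decides which domain is hot, and the hotter the domain, the higher it sits, meaning more leverage, opportunity, and possibility. Inside every domain spins the flywheel above. You climb by jumping to the next hot domain, and those jumps are powered by transferable knowledge (infrastructure, systems thinking, taste, learning speed), while the skills that AI commoditizes and that are locked to one domain (memorized syntax, rote-learned APIs, a single tool, gatekeeping titles) stay behind. On any step, your role can move in three independent directions: up a level, to a different stance (IC ↔ manager), or into a different job family (which, in the AI era, is increasingly merging into a single "Builder / MTS").
The honest tension: optimizing purely for the flywheel keeps you safe but narrow, while following your interests across domains widens the map but resets the loop a little with every jump. Changing domains carries cost and risk, and because everything is dynamic, the trade-off never sits still; making that trade-off and staying in dynamic balance is an art in itself. I have bet my whole career on the domain axis and on the left column of the picture above, and, as the next section shows, the flywheel has largely caught up.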
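A Word on Promotions, and Ratings
For those who care about these things: neither of my two promotions went through on the first attempt:
- SDE I → SDE II: failed once in Q1 2017, then succeeded in Q2 2018 (AWS Payments).
- SDE II → SDE III: failed once in Q4 2021, then succeeded in Q1 2022 (AWS App Runner).
One more thing I want to tell everyone earlier in their career: even with two promotions, I never once received an "Exceeds" or top-tier annual rating internally. At some point you realize that ratings do not matter all that much; what matters is the work itself, and who you become while doing it.
One more detail worth recording: in the cycle when I was promoted, the company required a Tech Promotion Assessment (TPA) as a supplement to the promotion document. The convention was one TPA for an L5→L6 promotion and two for L6→L7. A TPA is essentially done by an independent third party who interviews the colleagues and managers around you and conducts a qualifying review of you. Interestingly, the assessor is usually someone at your target level. In my case, the L6 engineer who did my TPA was later promoted to Principal Engineer, so in a sense I was lucky: I effectively got an assessment from a PE's perspective.
And a funny epilogue: not long after the quarter of my promotion, the company dropped the TPA requirement for L6 entirely, though L7 promotions still require it. The reason was complaints that the process took too much time and too many resources: a single TPA often took months to gather your data and artifacts and write up a detailed assessment report. As the company grew rapidly and headcount kept rising, the process simply could not scale.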
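A Word on IC vs. Manager
Over the years, on every team I have been part of, leaders have invited or encouraged me to move from IC to management, at least three times in all. I have even briefly led a team as a people manager. But so far I have stayed on the IC path, at least for now.
Early on, I was deeply inspired by the blog post "Belonging to Amazon's Principal Engineering Community" by Carlos Arguelles (Senior Principal SDET at Amazon), and by Tanya Reilly's "The Staff Engineer's Path". I was also lucky enough to talk with both Carlos and Tanya in person. For a long time I loved the picture of the IC path they painted. And at Amazon, Principal Engineers are far fewer than people managers at the same level, which made the senior IC path feel scarcer and more appealing to me. So I declined most of the invitations to move into management.
That said, I did take the internal I2M (IC-to-Manager) course. It gave me a lot of perspective, which has genuinely helped me operate as a senior IC across the company. I remain open-minded about both the IC and the manager path.
With the current wave of technological innovation, I am also seeing organizational structures change dramatically. As for which path is better, IC or manager, I have no answer right now, but I will keep watching.
Farewell Letters
A small personal archive: the farewell letters I sent when leaving each team. (All were originally written in English.)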
AWS Payments
Subject: Farewell CP & AWS!
Dear Bill Runners,
What a long time! It have been 3 years. As part of you already know, I am going to explore an unknown journey outside of aws in 2019. I joined CP right after my master graduation. And it is time for me to graduate from CP. I would like to thank you for your help and support along the way. You will be missed!
Grateful for all of lovely peers I worked with, you gave me a plenitude and meaningful three years. We have accomplished a lot together. Lots of incredible things happen: MDB Disaggregation, Consolidated Invoicing, EMEA CSOR, 10x, CICD.. It is an unforgettable experience working with such a talented group of people. I do enjoy it and learnt a lot.
I want to take the opportunity to say thank you to Martin Larricart, Jeff Zhang, Richard Rothstein and Weiyan Zhong for their continue support and coaching. You like lighthouse oversea that guide me correct direction on my career growth.
I will still be in Amazon. I wish you all the best in your future endeavors and hope to stay in touch.
Best Regards,
Yiming
Alexa Health & Wellness
Subject: Farewell Alexa health
Hi Team,
Today is my last day on Alexa health team. As part of you already know, I am going to back to aws for a new track in container and serverless area. Time flies, I joined Alexa health on 2019-3. And I have been in Amazon 4yr3mons. I always ask myself what things will look like if having chance to be a 5-year Amazonian one day. It is the hardest decision this tough year. Hope it will work out in the end.
Begin with the date I joined amazon, I start hearing alexa news. 2016, might be the fastest growth period of alexa. Voice assistant is such a cool product and unknown domain for me at that moment. I felt grateful meeting such an opportunity in 2019 unboxing it, joining it and participating in building it. I learnt a lot.
I would like to thank all of lovely peers I worked with. I still have a fresh memory how we launched medication management system from scratch in 2019. We have accomplished a lot together in this one half year: medication management, refill, hipaa self-service, a4hc, ehr auth session.. Team's energy and cohesion impressed me. I might never ever be able to find a team with such good atmosphere. I am so proud of being part of it. I will miss all of you guys.
I want to take the opportunity to say thank you to Litao, Haiyang. Appreciate for trust and all of opportunities. I did feel being valued. I also want to thanks all of leaderships in Alexa health org. Appreciate for constructing such a transparent, connected work space.
Again, thanks for having me on the team. I wish you all the best in your future endeavors.
I will still be in Amazon. My new office is couple blocks away from blueshift. Don't be a stranger, it is a small industry and let's keep in touch.
Linkedin: linkedin.com/in/pengyiming
Best regards,
Yiming
AWS App Runner / Containers
Dear friends,
Appreciate your time. Reach out here to share some recent updates. (Excuse me, I know, this is the “season”..🙁)
I will join another product team under Storage org in next week. It is the hardest decision to make. Especially leaving a product babysitting from 0 to 1 since its stealth mode. I love our teams, products and people here.
Back to Day1 when I was at Fargate, Mats reached out to me and we first-time talked about project “Fusion” (App Runner's code name):
“It's all goodness for us, as Elastic Beanstalk has joined our org and we are doing an exciting project with them this year. This is a S-team goal for us, targeting a re:Invent launch and we think it will change how customers think about running applications on AWS. The project is code named Fusion and builds on top of Fargate.. Sound interesting?”
It has been almost 3 years for me being in Elastic Containers org. Fargate on Firecracker, RoadRunner (now it is “Seekable OCI”), Fusion→App Runner GA, VPC, X-Ray, Route53, Private Service, WAF etc. It has been such a long non-stopped running. Now it is the time I plan to slow down a little bit and revisit next step in life as a new mid-age young man.
I am gratitude on all the journey and lovely people I have met and worked with in past few years.
Sincerely best wishes to all of my dear teammates and friends here. See you around, I will be around.. “Thanks” to Pandemic, I still haven't got chance yet meet with part of you in person. New office location is across the street away. Happy to grab a coffee together if there is an opportunity. Let's keep connection.
I am still passionate in Cloud-Native, Containers, Serverless and Open-Source. No plan to stop invest energy in these technical fields. Part of you might know, I am hosting an open-source community “CloudNative-Serverless-Meetup” on GitHub. Happy to keep in touch from there if you like it. Seats offered on the couch.😉
Never say never, it is a small world. Containers “Days” is enjoyable. GO ECS! GO App Runner! GO Beanstalk!
Best and sincerely,
Yiming
linkedin.com/in/pengyiming
Amazon S3
Follow your curiosity, and the career will follow.