Manage and scale products with hundreds of millions in ARR. Focus areas: model inference optimization, customization integration, and benchmarking. Founding Engineer, Inference Optimization (SageMaker AI & Bedrock Science) 0→1 (re:Invent 2025): Led adaptive inference optimization for open-weight models, enabling customers to optimize models for different use cases and priorities, including throughput (speculative decoding), latency (kernel tuning), and cost efficiency. Architected and built the draft model training system: distributed training, hidden-state preparation, dataset format conversion, multi-phase training, hyperparameter optimization (HPO), and training duration optimization. Collaborated with George Karypis (ex-Senior Principal Scientist, AWS; Professor, University of Minnesota; 2025 ACM SIGKDD Innovation Award recipient); Bedrock Anthropic LLM Inference 0→1: Designed and built Bedrock’s Anthropic inference engine (Bedrock Forklift), with capabilities including disaggregated and multi-node inference, context-aware routing, tokenization, and trust & safety. Led the setup of the cross-company benchmark methodology and the production releases of Claude 3 Opus, 3.5 Sonnet v2, and Haiku. Partnered with Ben Mann (Co-founder, Anthropic); Claude inference on Trainium chip 0→1 (re:Invent 2024): Delivered the first Claude model release on the Trainium (Inferentia) platform and enabled 90% of Bedrock traffic to run on Trainium chips. Partnered with James Bradbury (Head of Compute, Anthropic) and Ron Diamant (VP & Distinguished Engineer, Annapurna ML); Anthropic–Palantir partnership, compute for classified environments 0→1. (Featured in Q1 and Q3 2024 earnings reports and publicly underscored by CEOs.) Additional privileged and confidential initiatives.
UTL for an organization of 70 (4 two-pizza teams) managing two products with hundreds of millions in ARR. Co-presented at DockerCon 2022 with Inbal Shani (CPO, Twilio; former CPO, GitHub).
Founding Engineer, AWS App Runner 0→1, a serverless compute product incubated from Fargate. Featured in the Q2 2021 earnings report and publicly underscored by the CEO. Founding Engineer, AWS App Runner Observability 0→1.
Founding Engineer, medication management 0→1 (featured in the Q4 2019 earnings report) and HIPAA-Eligible Skill 0→1 (2020). Partnered with Rachel Jiang, Yvonne Chou (Chief of Staff, AI2), and Missy Krasner (former VP, Box).