Manage and scale products with hundreds of millions in ARR. Focus areas: model customization, inference optimization, and evaluation and benchmarking. Founding Engineer, Inference Optimization (SageMaker & Bedrock Science) 0→1 (re:Invent 2025): Led adaptive inference optimization for open-weight models, launching a public feature that lets customers achieve 2–3× OTPS gains with minimal quality loss through speculative decoding and quantization. Architected and built the draft model training system, including distributed training, hidden-state preparation, and duration optimization. Collaborated with George Karypis (Senior Principal Scientist, AWS; Distinguished McKnight University Professor, University of Minnesota; 2025 ACM SIGKDD Innovation Award recipient); Bedrock Anthropic LLM Inference 0→1: Designed and built Bedrock’s Anthropic inference engine (Bedrock Forklift), with capabilities including disaggregated and multi-node inference, context-aware routing, evaluation and benchmarking, tokenization, and trust & safety. Led production releases of Claude 3 Opus, 3.5 Sonnet v2, and Haiku. Partnered with Ben Mann (Anthropic Co-founder); Claude inference on Trainium chips 0→1 (re:Invent 2024): 90%+ of Bedrock traffic on Trainium. Partnered with James Bradbury (Head of Compute, Anthropic) and Ron Diamant (VP & Distinguished Engineer, Annapurna ML); Anthropic–Palantir partnership, compute for classified environments 0→1. (Featured in Q1 and Q3 2024 earnings reports and publicly underscored by CEOs.) Additional privileged and confidential initiatives.
UTL for an organization of 70 (4 two-pizza teams) managing two products with hundreds of millions in ARR. Presented new product features at DockerCon with Inbal Shani (CPO, Twilio; former CPO, GitHub).
Founding Engineer, AWS App Runner 0→1, a serverless compute product incubated from Fargate. Featured in the Q2 2021 earnings report and publicly underscored by the CEO. Founding Engineer, AWS App Runner Observability 0→1.
Founding Engineer, medication management 0→1 (featured in the Q4 2019 earnings report) and HIPAA-Eligible Skill 0→1 (2020). Partnered with Rachel Jiang, Yvonne Chou (Chief of Staff, AI2), and Missy Krasner (former VP, Box).