We’re looking for hands-on builders: intellectually curious, deeply technical leaders eager to shape the future of AI and ML at Whatnot. You’ll lead the development and scaling of the core infrastructure that powers machine learning and self-hosted large language model applications across the company, working side by side with machine learning scientists to bring cutting-edge models powered by near-realtime features into production and unlock entirely new product experiences. This means building systems that make advanced ML dependable and fast at scale, from low-latency deep learning model serving and streaming feature ingestion to distributed training and high-throughput GPU inference. This is a management role that requires strong technical depth; candidates should be excited about getting and staying in the weeds. You will be expected to raise the level of architectural discussions, provide technical feedback, and code at least one day a week.
Job Responsibilities:
Own the infrastructure powering AI and ML models across critical business surfaces, supporting growth, recommendations, trust and safety, fraud, seller tooling, and more
Guide the prototyping, deployment, and productionization of novel ML architectures that directly shape user experience and marketplace dynamics
Help design and scale inference infrastructure capable of serving large models with low latency and high throughput
Oversee and evolve real-time feature pipelines that feed both our online and offline stores, ensuring single-second feedback from behavioral signals, high reliability, and model training fidelity
Drive feature platform improvements and expand scope to cover non-ML use cases such as fraud rules where point-in-time backtesting is also critical
Lead the development of distributed training and inference pipelines leveraging GPUs and both model and data parallelism
Optimize system performance by managing resource utilization and developing intelligent feature caching strategies
Empower scientists to iterate faster by building abstractions, APIs, and developer tools that simplify the development of near-realtime features and model iteration
Roll out ever-better ergonomics around model training and deployment
Stretch beyond your comfort zone to take on new technical challenges as we scale AI across Whatnot’s ecosystem
Requirements:
4+ years of engineering management experience developing production machine learning systems at consumer-scale loads
Bachelor’s degree in Computer Science, Statistics, Applied Mathematics or a related technical field, or equivalent work experience
5+ years of hands-on software engineering experience building and maintaining production systems for consumer-scale loads
1+ years of professional experience developing software in Python
Ability to work autonomously, drive initiatives across multiple product areas, and communicate findings to leadership and product teams
Experience with operational, search, and key-value databases such as PostgreSQL, DynamoDB, Elasticsearch, Redis
Experience working with ML-specific tools and frameworks such as MLflow, LitServe, TorchServe, Triton
Firm grasp of monitoring, logging, and visualization tools, e.g., Datadog, Grafana
Familiarity with cloud computing platforms, managed services, and streaming systems such as AWS SageMaker, Lambda, Kinesis, S3, EC2, EKS/ECS, Apache Kafka, Flink
Professionalism in collaborating in a remote working environment and in producing well-tested, reproducible work
Exceptional documentation and communication skills
What we offer:
Generous Holiday and Time off Policy
Health Insurance options including Medical, Dental, Vision
Work From Home Support
Home office setup allowance
Monthly allowance for cell phone and internet
Care benefits
Monthly allowance for wellness
Annual allowance towards Childcare
Lifetime benefit for family planning, such as adoption or fertility expenses
Retirement
401(k) offering for Traditional and Roth accounts in the US (employer match up to 4% of base salary) and pension plans internationally
Monthly allowance to dogfood the app
Parental Leave
16 weeks of paid parental leave + one month gradual return to work