Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to do before. We focus on performant and efficient model inference, as well as accelerating research progress via model inference. We are looking for an engineer who wants to take the world's largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.
Job Responsibilities:
Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production
Work alongside researchers to enable advanced research through awesome engineering
Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack
Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues
Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware
Requirements:
Understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference
Ability to own problems end-to-end and willingness to pick up whatever knowledge you're missing to get the job done
At least 5 years of professional software engineering experience
Familiarity with PyTorch, NVIDIA GPUs and the software stacks that optimize them (e.g., NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, and NVLink
Experience architecting, building, observing, and debugging production distributed systems
Experience rebuilding or substantially refactoring production systems several times over due to rapidly increasing scale
A self-directed working style and enjoyment in figuring out the most important problem to work on
A humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed
Nice to have:
Experience working on performance-critical distributed systems
What we offer:
Equity
Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible
Relocation support for eligible employees
Additional taxable fringe benefits, such as charitable donation matching and wellness stipends