Inference Technical Lead Job at OpenAI (San Francisco)

Senior AI Technical Lead

Location

India, Bengaluru

Salary:

Not provided

Randstad

Expiration Date

August 23, 2026

Requirements

Bachelor's degree in Computer Science, Engineering, or related field
6+ years of experience in AI/ML engineering, with strong expertise in Generative AI and agentic systems
Proficiency in Python and modern programming languages
Experience with orchestration frameworks (LangChain, Hugging Face, OpenAI)
Strong knowledge of cloud platforms (AWS, Azure), containerisation (Docker, Kubernetes), and CI/CD pipelines
Experience with infrastructure-as-code (Terraform, CloudFormation) and API management
Familiarity with observability tools and secure coding practices
Excellent leadership, stakeholder engagement, and communication skills
Proven ability to lead technical teams and deliver complex AI solutions in agile environments

Job Responsibility

Lead technical design decisions and resolve blockers across portfolio use cases
Drive cloud-native development practices, including secure handling of secrets, environment variables, and service configurations
Manage API specifications and service integrations across platforms
Ensure robust CI/CD pipelines for AI systems and microservices
Implement team-based access control, code ownership, and repository governance
Apply infrastructure-as-code practices for scalable deployments
Support agile software development processes and ensure engineering consistency
Design and deploy conversational agents and multi-agent systems
Orchestrate agent workflows, manage context, and integrate external data sources
Implement retrieval-augmented generation (RAG) patterns and scalable inference strategies

What we offer

Inclusive parental leave entitlements for both parents
Values-led culture
Flexible work options
Generous annual leave, sick leave and casual leave
Cultural and religious leave with flexible public holiday opportunities
A competitive remuneration package featuring performance-based incentives with uncapped Employer Provident Fund

Fulltime

AI Engineer - Technical Lead

We are seeking a highly skilled and innovative AI Engineer – Technical Lead to j...

Location

India, Chennai

Salary:

Not provided

OptiSol Business Solutions

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, ECE, or related fields
Strong proficiency in Python with libraries such as TensorFlow, PyTorch, OpenCV, Scikit-learn
Solid understanding of deep learning concepts, optimization techniques, and transfer learning
Hands-on expertise in Computer Vision (Detection, Segmentation, Tracking, Recognition)
Experience with Edge AI deployment and optimization on Jetson and GPU-based devices
Proficiency in electronics prototyping and AI hardware integration
Experience integrating Machine Vision cameras & lenses
Working knowledge of RTSP, TCP/IP, Modbus, UART, MQTT
Experience in Mobile SDK / Framework development and delivering integrated applications
Hands-on experience with Arduino, Raspberry Pi, servos, relays, motor drivers, Ethernet switches

Job Responsibility

Lead and mentor a team of AI engineers and researchers in delivering high-quality solutions
Foster a culture of innovation, collaboration, and technical excellence
Work closely with stakeholders to define requirements, timelines, and deliverables
Design and develop advanced machine learning and computer vision algorithms for industrial use cases
Implement solutions involving object detection, segmentation, tracking, and recognition
Oversee data collection, preprocessing, and feature engineering pipelines
Drive research initiatives and integrate emerging AI technologies into production systems
Optimize and deploy AI models on edge devices such as NVIDIA Jetson (Orin/Nano), Raspberry Pi, and GPU-based platforms
Configure edge systems for real-time AI inference and production readiness
Design and prototype electronic systems for AI model integration

What we offer

Opportunity to work on cutting-edge AI, vision, and edge-computing solutions

Fulltime

Lead Machine Learning Inference Engineer, Advertising

Roku is changing how the world watches TV. Roku is the #1 TV streaming platform ...

Location

United States, Austin

Salary:

Not provided

Roku

Expiration Date

Until further notice

Requirements

M.S. or above in CS, ECE, or a related field
10+ years of experience in developing and deploying large-scale, distributed systems, with at least 5 years in a leadership or technical lead role
Strong programming skills in high-performance languages
Deep understanding of inference frameworks and ML system deployment
Proven experience optimizing performance for large-scale machine learning systems, including a deep knowledge of SOTA model optimizations, hardware-software co-design, GPU acceleration, and HPC techniques
Excellent communication and collaboration skills
Experience leading teams working on high-throughput, low-latency ML serving systems
Experience collaborating with and leading global, cross-functional teams
Contributions to open-source ML or systems projects

Job Responsibility

Lead the design and development of a SOTA Inference platform
Oversee the development of monitoring, observability, and other tooling to ensure system and model performance, reliability, and scalability of online inference services
Identify and resolve system inefficiencies, performance bottlenecks, and reliability issues, ensuring optimized end-to-end performance
Stay at the forefront of advancements in inference frameworks, ML hardware acceleration, and distributed systems, and incorporate innovations where and when they are impactful

What we offer

Global access to mental health and financial wellness support and resources
Healthcare (medical, dental, and vision) where applicable
Life, accident, disability, commuter, and retirement options (401(k)/pension) where applicable
Time off in accordance with local leave policies

Fulltime

Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...

Location

United States, San Jose, California; San Francisco, California; New York, New York; Cambridge, Massachusetts; McLean, Virginia

Salary:

229900.00 - 286200.00 USD / Year

Capital One

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
At least 6 years of experience programming with Python, Go, Scala, or Java

Job Responsibility

Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One
Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability, etc.
Leverage a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more
Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems
Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One

What we offer

Cash bonus(es)
Long-term incentives (LTI)
Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being

Fulltime

Database Technology Product Manager

We are looking for a Database Technology Product Manager to lead the strategy, d...

Location

United States, Los Angeles

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

Strong academic background with relevant credentials that support advanced work in product, data, or technology leadership
Demonstrated success building and leading comparable programs, with a track record of delivering results in complex or emerging environments
Deep knowledge of machine learning engineering, data modeling, and large-scale datasets, with the ability to engage credibly with technical teams
Expertise in hospitality customer behavior and the application of data science and marketing insights to product development
Executive-level communication skills with experience presenting strategic updates, investment rationale, and performance outcomes to senior leadership or boards
Entrepreneurial mindset with the ability to operate effectively in fast-moving, ambiguous, and high-growth settings
Proven ability to make sound, data-informed decisions while balancing risk, opportunity, and business impact
Willingness to commit to a long-term initiative and create visible value quickly in a contract setting with potential for a permanent role

Job Responsibility

Lead the full product lifecycle for database and data science initiatives, from early concept development through deployment, measurement, and ongoing enhancement
Act as the central point of coordination between executive leadership, business stakeholders, and engineering teams to keep priorities aligned and delivery on track
Establish and grow a technical product management function focused on research and development initiatives within the organization
Define product strategy, investment priorities, and roadmap decisions for proprietary data products that support customer insight and business growth
Develop a data science inference capability that converts large and complex datasets into recommendations, decision support, and behavioral signals across hospitality experiences
Deliver clear updates to senior executives and board-level audiences, outlining progress, risks, outcomes, and future plans
Guide prioritization and governance discussions by evaluating trade-offs, setting benchmarks, and supporting funding-related decision processes
Partner with technical teams to translate machine learning, data modeling, and analytics concepts into practical product direction and measurable value
Help bring specialized products to market responsibly while serving as a strategic interface between developers and customer-facing partners

What we offer

medical
vision
dental
life and disability insurance
401(k) plan

Principal Software Engineering Manager - Substrate Efficiency

M365 Copilot inference is a high-impact engineering team advancing applied AI an...

Location

United States, Redmond

Salary:

142800.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Build and lead a high-performing engineering team focused on inference runtime efficiency and model execution performance
Define and drive strategy to improve throughput per GPU through runtime optimizations
Increase engineering agility, enabling faster experimentation, iteration, and rollout of performance improvements
Partner across M365 Core, AI Core, Azure, and Microsoft Research to co-design and productionize advanced inference optimizations
Establish metrics, telemetry, and experimentation frameworks to measure efficiency gains and guide investment decisions
Own live-site performance, reliability, and operational excellence for inference engines at scale
Drive alignment across partner teams on engine interfaces, performance goals, and optimization priorities

Fulltime

Senior Machine Learning Engineer, AI Platform

Location

United States; Canada

Salary:

128000.00 - 171000.00 CAD / Year

Mozilla

Expiration Date

Until further notice

Requirements

Bachelor’s degree with 4–6 years of relevant industry experience, or Master’s degree with significant hands-on experience building and operating production ML systems, or equivalent work experience
Strong experience developing in Python for machine learning systems, backend services, or distributed data processing
Proven experience deploying and operating ML workloads in cloud environments, including production-grade infrastructure
Solid understanding of model serving architectures, inference pipelines, and performance tradeoffs (latency, throughput, cost, scaling strategies)
Hands-on experience working with GPU-based workloads and accelerated computing in production settings
Experience designing CI/CD pipelines and development workflows that support reliable ML system deployment
Ability to independently scope and drive technical initiatives while balancing product and operational priorities
Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems
Clear and effective communication skills, with experience collaborating across engineering, product, and infrastructure teams

Job Responsibility

Design, build, and operate core AI platform components used to train, deploy, and serve machine learning models in production environments
Own model serving and inference workflows end-to-end, driving improvements in reliability, scalability, performance, and operational excellence
Lead efforts to optimize inference systems for throughput, latency, and cost efficiency across CPU and GPU workloads
Design and manage GPU-based inference and training workloads, including performance tuning, capacity planning, and resource utilization optimization
Own and improve critical parts of the model lifecycle, including packaging, versioning, testing strategies, validation, and deployment automation
Implement and evolve observability practices (metrics, logging, tracing, alerting) to improve visibility and operational resilience of ML services and pipelines
Partner closely with product, infrastructure, security, and data teams to design scalable platform capabilities that enable AI-powered features
Contribute to technical design discussions, propose architectural improvements, and mentor junior engineers through code reviews and knowledge sharing
Participate in and help improve operational processes, including incident response, on-call rotations, and post-incident reviews

What we offer

Generous performance-based bonus plans to all eligible employees
Rich medical, dental, and vision coverage
Generous retirement contributions with 100% immediate vesting (regardless of whether you contribute)
Quarterly all-company wellness days where everyone takes a pause together
Country-specific holidays plus a day off for your birthday
One-time home office stipend
Annual professional development budget
Quarterly well-being stipend
Considerable paid parental leave
Employee referral bonus program

Fulltime

AI Platform Developer

The AI Developer - Platform is responsible for designing, building, and maintain...

Location

India, Bengaluru

Salary:

Not provided

Randstad

Expiration Date

August 03, 2026

Requirements

Bachelor's degree in Computer Science, Engineering, or related field
3+ years of experience in AI/ML development, with strong exposure to platform engineering or backend systems
Hands-on experience with Generative AI frameworks (OpenAI, Hugging Face, LangChain) and orchestration tools
Proficiency in Python and familiarity with API development (REST, GraphQL)
Experience with cloud platforms (AWS, Azure), containerisation (Docker, Kubernetes), and CI/CD pipelines
Knowledge of platform components such as model hosting, inference APIs, and vector databases
Strong problem-solving skills and ability to write clean, maintainable, and secure code
Familiarity with Responsible AI principles, Well-Architected Framework and enterprise security standards

Job Responsibility

Follow secure coding practices and ensure compliance with enterprise cybersecurity and Responsible AI principles
Contribute to reducing operational risk by implementing reliable and auditable platform components
Develop and maintain reusable AI platform services, APIs, and frameworks that enable rapid deployment of AI solutions
Implement scalable and modular components for model hosting, inference, and orchestration
Integrate Generative AI capabilities (e.g., LLMs, prompt orchestration, vector databases) into platform services
Ensure platform components meet performance, reliability, and observability standards
Collaborate with platform architects and AI Platform Lead to align development with enterprise architecture and platform strategy
Document technical designs, workflows, and deployment processes for knowledge sharing and compliance
Work closely with AI Platform Lead, AI Tech Chapter Lead, and other developers to ensure consistency in engineering practices
Engage with data engineering, cybersecurity, and cloud teams to ensure secure and compliant integrations

What we offer

Commitment to your ongoing development, including on the job opportunities and formal programs
Inclusive parental leave entitlements for both parents
Values-led culture
Flexible work options
Generous annual leave, sick leave and casual leave
Cultural and religious leave with flexible public holiday opportunities
A competitive remuneration package featuring performance-based incentives with uncapped Employer Provident Fund

Fulltime
