Sr. Deployment Engineer, AI Inference

Cerebras Systems

Location:
United States; Canada, Sunnyvale

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and lets machine learning users run large-scale ML applications without the hassle of managing hundreds of GPUs or TPUs.

We are seeking a highly skilled and experienced Sr. Deployment Engineer to build and operate our cutting-edge inference clusters. In this role, you will work with the world's largest computer chip, the Wafer-Scale Engine (WSE), and the systems that harness its power. You will play a critical role in ensuring reliable, efficient, and scalable deployment of AI inference workloads across our global infrastructure.

Job Responsibility:

  • Deploy AI inference replicas and cluster software across multiple datacenters
  • Operate across heterogeneous datacenter environments undergoing rapid 10x growth
  • Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
  • Operate bare-metal inference infrastructure while supporting transition to K8S-based platform
  • Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
  • Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
  • Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
  • Stay up to date with the latest advancements in AI compute infrastructure and related technologies

Requirements:

  • 5-7 years of experience in operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Compute) or in developing and managing complex AWS plane infrastructure for hybrid deployments
  • Strong proficiency in Python for automation, orchestration, and deployment tooling
  • Solid understanding of Linux-based systems and command-line tools
  • Extensive knowledge of Docker containers and container orchestration platforms like K8S
  • Familiarity with spine-leaf (Clos) networking architecture
  • Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
  • Strong ownership mindset and accountability for complex deployments
  • Ability to work effectively in a fast-paced environment

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Thrive in a simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Sr. Deployment Engineer, AI Inference

Sr. Distinguished AI Engineer

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location:
United States, Cambridge, Massachusetts; New York, New York; Richmond, Virginia; San Jose, California; McLean, Virginia; San Francisco, California
Salary:
280600.00 - 384200.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 10 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 8 years of experience developing AI and ML algorithms or technologies
  • At least 10 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS Ultraclusters, Huggingface, VectorDBs, Nemo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, throughput — of large scale production AI systems
  • Contribute to the technical vision and the long term roadmap of foundational AI systems at Capital One
What we offer
  • comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
Employment Type:
Fulltime

Sr. Lead AI Engineer

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location:
United States, New York; San Francisco; San Jose; Cambridge; McLean
Salary:
229900.00 - 286200.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
  • At least 6 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS Ultraclusters, Huggingface, VectorDBs, Nemo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, throughput — of large scale production AI systems
  • Contribute to the technical vision and the long term roadmap of foundational AI systems at Capital One
What we offer
  • comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
Employment Type:
Fulltime

Sr. Engineer, ML Platform

As the leading delivery platform in the region, we have a unique responsibility ...
Location:
United Kingdom, London
Salary:
Not provided
Delivery Hero
Expiration Date
Until further notice
Requirements
  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field
Job Responsibility
  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement
Employment Type:
Fulltime

Sr Machine Learning Engineer

About the Role
Location:
United States, Raleigh
Salary:
Not provided
Beacon Hill
Expiration Date
Until further notice
Requirements
  • 10+ years experience building production-grade ML systems at scale
  • Strong LLM, Generative AI, RAG deployment experience
  • Expertise designing systems in cloud environments (AWS, Azure, or GCP)
  • Hands-on work with Kubernetes, containerization, and scalable inference systems
  • Experience designing agentic systems and tool orchestration frameworks
  • Ability to implement and govern MCP servers or structured architectures
  • Strong Python background
Job Responsibility
  • Define reference architecture for LLM, ML, and agent-based systems across products
  • Design high-availability, low-latency inference platforms for global scale
  • Establish reusable platform components for model lifecycle, deployment, and monitoring
  • Architect multi-step, reasoning-driven agent systems
  • Design orchestration patterns for tool use, API invocation, and structured function calling
  • Lead implementation and governance of Model Context Protocol (MCP) servers to standardize tool integration and context management
  • Define guardrails, permissions, and audit mechanisms for enterprise-safe AI systems
  • Set best practices for MLOps, CI/CD, observability, and system reliability
  • Embed Responsible AI principles across platform architecture
  • Mentor senior engineers and influence technical direction across teams

Sr Staff Engineer Software, Fullstack (Prisma AIRS) - NetSec

Join our team building a cutting-edge multi-tenanted GenAI Security Platform tha...
Location:
India, Bengaluru
Salary:
Not provided
Palo Alto Networks
Expiration Date
Until further notice
Requirements
  • Proven experience building and scaling multi-tenant SaaS platforms with strict data isolation
  • Strong knowledge of API design, RESTful principles, and OpenAPI specifications
  • Proficiency in modern JavaScript frameworks (React, Vue, or Svelte) with TypeScript
  • Experience building data-intensive dashboards with complex visualisations and real-time data
  • Strong CSS/styling skills and responsive design principles
  • Demonstrated experience working with production AI/ML systems at scale
  • Practical experience integrating LLM APIs and managing inference at scale
  • Understanding of LLM operational challenges: rate limiting, cost optimisation, latency management, fallback strategies
  • Familiarity with AI agent frameworks (LangChain, AutoGen, MCP, or similar)
  • Knowledge of prompt engineering, semantic search, and vector databases
Job Responsibility
  • Design and implement high-performance REST APIs with enterprise-grade multi-tenant isolation and strict security boundaries
  • Work on distributed systems architecture handling high-throughput workloads with mission-critical uptime requirements
  • Build responsive dashboards and administrative interfaces for platform management, data visualisation, and system configuration
  • Integrate multiple LLM providers, implement semantic search capabilities, and build intelligent agent workflows
  • Architect complex, multi-step AI evaluation pipelines for asynchronous job execution and large-scale data processing
  • Design and implement database schemas with proper indexing, query optimisation, and data isolation strategies
  • Build and maintain scalable micro-services with async/await patterns and type-safe code
  • Develop data-intensive UIs with real-time updates, complex state management, and intuitive user experiences
  • Deploy and manage containerised applications on Kubernetes with comprehensive observability
  • Write thorough tests (frontend and backend) and maintain high code quality standards with automated tooling
Employment Type:
Fulltime

Sr Data Scientists

Sr Data Scientists is located in Frisco, TX and will support teams’ mission to p...
Location:
United States, Frisco
Salary:
141773.00 - 155000.00 USD / Year
T-Mobile
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Mathematics, Statistics, Economics, Computer Science, Physics, Electronic Engineering, or related, and 5 years of relevant work experience
  • Master’s degree in Mathematics, Statistics, Economics, Computer Science, Physics, Electronic Engineering, or related, and 3 years of relevant work experience
  • Experience in developing and deploying predictive models, advanced machine learning, deep learning, NLP, and generative AI solutions by applying a wide range of algorithms
  • Experience in developing solutions using Python, PySpark, SQL, and R, with libraries LangChain, LangGraph, Keras, Pandas, NumPy, SciPy, Matplotlib, and Scikit-Learn
  • Experience in working with data querying, wrangling, cleaning, and feature engineering across relational and non-relational databases: SQL, Snowflake, and Redshift in big data environments: Azure, AWS, and GCP, and leveraging Spark, Hadoop, Hive, and Kafka
  • Experience in building CI/CD pipelines, automating training and retraining workflows, deploying inference services, and monitoring ML algorithms in production environments in Databricks using tools: MLflow, and cloud-native services
  • Experience in articulating and reframing business problems, applying statistical and advanced analytics techniques in Python, R, and SQL, and leveraging SciPy, Scikit-Learn, and PySpark to generate actionable insights and recommendations
  • Experience in delivering impactful, data-driven presentations and effectively communicating machine learning and analytical concepts to technical teams, business stakeholders, and senior leadership, supported by visualizations created in Tableau, Power BI, Matplotlib, and Seaborn
  • At least 18 years of age
  • Legally authorized to work in the United States
Job Responsibility
  • Support business partners and product owners in understanding business challenges, developing business cases, capturing requirements, and co-creating solutions that drive business change, solve those challenges, and deliver impactful business outcomes
  • Provide senior-level guidance and mentorship to the data science team, including reviewing projects, models, and code for peers and junior team members
  • Design advanced analytics to solve business problems; preprocess and perform exploratory data analysis on structured and unstructured data; create features based on expertise in the domain; use predictive modeling techniques and statistical analysis to predict outcomes and behaviors
  • Leverage the Agile methodology to ensure alignment of data science roadmap, features, and stories to business priorities and value streams
  • Collaborate with a cross-functional team of other data scientists, data engineers, ML engineers, and data analysts
  • Partner with other technology partners such as architects, engineers, product managers, scrum masters, release train engineers, and agile coaches to deliver on targeted business outcomes
What we offer
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Annual bonus or periodic sales incentive or bonus
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
Employment Type:
Fulltime

Assistant sales manager

We are looking for a talented Assistant Sales Manager for an established MNC that...
Location:
Malaysia, Klang
Salary:
8000.00 - 10000.00 MYR / Month
Randstad
Expiration Date
May 04, 2026
Requirements
  • Bachelor’s degree/diploma or equivalent sales training or experience
  • Availability for regular business travel
  • Extensive business-to-business sales experience, preferably in similar industrial packaging industries
  • Excellent planning & organizational skills
Job Responsibility
  • Drive Regional Growth: Lead sales expansion in Northern Malaysia by securing new business, exploring untapped markets, and winning large-scale strategic projects
  • Strategic Account Management: Manage and grow existing global accounts by collaborating with international sales teams to execute long-term development plans
  • Consultative Solution Selling: Identify customer needs and pitch tailored packaging solutions, ranging from traditional crates to sustainable, "green" materials
  • End-to-End Project Leadership: Navigate the full sales cycle—from initial consultation and quotation to negotiation and final project execution
  • Operational & ESG Support: Oversee critical business functions including CRM pipeline management, ERP implementation, and the promotion of eco-friendly packaging initiatives
  • Financial & Team Stewardship: Ensure healthy cash flow through diligent debt management while coaching junior team members to enhance regional performance
What we offer
  • bonus
  • travelling & mobile allowances
  • Excellent employee compensation and benefits

Product Owner

The Product Owner is responsible for defining the product vision, prioritizing t...
Location:
United States, Springfield, VA
Salary:
Not provided
Robert Half
Expiration Date
Until further notice
Requirements
  • Proven experience as a Product Owner, Product Manager, or similar role in an Agile environment
  • Strong understanding of Agile methodologies (Scrum, Kanban) and experience working with development teams
  • Ability to translate complex business requirements into clear, actionable user stories
  • Excellent communication, facilitation, and stakeholder‑management skills
  • Analytical mindset with experience using data to inform decisions
  • Familiarity with product management tools (Jira, Azure DevOps, Aha!, or similar)
  • Experience in healthcare, technology, or other regulated industries is a plus
  • Strong problem‑solving skills and the ability to make decisions in a fast‑paced environment
Job Responsibility
  • Develop and communicate a clear product vision, strategy, and roadmap aligned with organizational goals
  • Own and prioritize the product backlog, ensuring clarity, feasibility, and alignment with business value
  • Collaborate closely with cross‑functional teams—including engineering, UX, business stakeholders, and QA—to deliver high‑quality product increments
  • Gather, analyze, and translate business needs into detailed user stories, acceptance criteria, and functional requirements
  • Serve as the primary point of contact for product decisions, providing direction and removing roadblocks for the development team
  • Conduct backlog grooming, sprint planning, and review sessions to ensure efficient Agile delivery
  • Evaluate product performance using data, analytics, and user feedback to drive continuous improvement
  • Partner with stakeholders to define KPIs, measure outcomes, and adjust priorities based on insights
  • Ensure the product meets compliance, security, and regulatory standards (especially relevant in healthcare or other regulated industries)
  • Advocate for the user, ensuring solutions are intuitive, accessible, and aligned with customer needs
What we offer
  • medical
  • vision
  • dental
  • life and disability insurance
  • 401(k) plan