Sr. Deployment Engineer, AI Inference

Cerebras Systems

Location:
United States; Canada, Sunnyvale

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and lets machine learning users run large-scale ML applications without the hassle of managing hundreds of GPUs or TPUs.

We are seeking a highly skilled and experienced Sr. Deployment Engineer to build and operate our cutting-edge inference clusters. In this role you will work with the world's largest computer chip, the Wafer-Scale Engine (WSE), and the systems that harness its power. You will play a critical role in ensuring reliable, efficient, and scalable deployment of AI inference workloads across our global infrastructure.

Job Responsibility:

  • Deploy AI inference replicas and cluster software across multiple datacenters
  • Operate across heterogeneous datacenter environments undergoing rapid 10x growth
  • Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
  • Operate bare-metal inference infrastructure while supporting the transition to a K8S-based platform
  • Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
  • Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
  • Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
  • Stay up to date with the latest advancements in AI compute infrastructure and related technologies

Requirements:

  • 5-7 years of experience in operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Compute) or in developing and managing complex AWS plane infrastructure for hybrid deployments
  • Strong proficiency in Python for automation, orchestration, and deployment tooling
  • Solid understanding of Linux-based systems and command-line tools
  • Extensive knowledge of Docker containers and container orchestration platforms like K8S
  • Familiarity with spine-leaf (Clos) networking architecture
  • Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
  • Strong ownership mindset and accountability for complex deployments
  • Ability to work effectively in a fast-paced environment

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Enjoy a simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Sr. Deployment Engineer, AI Inference

Sr. Distinguished AI Engineer

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location:
United States, Cambridge, Massachusetts; New York, New York; Richmond, Virginia; San Jose, California; McLean, Virginia; San Francisco, California
Salary:
280600.00 - 384200.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 10 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 8 years of experience developing AI and ML algorithms or technologies
  • At least 10 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems
  • Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One
What we offer
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
Employment Type:
Fulltime

Sr. Lead AI Engineer

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location:
United States, New York; San Francisco; San Jose; Cambridge; McLean
Salary:
229900.00 - 286200.00 USD / Year
Capital One
Expiration Date:
Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
  • At least 6 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems
  • Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One
What we offer
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
Employment Type:
Fulltime

Sr. Engineer, ML Platform

As the leading delivery platform in the region, we have a unique responsibility ...
Location:
United Kingdom, London
Salary:
Not provided
Delivery Hero
Expiration Date:
Until further notice
Requirements
  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field
Job Responsibility
  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement
Employment Type:
Fulltime

Sr Machine Learning Engineer

About the Role
Location:
United States, Raleigh
Salary:
Not provided
Beacon Hill
Expiration Date:
Until further notice
Requirements
  • 10+ years of experience building production-grade ML systems at scale
  • Strong LLM, Generative AI, and RAG deployment experience
  • Expertise designing systems in cloud environments (AWS, Azure, or GCP)
  • Hands-on work with Kubernetes, containerization, and scalable inference systems
  • Experience designing agentic systems and tool orchestration frameworks
  • Ability to implement and govern MCP servers or structured architectures
  • Strong Python background
Job Responsibility
  • Define reference architecture for LLM, ML, and agent-based systems across products
  • Design high-availability, low-latency inference platforms for global scale
  • Establish reusable platform components for model lifecycle, deployment, and monitoring
  • Architect multi-step, reasoning-driven agent systems
  • Design orchestration patterns for tool use, API invocation, and structured function calling
  • Lead implementation and governance of Model Context Protocol (MCP) servers to standardize tool integration and context management
  • Define guardrails, permissions, and audit mechanisms for enterprise-safe AI systems
  • Set best practices for MLOps, CI/CD, observability, and system reliability
  • Embed Responsible AI principles across platform architecture
  • Mentor senior engineers and influence technical direction across teams

Sr Staff Engineer Software, Fullstack (Prisma AIRS) - NetSec

Join our team building a cutting-edge multi-tenanted GenAI Security Platform tha...
Location:
India, Bengaluru
Salary:
Not provided
Palo Alto Networks
Expiration Date:
Until further notice
Requirements
  • Proven experience building and scaling multi-tenant SaaS platforms with strict data isolation
  • Strong knowledge of API design, RESTful principles, and OpenAPI specifications
  • Proficiency in modern JavaScript frameworks (React, Vue, or Svelte) with TypeScript
  • Experience building data-intensive dashboards with complex visualisations and real-time data
  • Strong CSS/styling skills and responsive design principles
  • Demonstrated experience working with production AI/ML systems at scale
  • Practical experience integrating LLM APIs and managing inference at scale
  • Understanding of LLM operational challenges: rate limiting, cost optimisation, latency management, fallback strategies
  • Familiarity with AI agent frameworks (LangChain, AutoGen, MCP, or similar)
  • Knowledge of prompt engineering, semantic search, and vector databases
Job Responsibility
  • Design and implement high-performance REST APIs with enterprise-grade multi-tenant isolation and strict security boundaries
  • Work on distributed systems architecture handling high-throughput workloads with mission-critical uptime requirements
  • Build responsive dashboards and administrative interfaces for platform management, data visualisation, and system configuration
  • Integrate multiple LLM providers, implement semantic search capabilities, and build intelligent agent workflows
  • Architect complex, multi-step AI evaluation pipelines for asynchronous job execution and large-scale data processing
  • Design and implement database schemas with proper indexing, query optimisation, and data isolation strategies
  • Build and maintain scalable micro-services with async/await patterns and type-safe code
  • Develop data-intensive UIs with real-time updates, complex state management, and intuitive user experiences
  • Deploy and manage containerised applications on Kubernetes with comprehensive observability
  • Write thorough tests (frontend and backend) and maintain high code quality standards with automated tooling
Employment Type:
Fulltime

Sr Data Scientists

Sr Data Scientists is located in Frisco, TX and will support teams’ mission to p...
Location:
United States, Frisco
Salary:
141773.00 - 155000.00 USD / Year
T-Mobile
Expiration Date:
Until further notice
Requirements
  • Bachelor’s degree in Mathematics, Statistics, Economics, Computer Science, Physics, Electronic Engineering, or related, and 5 years of relevant work experience
  • Master’s degree in Mathematics, Statistics, Economics, Computer Science, Physics, Electronic Engineering, or related, and 3 years of relevant work experience
  • Experience in developing and deploying predictive models, advanced machine learning, deep learning, NLP, and generative AI solutions by applying a wide range of algorithms
  • Experience in developing solutions using Python, PySpark, SQL, and R, with libraries LangChain, LangGraph, Keras, Pandas, NumPy, SciPy, Matplotlib, and Scikit-Learn
  • Experience in working with data querying, wrangling, cleaning, and feature engineering across relational and non-relational databases (SQL, Snowflake, and Redshift) in big data environments (Azure, AWS, and GCP), leveraging Spark, Hadoop, Hive, and Kafka
  • Experience in building CI/CD pipelines, automating training and retraining workflows, deploying inference services, and monitoring ML algorithms in production environments in Databricks using MLflow and cloud-native services
  • Experience in articulating and reframing business problems, applying statistical and advanced analytics techniques in Python, R, and SQL, and leveraging SciPy, Scikit-Learn, and PySpark to generate actionable insights and recommendations
  • Experience in delivering impactful, data-driven presentations and effectively communicating machine learning and analytical concepts to technical teams, business stakeholders, and senior leadership, supported by visualizations created in Tableau, Power BI, Matplotlib, and Seaborn
  • At least 18 years of age
  • Legally authorized to work in the United States
Job Responsibility
  • Support business partners and product owners to understand business challenges, develop business cases, capture requirements, and co-create solutions that drive business change, solve those challenges, and deliver impactful business outcomes
  • Provide senior-level guidance and mentorship to the data science team, including reviewing projects, models, and code for peers and junior team members
  • Design advanced analytics to solve business problems: preprocess and perform exploratory data analysis on structured and unstructured data; create features based on expertise in the domain; use predictive modeling techniques and statistical analysis to predict outcomes and behaviors
  • Leverage the Agile methodology to ensure alignment of data science roadmap, features, and stories to business priorities and value streams
  • Collaborate with a cross-functional team composed of other data scientists, data engineers, ML engineers, and data analysts
  • Partner with other technology partners such as architects, engineers, product managers, scrum masters, release train engineers, and agile coaches to deliver on targeted business outcomes
What we offer
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Annual bonus or periodic sales incentive or bonus
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
Employment Type:
Fulltime

Senior Category Manager – Core Pantry

We are seeking an experienced Sr. Category Manager to own merchandising efforts ...
Location:
United States, Los Angeles
Salary:
110000.00 - 130000.00 USD / Year
RemotivateJobs
Expiration Date:
Until further notice
Requirements
  • 5+ years of experience in a buying or category management role
  • E-commerce experience highly preferred
  • Strong negotiation and vendor relationship skills
  • Deep knowledge of consumer trends and behavior, as well as ingredients and Non-GMO certifications
  • Proven track record of over-delivering on KPIs and financial plans
  • Fluency in technical and tactical merchandising: Excel, PowerPoint, and Google Sheets expertise critical
  • Willingness to roll your sleeves up and get into the day to day of category management and item launches
  • Exemplary communication skills: you can present, write clear/concise emails, and understand the nuances of how best to communicate with different teams and vendors
  • Self-directing and self-correcting: you’re a natural problem solver skilled at prioritization
  • Flexibility and ability to adapt quickly to an ever-evolving environment
Job Responsibility
  • Grow and maintain the Core Pantry categories to meet expansion goals including revenue, margin, and other KPIs while at the same time balancing our ingredient/quality standards
  • Drive category strategy and innovation, including best in class shopping experience for Core Pantry
  • Actively identify, seek out, and cultivate strong vendor relationships with large volume vendors – negotiate the best pricing and margin terms, promotional support, maintain consistent and clear communication, and be a strong representative of Thrive Market’s brand and values in every interaction
  • Review and vet every brand and product that comes through the door to ensure that their ingredients and sourcing practices meet our quality standards
  • Track, measure, and analyze data and report on KPIs and metrics to identify successes and challenges within category performance, action on areas for improvement, and inform new brand/product opportunities
  • Negotiate with partners to secure the most competitive everyday value pricing for our members, accruals, marketing & promotional budgets, rebate growth incentives, delivery, and payment terms
  • Drive the new product/brand vetting and onboarding process to ensure timely addition to market
  • Collaborate closely with Merchandising Marketing team to develop and execute annual promotional strategies for brands and categories
  • Partner with Procurement team on inventory forecasts, new item projections, vendor fulfillment rates, expiry risks, and exiting out of slow-moving inventory
  • Be curious, experimental, and rigorous in your research to make sure our assortment at Thrive Market is on-trend and setting the bar for new products. Know the competitive landscape inside-out
What we offer
  • Comprehensive health benefits (medical, dental, vision, life and disability)
  • Competitive salary (DOE) + equity
  • 401k plan
  • 9 Observed Holidays
  • Flexible Paid Time Off
  • Subsidized ClassPass Membership with access to fitness classes and wellness and beauty experiences
  • Ability to work in our beautiful office in Playa Vista
  • Free Thrive Market membership with exclusive employee discount
  • Coverage for Life Coaching & Therapy Sessions on our holistic mental health and well-being platform
Employment Type:
Fulltime

Plumber

Plumber required for repairs and maintenance to work for a local authority in th...
Location:
United Kingdom, Gateshead
Salary:
25.00 - 26.00 GBP / Hour
Randstad
Expiration Date:
April 08, 2026
Requirements
  • Valid CSCS card (Essential)
  • Driving licence (Essential)
  • Experience working for local authority or social housing provider in tenanted houses
Job Responsibility
  • Repairs and maintenance
  • Kitchen / Bathroom refurbs in tenanted social housing properties
What we offer
  • A competitive pay rate (CIS, PAYE or Umbrella)
  • Opportunity for ongoing work
  • Access to Randstad's training department