Sr. Deployment Engineer, AI Inference

Cerebras Systems

Location:
Sunnyvale, United States; Canada

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and lets machine learning users run large-scale ML applications without the hassle of managing hundreds of GPUs or TPUs.

We are seeking a highly skilled and experienced Sr. Deployment Engineer to build and operate our cutting-edge inference clusters. In this role you will work with the world's largest computer chip, the Wafer-Scale Engine (WSE), and the systems that harness its power. You will play a critical role in ensuring reliable, efficient, and scalable deployment of AI inference workloads across our global infrastructure.

Job Responsibilities:

  • Deploy AI inference replicas and cluster software across multiple datacenters
  • Operate across heterogeneous datacenter environments undergoing rapid 10x growth
  • Maximize capacity allocation and optimize replica placement using constraint-solver algorithms
  • Operate bare-metal inference infrastructure while supporting the transition to a Kubernetes (K8s)-based platform
  • Develop and extend telemetry, observability and alerting solutions to ensure deployment reliability at scale
  • Develop and extend a fully automated deployment pipeline to support fast software updates and capacity reallocation at scale
  • Translate technical and customer needs into actionable requirements for the Dev Infra, Cluster, Platform and Core teams
  • Stay up to date with the latest advancements in AI compute infrastructure and related technologies

Requirements:

  • 5-7 years of experience in operating on-prem compute infrastructure (ideally in Machine Learning or High-Performance Compute) or in developing and managing complex AWS plane infrastructure for hybrid deployments
  • Strong proficiency in Python for automation, orchestration, and deployment tooling
  • Solid understanding of Linux-based systems and command-line tools
  • Extensive knowledge of Docker containers and container orchestration platforms such as Kubernetes (K8s)
  • Familiarity with spine-leaf (Clos) networking architecture
  • Proficiency with telemetry and observability stacks such as Prometheus, InfluxDB and Grafana
  • Strong ownership mindset and accountability for complex deployments
  • Ability to work effectively in a fast-paced environment

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Enjoy a simple, non-corporate work culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Employment Type:
Full-time

Work Type:
Remote work