Senior ML Ops Engineer - Architecture & Strategy

BMW

Location:
Germany, Munich

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We own the platform blueprint for our ML infrastructure: designing systems that integrate with a data mesh of domain-owned data products, leverage Qualcomm Cloud AI 100 and NVIDIA GPU clusters for training at petabyte scale, and produce optimised model artefacts ready for deployment to vehicle hardware. We set technical direction, make build-vs-buy decisions, and ensure the platform scales to hundreds of engineers.

Job Responsibility:

  • You design the reference architecture for the ML platform end-to-end: data ingestion, PB-scale data lake, heterogeneous training clusters, model registry, and deployment-ready artefacts
  • You design the data-format backbone, setting standards for data flows, ingestion, cataloguing, transcoding, and partitioning at PB scale, integrated with dataset management tooling
  • You define the platform component topology and integration contracts for pipeline orchestration, experiment tracking, hyperparameter optimisation, dataset management, observability, and metadata
  • You establish model lifecycle governance, including experiment tracking, approval gates, validation criteria, and clear handoff contracts to deployment teams
  • You drive cost governance at PB scale, including accelerator spot strategies, S3 tiering, cross-AZ traffic reduction, and Kubernetes cluster right-sizing
  • You partner with Security, Legal, and Functional-Safety teams on ISO 26262, ISO 8800, and data-protection compliance

Requirements:

  • University degree in Computer Science, Computer/Electrical Engineering or related subjects
  • 5–8+ years in ML platform or infrastructure engineering, with at least two years in a tech lead or architect role
  • Deep expertise in AWS, Azure, or Google Cloud, ideally with multi-region or multi-account setups
  • Proven track record designing systems for PB-scale data and hundreds of concurrent training jobs, as well as an understanding of large vision models and the challenges of compressing them for automotive-grade SoCs
  • Strong knowledge of Kubernetes platform design, GitOps, and infrastructure-as-code
  • Excellent communication skills to align ML researchers, embedded engineers, data teams, and executives
  • Familiarity with edge model compilation toolchains for Qualcomm (QNN, AIMET) and/or NVIDIA (TensorRT, Triton) and experience with automotive data at scale, such as MDF4, MCAP, ROS bags, and multi-sensor synchronisation

What we offer:

  • Challenging projects with which we shape the mobility of tomorrow together
  • Wide range of personal and professional development opportunities
  • Attractive, fair and performance-related remuneration
  • High level of job security
  • Annual special payments such as vacation pay, Christmas bonus, and profit sharing
  • Flexible working hours, six weeks of annual leave, and overtime compensation
  • Discounted BMW & MINI conditions
  • Many other benefits at bmw.jobs/benefits

Additional Information:

Job Posted:
March 21, 2026

Employment Type:
Full-time