
Principal AI Tooling Engineer

United States, Santa Clara · Employment contract · 147000.00 - 237500.00 USD / Year · Job Posted May 29, 2026

Job Description

Design and develop testing frameworks for AI/ML models and LLM applications. Build automated pipelines for model validation, regression testing, and benchmarking. Create evaluation datasets, synthetic data, and test scenarios for edge cases. Implement metrics to assess accuracy, robustness, latency, and safety. Develop tools for prompt testing, output validation, and hallucination detection. Collaborate with engineers and product teams to define test strategies. Monitor model performance in production and build alerting systems. Ensure compliance with ethical AI standards, fairness, and bias testing. Debug model behavior and identify root causes of failures.

Job Responsibility

  • Design and develop testing frameworks for AI/ML models and LLM applications
  • Build automated pipelines for model validation, regression testing, and benchmarking
  • Create evaluation datasets, synthetic data, and test scenarios for edge cases
  • Implement metrics to assess accuracy, robustness, latency, and safety
  • Develop tools for prompt testing, output validation, and hallucination detection
  • Collaborate with engineers and product teams to define test strategies
  • Monitor model performance in production and build alerting systems
  • Ensure compliance with ethical AI standards, fairness, and bias testing
  • Debug model behavior and identify root causes of failures

Requirements

  • Bachelor's or Master's degree in Computer Science, AI, Machine Learning, or related field
  • 8+ years of experience in software engineering, QA automation, or ML engineering
  • Strong programming skills in Python (preferred) or similar languages
  • Experience with testing frameworks (e.g., PyTest, unittest)
  • Familiarity with machine learning concepts and model evaluation techniques
  • Experience working with APIs, distributed systems, and CI/CD pipelines
  • Knowledge of data structures, algorithms, and software design principles

Nice to have

  • Experience with LLMs and prompt engineering
  • Familiarity with evaluation tools like LangChain, OpenAI Evals, or similar frameworks
  • Knowledge of AI safety, bias detection, and adversarial testing
  • Experience with cloud platforms (AWS, GCP, and Azure)
  • Understanding of observability tools and monitoring systems
  • Exposure to synthetic data generation and simulation environments

What we offer

  • restricted stock units
  • bonus
  • employee benefits


Similar Jobs for Principal AI Tooling Engineer

8 matching positions

Principal Product Manager, Gen AI Developer Tools

Anaconda is seeking a talented Principal Product Manager, GenAI Developer Tools ...
Location: United States
Salary: 162500.00 - 282000.00 USD / Year
Company: Anaconda
Expiration Date: July 01, 2026
Requirements
  • 7+ years of product management experience
  • at least 3 years focused on developer tools, infrastructure, or platform products
  • deep, hands-on experience with AI coding tools (Cursor, GitHub Copilot, Claude, etc.)
  • proven expertise in MCP (Model Context Protocol) development, A2A protocol, agentic systems, or similar AI-to-application integration technologies
  • ability to work closely with engineering teams on API design, system architecture, and implementation trade-offs
  • experience building and managing strategic technology partnerships, particularly with developer tool companies
  • track record of driving product adoption in developer communities
  • excellent written and verbal communication skills
  • deep understanding of Python ecosystem, package management, and enterprise software development workflows
  • Bachelor's degree in Computer Science, Engineering, or related field
Job Responsibility
  • Lead product strategy and roadmap for Anaconda's Agentic Environment & Package Management initiatives, including MCP (Model Context Protocol) server development and AI tool integrations
  • conduct strategic research with enterprise customers, AI tool vendors, and the open-source community
  • drive the development and launch of tools and agents that improve the quality of Python vibe-coding
  • support strategic partnerships with top AI-enabled IDEs (Cursor, VS Code, Windsurf, etc.)
  • define and execute on integration strategies that make Anaconda indispensable to AI coding workflows
  • collaborate with engineering teams to build robust APIs, SDKs, and developer tools
  • work closely with our enterprise customers in regulated industries (finance, healthcare, government) to ensure our AI infrastructure meets their security and compliance requirements
  • partner with marketing and developer relations to drive adoption among the 50+ million Python developers globally
  • measure and optimize key metrics including developer adoption, enterprise package downloads, and AI tool integration usage
  • represent Anaconda at industry conferences, with partners, and in the broader AI/ML community
What we offer
  • Flexible Vacation Policy
  • Medical, Dental, and Vision Insurance
  • Short Term and Long Term Disability
  • Paid Parental Leave
  • Monthly Wellness Stipend
  • Employee Assistance Program and Mental Health Resources
  • annual bonus potential
  • equity participation
  • Fulltime

Principal Site Reliability Engineer (AI-first SRE)

Groupon is modernizing its global platform — and reliability is at the center of...
Location: Peru
Salary: Not provided
Company: Groupon
Expiration Date: Until further notice
Requirements
  • 10+ years in software/systems engineering, including 5+ years in SRE or platform reliability
  • Strong experience with GCP (preferred) or AWS, Kubernetes, and Terraform
  • Proficiency in Python or Go for automation and tooling
  • Deep understanding of observability stacks (Prometheus, Grafana, OpenTelemetry) and service meshes (Istio, Envoy)
  • Hands-on AIOps experience: anomaly detection, predictive analytics, ML-assisted operations
  • Strong communication and influencing skills — data over hierarchy
Job Responsibility
  • Architect and maintain self-healing systems with 99.9%+ availability targets
  • Use AI/ML to automate infrastructure governance and detect configuration or IaC anti-patterns
  • Implement adaptive SLIs/SLOs that evolve automatically from real-time data
  • Build AIOps-based observability and auto-remediation pipelines
  • Apply predictive modeling to forecast failures before they impact users
  • Lead chaos, performance, and resilience testing programs
  • Map platform and service behavior to revenue impact and drive improved revenue resilience through better infrastructure performance
  • Mentor engineers and drive reliability standards across teams
  • Partner with platform, data, and product teams to ensure stability aligns with business goals
  • Support major incident response and incident review, and participate in on-call rotations
What we offer
  • The opportunity to work with cutting-edge technologies in a transformative environment
  • Professional growth and leadership development pathways tailored to your aspirations
  • A chance to leave a lasting impact by shaping the future of reliable and scalable systems

Principal QA Automation Engineer w/ AI experience

We seek a Principal QA Automation Engineer with a strong background in Cypress, ...
Location: Argentina, Gran Buenos Aires; Capital Federal; Mar del Plata
Salary: Not provided
Company: BASIC/DEPT®
Expiration Date: Until further notice
Requirements
  • 4+ years of hands-on experience with Cypress for UI and end-to-end automation testing
  • Expert-level proficiency in TypeScript, JavaScript and Python
  • Proven experience testing AI-powered or machine learning applications, including AI model validation techniques
  • Strong understanding of AWS services (e.g., Lambda, S3, SQS, CloudWatch, ECS/EKS) and how to validate applications deployed in cloud environments
  • Experience with threaded/multi-agent AI tools and how they impact test design and validation
  • Familiarity with version control (Git), containerization (Docker), and CI/CD pipelines (e.g., GitHub Actions, Jenkins, or CircleCI)
  • Strong communication, leadership, and mentoring skills
Job Responsibility
  • Lead the design and implementation of end-to-end test automation frameworks using Cypress with TypeScript
  • Define quality strategies for applications with AI/ML components, including deterministic and non-deterministic testing approaches
  • Collaborate with engineering, DevOps, and AI/ML teams to ensure quality across AI-infused features in production environments
  • Build and scale testing strategies for threaded AI applications running in AWS cloud infrastructure
  • Integrate automated tests into CI/CD pipelines to support frequent, reliable deployments
  • Mentor and guide mid- and senior-level QA engineers, setting best practices and driving a culture of quality-first development
  • Evaluate and introduce new tools, libraries, and frameworks to improve test coverage, performance, and developer experience
  • Participate in architectural discussions to ensure testability and reliability are baked into software designs from the start
  • Analyze test results, track quality metrics, and communicate risk and coverage to stakeholders
What we offer
  • Premium healthcare through OSDE for the employee and their immediate family members
  • Mendel prepaid card with a monthly allowance for grocery purchases
  • Monthly reimbursements for Wi-Fi/electricity expenses
  • Monthly reimbursements for training/English classes
  • 100% covered “Plan Total” membership at Sportclub
  • Access to our benefits platform through Bonda
  • A flexible vacation policy
  • Fulltime

Principal Frontend Software Engineer - Design Systems & AI

We’re looking for a passionate Principal Engineer (P60) to join the Design Syste...
Location: Australia
Salary: Not provided
Company: Atlassian
Expiration Date: Until further notice
Requirements
  • A strong interest in AI, especially in generative approaches for frontend code that adheres to design systems and frontend standards
  • Systems thinking and experience architecting and maintaining large-scale systems (100+ packages, content, standards, etc.)
  • Proven Tech Lead experience: You’ve led complex technical initiatives and mentored other engineers
  • Experience with JavaScript (ES6), HTML5, CSS and experience with modern JavaScript frameworks (e.g., React, AngularJS, Vue)
  • Bachelor's or Master's degree (preferably a Computer Science degree or equivalent experience)
  • Extensive experience with modern testing frameworks (e.g., Jest, Cypress, Mocha, Chai)
  • Strong comfort with the JavaScript language and ecosystem
  • Experience in design system best practices
Job Responsibility
  • Lead the technical vision and architecture for AI-driven design system solutions, ensuring scalability, reliability, and compliance with Atlassian’s frontend standards
  • Drive the development of generative AI tools that produce frontend code aligned with our design system and accessibility requirements
  • Tackle the challenges of maintaining and evolving a system of 100+ packages, including content, standards, and tooling
  • Mentor and guide engineers across the team, fostering a culture of technical excellence and innovation
  • Collaborate with cross-functional partners to deliver impactful solutions that elevate the user experience for millions of Atlassian customers
What we offer
  • health coverage
  • paid volunteer days
  • wellness resources

Principal Engineer, SSD Firmware Engineering

We are seeking a talented Principal Engineer, Firmware Engineering to join our i...
Location: India, Bengaluru
Salary: Not provided
Company: Sandisk
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Engineering, Electronics, Electrical Engineering, or related field
  • 10+ years of experience in firmware development for embedded systems
  • Strong proficiency in C/C++ programming languages
  • In-depth knowledge of microcontroller architectures and embedded systems
  • Experience with real-time operating systems (RTOS) and their implementation
  • Familiarity with hardware interfaces such as SPI, I2C, I3C, UART, and GPIO
  • Expertise in developing and debugging low-level device drivers
  • Proficiency in using version control systems, preferably Git
  • Strong analytical and problem-solving skills with attention to detail
  • Experience with firmware testing and validation methodologies
Job Responsibility
  • Design, develop, and implement firmware for embedded systems and microcontrollers
  • Collaborate with hardware engineers to integrate firmware with electronic components
  • Optimize firmware for performance, power consumption, and memory usage
  • Develop and maintain device drivers for various hardware interfaces
  • Implement and integrate real-time operating systems (RTOS) in firmware projects
  • Conduct code reviews and ensure adherence to coding standards and best practices
  • Debug and resolve firmware issues using specialized tools and techniques
  • Participate in firmware testing and validation processes
  • Document firmware architecture, design decisions, and implementation details
  • Stay up-to-date with the latest trends and technologies in firmware engineering
  • Fulltime

Principal Engineer

The Principal AI/ML Operations Engineer leads the architecture, automation, and ...
Location: United States, Pleasanton, California
Salary: 251000.00 - 314500.00 USD / Year
Company: BlackLine
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Data Science, or a related field
  • 10+ years in ML infrastructure, DevOps, and software system architecture
  • 4+ years in leading MLOps or AI Ops platforms
  • Strong programming skills in languages such as Python, Java, or Scala
  • Expertise in ML frameworks (TensorFlow, PyTorch, scikit-learn) and orchestration tools (Airflow, Kubeflow, Vertex AI, MLflow)
  • Proven experience operating production pipelines for ML and LLM-based systems across cloud ecosystems (GCP, AWS, Azure)
  • Deep familiarity with LangChain, LangGraph, ADK or similar agentic system runtime management
  • Strong competencies in CI/CD, IaC, and DevSecOps pipelines integrating testing, compliance, and deployment automation
  • Hands-on experience with observability stacks (Prometheus, Grafana, New Relic) for model and agent performance tracking
  • Understanding of governance frameworks for Responsible AI, auditability, and cost metering across training and inference workloads
Job Responsibility
  • Define enterprise-level standards and reference architectures for MLOps and AIOps systems
  • Partner with data science, security, and product teams to set evaluation and governance standards (Guardrails, Bias, Drift, Latency SLAs)
  • Mentor senior engineers and drive design reviews for ML pipelines, model registries, and agentic runtime environments
  • Lead incident response and reliability strategies for ML/AI systems
  • Lead the deployment of AI models and systems in various environments
  • Collaborate with development teams to integrate AI solutions into existing workflows and applications
  • Ensure seamless integration with different platforms and technologies
  • Define and manage MCP Registry for agentic component onboarding, lifecycle versioning, and dependency governance
  • Build CI/CD pipelines automating LLM agent deployment, policy validation, and prompt evaluation of workflows
  • Develop and operationalize experimentation frameworks for agent evaluations, scenario regression, and performance analytics
What we offer
  • short-term and long-term incentive programs
  • robust offering of benefit and wellness plans
  • Fulltime

Principal AI Architect

We are seeking an experienced AI Architect to lead the design, implementation, a...
Location: India, Bengaluru
Salary: Not provided
Company: EvoluteIQ
Expiration Date: Until further notice
Requirements
  • 12+ years of experience in data science, ML engineering and AI system architecture
  • Hands-on experience with Python, TensorFlow, PyTorch, Scikit-learn, spaCy and related AI/ML frameworks
  • Expertise in MLOps tools such as MLflow, Kubeflow, Vertex AI, or SageMaker
  • Proficiency in data processing technologies (Spark, Kafka, Airflow) and data modeling
  • Strong background in deploying models as APIs or services using Docker, Kubernetes, and REST/gRPC
  • Experience designing data pipelines and integrating AI with production systems
  • Understanding of prompt engineering, LLM fine-tuning, and vector stores (e.g., Pinecone, FAISS, Weaviate)
  • Knowledge of cloud AI services (AWS, GCP, Azure) and distributed computing architectures
  • Proven experience implementing observability for models (drift, accuracy, bias, and performance)
Job Responsibility
  • Architect and oversee AI/ML pipelines covering data collection, preparation, training, validation, and inference
  • Define and implement scalable AI infrastructure for training, deployment, and continuous integration (MLOps)
  • Collaborate with data scientists, ML engineers, product managers, and product teams to translate business problems into AI-driven solutions
  • Establish frameworks for model governance, versioning, reproducibility, and explainability
  • Integrate models into production systems ensuring low latency, scalability, and reliability
  • Define data strategy, storage, and access patterns to support AI workloads
  • Build solutions to monitor model performance, drift, and data quality, implementing continuous retraining strategies
  • Ensure compliance with ethical AI, data privacy, and security best practices
  • Mentor AI/ML engineers and contribute to architectural decisions across the AI platform stack
What we offer
  • Opportunity to shape the strategy of a next-gen hyper-automation platform
  • Work with a cross-disciplinary team in a fast-growing, innovation-driven environment
  • Competitive compensation and growth opportunities
  • A culture of innovation, ownership, and continuous learning
  • Fulltime

Principal AI/ML & Innovation Engineer

We are seeking Principal AI/ML & Innovation Engineer who will be leading initiat...
Location: Puerto Rico, Aguadilla
Salary: Not provided
Company: Hewlett Packard Enterprise
Expiration Date: Until further notice
Requirements
  • Bachelor's or master’s degree in computer science, engineering, data science, machine learning, artificial intelligence, or closely related quantitative discipline
  • Typically, 10-15 years’ experience
  • Solid understanding of fundamental AI and machine learning concepts, including supervised and unsupervised learning, deep learning, reinforcement learning, natural language processing, computer vision, and statistical modeling
  • Proficient in implementing and deploying various machine learning algorithms, such as decision trees, random forests, support vector machines, and neural networks
  • Knowledge of popular machine learning frameworks and libraries like TensorFlow, PyTorch, or scikit-learn
  • Strong understanding of GitHub Copilot, Cursor, n8n, vibe coding, Windsurf, and similar technologies
  • Experience in Cloud Infrastructure (AWS, Azure, etc.)
  • Knowledge of Open Source, Linux, etc.
  • Understanding of DevOps, SRE
  • Expertise in deep learning techniques, architectures, and frameworks (e.g., convolutional neural networks (CNN), recurrent neural networks (RNN), generative adversarial networks (GAN), etc.)
Job Responsibility
  • Designing, developing, and deploying advanced machine learning models and algorithms
  • Leading research initiatives to explore novel approaches and technologies
  • Designing the architecture of AI systems and ensuring scalability, performance, and reliability
  • Collaborating with other teams, such as data scientists, software engineers, and product managers
  • Providing technical leadership and mentorship to junior engineers
  • Overseeing and guiding multiple design review sessions across different projects
  • Partnering with the engineering manager and team lead to establish long-term design and implementation strategies
  • Leading efforts to incorporate feedback loops and continuous improvement processes
  • Leading meetings, ensuring efficient progress tracking, issue resolution, and team coordination
  • Creating and delivering high-level presentations and reports to executive stakeholders
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Fulltime