Principal AI Platform Engineer

Canada, Mississauga · 169000.00 - 188000.00 CAD / Year · Job Posted January 16, 2026

Job Description

The Principal AI Platform Engineer will focus on building the infrastructure that connects AI systems with existing products and will enable seamless delivery of AI-generated insights into agent workflows.

Job Responsibility

  • Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools
  • Implement secure access controls and authentication mechanisms integrated by default into the AI platform components
  • Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure
  • Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications
  • Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads

Requirements

  • Extensive experience building and maintaining AI platform infrastructure, Kubernetes, and container security
  • Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow)
  • Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs

Nice to have

  • Familiarity with vLLM, SGLang, or similar frameworks for hosting LLM inference workloads
  • Experience with CI/CD pipelines and automation for AI model deployment and platform operations
  • Strong knowledge of authentication and authorization frameworks integrated into AI platforms

What we offer

bonus + benefits

Similar Jobs for Principal AI Platform Engineer

8 matching positions

Principal AI Platform Engineer

Location: United States
Salary: Not provided
Company: PointClickCare
Expiration Date: Until further notice
Requirements
  • Built or supported infrastructure for Generative AI or LLM-based products
  • Hands-on experience deploying and managing AI or ML workloads on Kubernetes
  • Familiar with observability tools like OpenTelemetry or MLflow for monitoring AI systems
  • Worked with vector databases (like Pinecone, Weaviate, FAISS, pgvector) or LLM serving frameworks like vLLM or SGLang
  • Experience adding authentication and authorization controls into AI or ML platforms
  • Legally authorized to work in the US for our company
  • Fulltime

Principal AI Platform Engineer

AlphaSense is seeking an experienced engineering leader to transform how we buil...
Location: India, Bengaluru
Salary: Not provided
Company: AlphaSense
Expiration Date: Until further notice
Requirements
  • 12+ years building and operating distributed systems in production
  • Track record of improving system reliability (taking services from frequent outages to 99.9%+ uptime)
  • Deep expertise in modern engineering practices: microservices, containerization (Kubernetes), infrastructure as code
  • Experience leading technical initiatives and mentoring engineering teams
  • Strong coding skills with ability to work across the stack
  • Excellence in debugging production issues and implementing comprehensive observability
  • History of making pragmatic trade-offs between perfect and shipped
Job Responsibility
  • Architect for Scale: Design and implement distributed systems that power AI agents processing thousands of requests per hour, ensuring reliability, performance, and cost-efficiency
  • Build Engineering Excellence: Establish comprehensive testing strategies, observability systems, and CI/CD pipelines that catch issues before customers do
  • Lead Through Expertise: Mentor a team of smart, motivated engineers, sharing your experience in building production systems that don't break at 3 AM
  • Drive Platform Evolution: Own the technical roadmap for our AI platform, making architectural decisions that will shape our systems for years to come
  • Bridge AI and Engineering: Collaborate with ML engineers and researchers to productionize cutting-edge AI capabilities while maintaining system stability

Principal AI Platform Engineer

The Principal AI Platform Engineer will focus on building the infrastructure tha...
Location: United States
Salary: 179000.00 - 199000.00 USD / Year
Company: PointClickCare
Expiration Date: Until further notice
Requirements
  • Extensive experience building and maintaining AI platform infrastructure, Kubernetes, and container security
  • Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow)
  • Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs
Job Responsibility
  • Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools
  • Implement secure access controls and authentication mechanisms integrated by default into the AI platform components
  • Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure
  • Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications
  • Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads
What we offer
  • bonus
  • benefits
  • Fulltime

Principal Engineer - AI Platform Development (Azure PostgreSQL)

Microsoft’s Azure Data engineering team is leading the transformation of analyti...
Location: Spain, Barcelona
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND extensive technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
  • Lead the design and development of AI Store capabilities in Azure PostgreSQL, including vector search, semantic indexing, and AI-optimized database features to power the next generation of intelligent applications
  • Architect intuitive, scalable APIs, SDKs, and extensibility layers that bring advanced database and AI capabilities into the hands of developers
  • Create seamless developer experiences by integrating PostgreSQL services with modern development tools, frameworks, and cloud platforms, accelerating application development on Azure PostgreSQL
  • Partner closely with database engine engineers, product managers, and developer advocates to translate developer needs into deep system and platform innovations
  • Design and deliver high-quality interfaces, SDKs, samples, and documentation that make building AI-powered applications on PostgreSQL accessible, powerful, and joyful
  • Engage with open-source communities, technology partners, and developer ecosystems to amplify impact, gather feedback, and inform platform evolution
  • Champion a developer-first mindset while advancing technical excellence, scalability, and innovation across the stack from database internals to developer workflows
  • Fulltime

Principal Platform Engineer - Data & AI

As a Principal Platform Engineer - Data & AI, you will architect and lead Data a...
Location: United States, San Diego
Salary: 190000.00 - 284000.00 USD / Year
Company: ResMed
Expiration Date: Until further notice
Requirements
  • BS/MS in Computer Science or equivalent experience
  • 10+ years architecting AI/ML and Data Platform solutions
  • Experience building and scaling ML platforms for model training, deployment, and monitoring using large datasets
  • Expertise in modern technology stacks, APIs, microservices, and cloud Data and AI platforms
  • Excellent communication skills with ability to present to all levels of leadership
  • Demonstrated agile mindset focused on iterative improvement
Job Responsibility
  • Lead the technical evolution of data and AI/ML capabilities, ensuring incremental solution delivery and proactive risk management
  • Develop deep technical expertise across AI domains to guide cross-team initiatives and optimize productivity
  • Set and maintain high standards for data quality, platform reliability, and security while addressing scalability challenges
  • Drive strategic technical decisions that balance business needs with long-term sustainability across data and AI/ML initiatives
  • Mentor and coach data engineering teams, collaborate with engineering leadership, and provide actionable feedback for team growth
  • Effectively communicate technical concepts to both technical and non-technical stakeholders at all management levels
  • Provide thought leadership in data architecture, governance, and technology selection to shape company-wide technical direction
What we offer
  • comprehensive medical, vision, dental, and life, AD&D, short-term and long-term disability insurance
  • sleep care management
  • Health Savings Account (HSA)
  • Flexible Spending Account (FSA)
  • commuter benefits
  • 401(k)
  • Employee Stock Purchase Plan (ESPP)
  • Employee Assistance Program (EAP)
  • tuition assistance
  • fifteen days of Paid Time Off (PTO) in the first year of employment
  • Fulltime

Principal Fullstack Engineer (AI & Data Platform)

We’re building the intelligence layer for the real estate market. Our platform a...
Location: Multiple
Salary: Not provided
Company: OnHires
Expiration Date: Until further notice
Requirements
  • Strong experience building production systems from scratch
  • Solid backend, frontend and data engineering background (language-agnostic)
  • Comfort working with unstructured data and unstable sources
  • Practical understanding of ML / analytics pipelines (not research-heavy)
  • Experience using LLMs in real systems (beyond simple API calls)
  • Ownership mindset and interest in early-stage, high-impact work
Job Responsibility
  • Design and build a scalable data ingestion system for messy, anti-bot-protected sources
  • Create a centralized data / knowledge layer that normalizes and connects property information across sources
  • Develop an analytics & ML layer to compute market signals and detect anomalies
  • Use LLMs as an interface and reasoning layer on top of structured data and computed metrics
  • Work directly with the founders on product decisions and technical direction
  • Establish engineering standards and help scale the team over time
  • Fulltime

Principal AI Engineer (Prisma Browser - Agents Platform)

The Prisma Browser group is building an agentic development lifecycle, an infras...
Location: Israel, Tel Aviv
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • At least 8 years of experience in software development, architecture, or owning operational systems in production
  • Computer Science B.Sc. or equivalent education or equivalent military experience required
  • A product builder's mindset: you can extract requirements, talk to stakeholders, and tell the difference between what's important and what's noise
  • Experience in building production grade agents. Deep understanding of the agent loop, its states and transitions. You know how to build it correctly, not just use it
  • Positive 'can-do' mindset, able to work independently and within a team
  • Hands-on experience with LLM APIs, including a practical, highly-skeptical understanding of token costs, caching, context windows, and model failure points
  • You know how to build the right context for a task, including memory systems, session storage, and vector databases
  • You understand where LLMs fail and how to design around those failure points
  • You've used traces or observability tooling to diagnose and improve agent behavior
  • A systems-level background that touches reliability, observability, or platform engineering, with a strong preference for writing narrow, deterministic code over building hypothetical abstractions
Job Responsibility
  • Design and implement automated evaluation loops, static analysis, and rigorous quality gates to ensure the ADLC process doesn't just write code, but consistently produces great, production-ready code
  • Help the team tackle complex, hard problems to elevate our autonomous development product from 'good' to 'excellent'
  • Lead complex initiatives in Context Engineering and Prompt Engineering
  • Manage and orchestrate the complex ecosystem of autonomous agents utilized for internal development
  • Serve as a leader, both professionally and personally, within a very strong team
  • Find space for growth to push the entire team or group forward
  • View prompt engineering as a core engineering discipline—where rewriting agent behavior is a versioned, reviewed, and tested code change
  • Act with a debugging temperament: conduct deep-dive analyses of raw agent transcripts to diagnose non-deterministic failures and ascertain root causes instead of merely working around them
  • Fulltime

Principal AI Engineer - Enterprise AI Solutions

Our Mission: At Palo Alto Networks®, we’re united by a shared mission—to protect...
Location: United States, Santa Clara
Salary: 185200.00 - 299475.00 USD / Year
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in a related field or equivalent military experience required
  • 12+ years of related experience, or a Master's degree with 8 years of experience, or a PhD with 5 years of experience
  • Proven expertise in designing and building complex, enterprise-grade AI/ML platforms and applications
  • Direct hands-on experience with Generative AI technologies, including Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and agentic AI systems
  • Strong programming skills in languages such as Python, Java, or Go and proficiency in modern AI/ML frameworks (e.g., TensorFlow, PyTorch)
  • Extensive hands-on experience with distributed systems architecture, streaming data platforms, and cloud AI/ML platforms (e.g., GCP Vertex AI, AWS SageMaker)
Job Responsibility
  • Lead the applied AI solution design and architecture, breaking down ambiguous business problems into concrete, actionable AI solution designs
  • Drive the hands-on development and implementation of key AI components, supporting both traditional and Generative AI model development and deployment
  • Contribute significantly to the detailed design of large-scale, distributed AI/ML systems, ensuring performance, reliability, and security
  • Proactively collaborate with executive leadership, data science, engineering, and product stakeholders to translate business use-cases into scalable AI solutions
  • Provide technical leadership and mentorship to other AI/ML engineers, fostering a culture of engineering excellence and hands-on experimentation
  • Lead the implementation and continuous improvement of MLOps pipelines, including automated model training, versioning, deployment, and monitoring
  • Champion and enforce design standards, patterns, and best practices for scalable and secure development of AI applications across various teams
  • Actively research and evaluate cutting-edge AI/ML techniques, algorithms, and models to identify opportunities for platform enhancement and new solution development
What we offer
  • Restricted stock units
  • Bonus
  • Fulltime