Member of Technical Staff, AI Platform Engineer

Microsoft Corporation

Location:
United States, Mountain View

Contract Type:
Not provided

Salary:

119800.00 - 234700.00 USD / Year

Job Description:

As Microsoft continues to push the boundaries of AI, we are on the lookout for passionate individuals to work with us on the most interesting and challenging AI questions of our time. Our vision is bold and broad: to build systems that have true artificial intelligence across agents, applications, services, and infrastructure. It’s also inclusive: we aim to make AI accessible to all (consumers, businesses, developers) so that everyone can realize its benefits. At Microsoft AI, AI is part of everything we do. Our AI Product Acceleration team is dedicated to enabling rapid development and deployment of AI across our products. We collaborate closely with the top AI engineers in the world, working at the bleeding edge of engineering and AI research. If you are passionate about AI, eager to work at the forefront of technology, and want to be part of a team that empowers all other teams at Microsoft Copilot to build AI products quickly and seamlessly, join us as an AI Platform Engineer!

Job Responsibility:

  • Design, develop, and maintain platform-level software solutions
  • Collaborate with cross-functional teams to integrate AI capabilities into various products
  • Ensure the reliability, scalability, and performance of platform components
  • Stay updated with the latest advancements in AI and engineering
  • Work alongside the technical staff and AI researchers to improve model development flows
  • Embody our Culture and Values

Requirements:

  • Bachelor's degree in Computer Science or a related technical discipline AND 4+ years of technical engineering experience coding in languages including, but not limited to, TypeScript, Python, C, C++, C#, and Java
  • OR equivalent experience
  • Bachelor's degree in Computer Science or a related technical discipline AND 6+ years of technical engineering experience building web services, coding in languages including, but not limited to, Python, Golang, Java/Scala, and Rust
  • 6+ years of experience building and releasing production software at the platform level
  • Deep experience with all of the following languages: Golang, Java/Scala, TypeScript (React/Next.js)
  • Experience in model pretraining, post-training, evaluation, and inference
  • Experience using machine learning frameworks, including experience using, deploying, and scaling large language models, either personally or professionally
  • Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders
  • Demonstrated interpersonal skills and ability to work closely with cross-functional teams, including product managers, designers, and other engineers
  • Experience going from zero-to-one as well as working with developed systems
  • Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines

Additional Information:

Job Posted:
March 25, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Member of Technical Staff, AI Platform Engineer

Member of Technical Staff - Backend Engineer

Endor Labs is building the Application Security platform for the software develo...
Location:
Netherlands
Salary:
70000.00 - 100000.00 EUR / Year
Endor Labs
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s or higher degree in engineering, with 6–8 years of experience building scalable backends for product/SaaS companies
  • At least 3 years of experience in Golang programming, with a focus on microservices and distributed architecture
  • Practical experience designing APIs with one or more frameworks (gRPC [preferred], REST, GraphQL, Thrift, etc.)
  • Affinity with modern AI platforms (OpenAI, Google Gemini, LangChain, etc.)
  • Ability to build and design technical solutions from scratch, with code and documentation that exemplify best practices at Endor
  • Scalable distributed systems experience—understanding microservices, domain-driven design, load balancing, horizontal/vertical scaling, and stateless architectures
  • Strong architectural knowledge, with a keen eye for scalable and extensible systems. Able to apply data-driven techniques to evaluate and recommend architectural choices
  • Ability to discuss trade-offs between architectural decisions and influence teams toward the right direction
  • Comfort working in a fast-moving environment with evolving requirements
  • Creative and independent problem-solving skills, especially in uncharted or ambiguous contexts
Job Responsibility:
  • Work closely with the R&D team to help integrate novel solutions and scale them to production
  • Design and implement AI-first platforms
  • Have the autonomy and responsibility to design and implement high-quality features used by customers
  • Lead and contribute to large-scale technical projects, ensuring scalability, reliability, and performance
  • Design, architect, and build features end-to-end—including unit and integration tests—while working closely with Product Management and our distributed engineering team
Employment Type:
Full-time

Member of Technical Staff, AI Training Infrastructure

As a Training Infrastructure Engineer, you'll design, build, and optimize the in...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
  • 3+ years of experience with distributed systems and ML infrastructure
  • Experience with PyTorch
  • Proficiency in cloud platforms (AWS, GCP, Azure)
  • Experience with containerization and orchestration (Docker, Kubernetes)
  • Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)
Job Responsibility:
  • Design and implement scalable infrastructure for large-scale model training workloads
  • Develop and maintain distributed training pipelines for LLMs and multimodal models
  • Optimize training performance across multiple GPUs, nodes, and data centers
  • Implement monitoring, logging, and debugging tools for training operations
  • Architect and maintain data storage solutions for large-scale training datasets
  • Automate infrastructure provisioning, scaling, and orchestration for model training
  • Collaborate with researchers to implement and optimize training methodologies
  • Analyze and improve efficiency, scalability, and cost-effectiveness of training systems
  • Troubleshoot complex performance issues in distributed training environments
What we offer:
  • Meaningful equity in a fast-growing startup
  • Comprehensive benefits package
Employment Type:
Full-time

Member of Technical Staff – Backend

As a backend engineer at Inflection, you will own the platforms, systems, and se...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience building and scaling backend systems for high-throughput applications
  • Fluent in building distributed systems with Python, Go, Rust, or similar languages
  • Comfortable with cloud-native architectures (e.g., Kubernetes, gRPC, Postgres, Redis, Kafka)
  • Owned backend services end-to-end—from design and implementation to deployment, monitoring, and debugging
  • Thrive in fast-paced environments where you can move quickly without sacrificing engineering rigor
  • Proactively improve tooling and infrastructure to support teammates’ workflows and reliability goals
  • Communicate clearly across disciplines and take pride in solving user-facing problems with clean backend solutions
  • Have a bachelor’s degree or equivalent in a field related to the position
Job Responsibility:
  • Design and implement scalable backend systems and APIs that power production LLM experiences, including agentic workflows, memory systems, and tool integrations
  • Build and operate high-availability infrastructure to support real-time inference, retrieval, and conversation pipelines
  • Develop internal platforms to improve engineering productivity—CI/CD pipelines, service templates, observability frameworks, and rollout tooling
  • Collaborate closely with applied research and frontend teams to rapidly prototype, ship, and iterate on end-user features
  • Ensure systems meet our high bar for security, uptime, and latency—through incident response, load testing, monitoring, and automation
  • Participate in on-call rotations to maintain the reliability of the services you build
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location:
United States, New York, NY; San Mateo, CA; Redwood City, CA
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages like Python or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility:
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLflow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Full-time

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...
Location:
United States, San Francisco
Salary:
216500.00 - 324500.00 USD / Year
GoFundMe
Expiration Date:
Until further notice
Requirements:
  • 9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
  • Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
  • Extensive experience designing, developing, and operating scalable backend systems
  • Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
  • Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
  • Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
  • Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
  • Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
  • Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)
Job Responsibility:
  • Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
  • Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
  • Design, build, and optimize scalable model training, fine-tuning, and inference pipelines, ensuring robust integration with production systems
  • Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
  • Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
  • Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
  • Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
  • Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
  • Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
  • Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure
What we offer:
  • Competitive pay
  • Comprehensive healthcare benefits
  • Financial assistance for things like hybrid work and family planning
  • Generous parental leave
  • Flexible time-off policies
  • Mental health and wellness resources
  • Learning, development, and recognition programs
Employment Type:
Full-time

Staff Machine Learning Engineer

Join PagerDuty as a Staff Machine Learning Engineer to tackle complex problems, ...
Location:
Canada, Toronto
Salary:
156000.00 - 232000.00 CAD / Year
PagerDuty
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience building, designing, and evolving data architecture for large-scale systems
  • Excellent communication skills
  • Experience working with Product teams, ensuring and driving timely delivery
  • Have a deep understanding of the trade-offs to be considered when designing and delivering machine learning solutions to production
  • Experience leading cross-team architecture discussions, building technical prototypes, and driving the adoption of best practices across diverse teams
  • Demonstrated experience with data engineering processes, working with unstructured data and cloud-based data infrastructures
  • Passionate about ML engineering and interested in driving discussions with stakeholders and executives
Job Responsibility:
  • Build and improve the capabilities of the data platform that enable and accelerate the production of ML/AI-based solutions
  • Drive and define standards for AI/ML across the organization
  • Provide guidance, technical leadership, and mentoring to other members of the team
  • Mentor junior members and participate in scaling up the existing team
  • Proactively recommend improvements and new approaches addressing potential systemic pain points and technical debt
  • Anticipate technical demands on the data platform based on the organization’s roadmap and systematically drive the evolution of the architecture toward those ends
  • Develop a long-term plan for ML/AI investments
What we offer:
  • Competitive salary
  • Comprehensive benefits package from day one
  • Flexible work arrangements
  • Company equity
  • ESPP (Employee Stock Purchase Program)
  • Retirement or pension plan
  • Generous paid vacation time
  • Paid holidays and sick leave
  • Dutonian Wellness Days & HibernationDuty - companywide paid days off in addition to PTO
  • Paid parental leave: 22 weeks for pregnant parent, 12 weeks for non-pregnant parent
Employment Type:
Full-time

Senior Technical Marketing Engineer

Designs, develops, troubleshoots and debugs software programs for software enhan...
Location:
United States
Salary:
117500.00 - 270000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Information Systems, or equivalent
  • Typically 6-10 years of experience
  • Extensive experience with multiple software systems design tools and languages
  • Solid understanding of AI/ML concepts and technologies
  • Excellent analytical and problem-solving skills
  • Experience in overall architecture of software systems for products and solutions
  • Designing and integrating software systems running on multiple platform types into overall architecture
  • Evaluating forms and processes for software systems testing and methodology, including writing and execution of test plans, debugging, and testing scripts and tools
  • Excellent written and verbal communication skills
  • Mastery of English and the local language
Job Responsibility:
  • Leads multiple project teams of other software systems engineers and internal and outsourced development partners responsible for all stages of design and development for complex products and platforms, including solution design, analysis, coding, testing, and integration
  • Manages and expands relationships with internal and outsourced development partners on software systems design and development
  • Reviews and evaluates designs and project activities for compliance with systems design and development guidelines and standards; provides tangible feedback to improve product quality and mitigate failure risk
  • Provides domain-specific expertise and overall software systems leadership and perspective to cross-organization projects, programs, and activities
  • Drives innovation and integration of new technologies into projects and activities in the software systems design organization
  • Provides guidance and mentoring to less-experienced staff members
  • Content Creation: Develop technical content to educate developers, customers, and partners on how to utilize AI platforms and software, including blog posts, user guides, whitepapers, presentations, and videos
  • Work closely with product teams to align product positioning with market needs and competitive landscape
  • Technical Support and Enablement: Provide technical support to sales teams and customers, respond to inquiries, and assist with product launches and technical sales training. This might involve setting up test environments and troubleshooting issues
What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment Type:
Full-time

Member of Technical Staff, AI Platform Engineer

As Microsoft continues to push the boundaries of AI, we are on the lookout for p...
Location:
United States, Mountain View
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science, or related technical discipline AND 4+ years technical engineering experience building customer-facing applications/products with coding in languages including, but not limited to C#, Python, Java, Golang
  • OR equivalent experience
  • Experience leveraging generative AI technologies to develop innovative and user-focused product features
  • 4+ years' experience building APIs and creating pipelines for large-scale products
  • 4+ years' experience building scalable services on top of public cloud infrastructure like Azure, AWS, or GCP. Extensive use of datastores such as RDBMS, key-value stores, etc.
Job Responsibility:
  • Work on building new AI features that enhance Copilot
  • Build secure and performant AI Platform services that power Copilot
  • Work collaboratively with AI researchers and platform, infrastructure, and application engineers to build next-generation AI products and services
  • Ship high-quality, well-tested, secure, and maintainable code
  • Find a path to get things done despite roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven, product development cycle
  • Embody our Culture and Values
Employment Type:
Full-time