Member of Technical Staff, High Performance Computing Engineer

Microsoft Corporation

Location:
United States, Mountain View

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

As Microsoft AI, we are pushing the boundaries of technology. We are creating unique, beautiful, and powerful products that will change lives. A small, friendly, fast-moving team, we support each other to do the best work of our lives, always looking to break new ground, fast. We are proud of what we build, how we build it, and that our products will define the AI era. We run lean, obsess about users, and always make our decisions based on the evidence. We ship regularly, so your work will have real and immediate impact.

We are seeking experienced High Performance Computing Engineers to join our team and contribute to the evolution of our personal AI, Copilot! This role offers the rare opportunity to work on some of the largest-scale supercomputers in the world. Our team is at the forefront of developing APIs that enhance our ability to fine-tune and deploy Copilot's core functionalities, in partnership with our Product Management, Design, and AI Research teams.

The Member of Technical Staff, High Performance Computing Engineer will bring a wealth of positive energy, empathy, and kindness, coupled with a track record of effectiveness. You'll be proactive, relishing the challenge of crafting top-tier consumer experiences and products swiftly and efficiently.

Our newly formed organization, Microsoft AI, is dedicated to advancing Copilot and other consumer AI products and research. The team is responsible for Copilot, Bing, Edge, and generative AI research. Come be a part of the team shaping the future of personal computing.

Job Responsibility:

  • Build secure and performant AI Platform services that power Copilot
  • Work collaboratively with other platform, infrastructure, and application engineers, as well as AI researchers, to build next-generation AI products and services
  • Ship high-quality, well-tested, secure, and maintainable code
  • Find a path past roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven product development cycle
  • Embody our Culture and Values

Requirements:

  • Bachelor’s degree in Computer Science or a related technical discipline AND 6+ years of technical engineering experience building web services, coding in languages including, but not limited to, Python, C#, C++, Rust, or Java, OR equivalent experience
  • 6+ years of experience working with high-scale training clusters (e.g., working with frameworks and tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, and Ray)
  • 6+ years of experience building scalable services on top of public cloud infrastructure such as Azure, AWS, or GCP

Nice to have:

  • Experience with LLM training clusters
  • Experience working with AI platforms, frameworks, and APIs
  • Experience using machine learning frameworks, including experience using, deploying, and scaling large language models, either personally or professionally
  • Ability to identify, analyze, and resolve complex technical issues, ensuring optimal performance, scalability, and user experience
  • Dedication to writing clean, maintainable, and well-documented code with a focus on application quality, performance, and security
  • Demonstrated interpersonal skills and ability to work closely with cross-functional teams, including product managers, designers, and other engineers
  • Ability to clearly communicate complex technical concepts to both technical and non-technical stakeholders
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in web development and AI
  • Ability to work in a fast-paced environment, manage multiple priorities, and adapt to changing requirements and deadlines
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team

Additional Information:

Job Posted:
January 05, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Member of Technical Staff, High Performance Computing Engineer

Member of Technical Staff, Performance Optimization

We're looking for a Software Engineer focused on Performance Optimization to hel...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience
  • 5+ years of experience working on performance optimization or high-performance computing systems
  • Proficiency in CUDA or ROCm and experience with GPU profiling tools (e.g., Nsight, nvprof, CUPTI)
  • Familiarity with PyTorch and performance-critical model execution
  • Experience with distributed system debugging and optimization in multi-GPU environments
  • Deep understanding of GPU architecture, parallel programming models, and compute kernels
Job Responsibility:
  • Optimize system and GPU performance for high-throughput AI workloads across training and inference
  • Analyze and improve latency, throughput, memory usage, and compute efficiency
  • Profile system performance to detect and resolve GPU- and kernel-level bottlenecks
  • Implement low-level optimizations using CUDA, Triton, and other performance tooling
  • Drive improvements in execution speed and resource utilization for large-scale model workloads (LLMs, VLMs, and video models)
  • Collaborate with ML researchers to co-design and tune model architectures for hardware efficiency
  • Improve support for mixed precision, quantization, and model graph optimization
  • Build and maintain performance benchmarking and monitoring infrastructure
  • Scale inference and training systems across multi-GPU, multi-node environments
  • Evaluate and integrate optimizations for emerging hardware accelerators and specialized runtimes
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Full-time

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location:
United States, New York, NY; San Mateo, CA; Redwood City, CA
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages like Python or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility:
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLflow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Full-time

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...
Location:
United States, San Francisco
Salary:
216500.00 - 324500.00 USD / Year
GoFundMe
Expiration Date:
Until further notice
Requirements:
  • 9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
  • Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
  • Extensive experience designing, developing, and operating scalable backend systems
  • Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
  • Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
  • Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
  • Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
  • Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
  • Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)
Job Responsibility:
  • Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
  • Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
  • Design, build, and optimize scalable model training, fine-tuning, and inference pipelines, ensuring robust integration with production systems
  • Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
  • Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
  • Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
  • Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
  • Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
  • Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
  • Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure
What we offer:
  • Competitive pay
  • Comprehensive healthcare benefits
  • Financial assistance for things like hybrid work and family planning
  • Generous parental leave
  • Flexible time-off policies
  • Mental health and wellness resources
  • Learning, development, and recognition programs
Employment Type:
Full-time

Staff Software Development Engineer

Design and develop software applications and platforms to support digital strate...
Location:
United States, Woonsocket
Salary:
147680.00 - 240000.00 USD / Year
CVS Health
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree (or foreign equivalent) in Computer Science, Computer Engineering, Information Technology, Engineering, or a related field
  • 5 years of progressive, postbaccalaureate experience in the job offered or related occupation
  • 5 years of experience in Agile methodologies or SAFe Software Development Principles
  • 5 years of experience with databases, including Oracle and SQL
  • 5 years of experience with JIRA, Rally, or Confluence
  • 5 years of experience with Java, MySQL, or NoSQL
  • 5 years of experience with the software development lifecycle (SDLC)
  • 5 years of experience with software testing, quality assurance, and troubleshooting
  • 5 years of experience in domain support for a healthcare or retail organization
  • 5 years of experience developing backend services, performing code reviews, and collaborating with peers on software development solutions
Job Responsibility:
  • Design and develop software applications and platforms to support digital strategies and solutions
  • Analyze user needs and develop software solutions to meet business requirements
  • Determine feasibility of solutions design and prepare technical design documentation
  • Upgrade existing software applications and/or systems to improve functionality and features with a focus on performance, reliability, and maintainability
  • Write and review high quality code and perform unit and/or automation testing
  • Develop and deploy application components and support unit testing and bug fixes
  • Participate in Agile Scrum meetings and/or CI/CD
  • Support applications, systems, and databases used to process prescriptions, claims, and related healthcare activities
  • Collaborate with cross-functional teams on applications development, technical requirements, code review, project deliverables, quality assurance, and software development best practices
  • Mentor junior team members
What we offer:
  • Medical benefits
  • Dental benefits
  • Vision benefits
  • 401(k) retirement savings plan
  • Employee Stock Purchase Plan
  • Fully-paid term life insurance plan
  • Short-term disability benefits
  • Long-term disability benefits
  • Well-being programs
  • Education assistance
Employment Type:
Full-time

Staff Software Engineer

This SaaS product connects millions of JVM runtimes, collects and aggregates det...
Location:
Serbia, Belgrade
Salary:
Not provided
Azul Systems
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in Java/Kotlin covering technical architecture, algorithms, design, network management, application development, middleware, AWS/GCP, RDBMS, NoSQL, messaging
  • 5+ years of experience in one or more of the following areas: scalable distributed systems, cloud optimizations and costs, monitoring and alerting, reliable and fault-tolerant systems with performance in mind
  • Experience as an architect or technical lead with customer-facing large-scale products
  • Passionate about simplicity and efficiency; hates complexity
  • Strong technical problem-solver
  • Positive, enjoys collaborating and communicating with others
  • Experienced in communicating and working across functions to drive solutions
  • Holds BS/MS degree in Computer Science, Engineering, Mathematics or a related field or equivalent experience
Job Responsibility:
  • Implement new features, fix issues and perform code reviews in Java
  • Participate in designs and architecture decisions
  • Provide unique insights into cloud architecture
  • Translate complex functional, technical, and business requirements into designs
  • Understand the risk-driven/spiral development approach and enforce proofs-of-concept and prototypes to validate and compare design alternatives
  • Perform cost/benefit and trade-off analyses of design alternatives
  • Define high-level development tasks, provide estimates, and identify skills necessary for implementation
  • Recommend strategies for SaaS monitoring, performance improvements, and capacity planning
  • Be a charismatic team player with exceptional collaboration and communication skills
  • Drive the team's goals and technical direction to pursue opportunities that make the larger organization more efficient
What we offer:
  • Equity Program
  • Annual bonus based on company performance
  • Referral Program
  • IT Equipment - MacBook Pro or any other HW according to your preferences
  • Work-life balance - 5 weeks of holidays, 5 sick days, flexible working hours, 100% work from home also possible
  • Offices in Belgrade City Centre - if you prefer
  • Work with top experts worldwide who contribute to the Java ecosystem
Employment Type:
Full-time

Staff Software Engineer

This SaaS product connects millions of JVM runtimes, collects and aggregates det...
Location:
Czech Republic, Prague
Salary:
Not provided
Azul Systems
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in Java/Kotlin covering technical architecture, algorithms, design, network management, application development, middleware, AWS/GCP, RDBMS, NoSQL, messaging
  • 5+ years of experience in one or more of the following areas: scalable distributed systems, cloud optimizations and costs, monitoring and alerting, reliable and fault-tolerant systems with performance in mind
  • Experience as an architect or technical lead with customer-facing large-scale products
  • Passionate about simplicity and efficiency; hates complexity
  • Strong technical problem-solver
  • Positive, enjoys collaborating and communicating with others
  • Experienced in communicating and working across functions to drive solutions
  • Holds BS/MS degree in Computer Science, Engineering, Mathematics or a related field or equivalent experience
Job Responsibility:
  • Implement new features, fix issues and perform code reviews in Java
  • Participate in designs and architecture decisions
  • Provide unique insights into cloud architecture
  • Translate complex functional, technical, and business requirements into designs
  • Understand the risk-driven/spiral development approach and enforce proofs-of-concept and prototypes to validate and compare design alternatives
  • Perform cost/benefit and trade-off analyses of design alternatives
  • Define high-level development tasks, provide estimates, and identify skills necessary for implementation
  • Recommend strategies for SaaS monitoring, performance improvements, and capacity planning
  • Be a charismatic team player with exceptional collaboration and communication skills
  • Drive the team's goals and technical direction to pursue opportunities that make the larger organization more efficient
What we offer:
  • Equity Program
  • Annual bonus based on company performance
  • Referral Program
  • IT Equipment - MacBook Pro or any other HW according to your preferences
  • Work-life balance - 5 weeks of holidays, 5 sick days, flexible working hours, 100% work from home also possible
  • Offices in Prague City Centre - if you prefer
  • Work with top experts worldwide who contribute to the Java ecosystem
Employment Type:
Full-time

Staff Machine Learning Engineer

We are seeking a Staff Machine Learning Engineer to join our Foundation AI team....
Location:
United States, Boston
Salary:
170000.00 - 230000.00 USD / Year
Whoop
Expiration Date:
Until further notice
Requirements:
  • Advanced degree (Master’s or Ph.D.) in Computer Science, Machine Learning, Electrical Engineering, or a related field, or equivalent professional experience
  • 7+ years of experience in applied ML, AI research, or large-scale modeling, with a track record of delivering production systems
  • Expertise in modern deep learning (e.g., transformers, state space models) and multimodal model training
  • Proficiency in Python and deep learning frameworks (e.g., PyTorch, TensorFlow)
  • Experience building and scaling large datasets and training large models in distributed compute environments
  • Strong applied experience with representation learning, self-supervised methods, and fine-tuning for downstream applications
  • Familiarity with MLOps best practices including model versioning, evaluation, CI/CD for ML, and cloud-based compute
  • Excellent communication skills and ability to collaborate cross-functionally with engineers, researchers, and product teams
  • Passion for WHOOP’s mission to improve human performance and extend healthspan through science and technology
Job Responsibility:
  • Design, train, and optimize large-scale multimodal foundation models that integrate wearable sensor data, text, biomarkers, and behavioral data
  • Conduct applied research in self-supervised learning, representation learning, and downstream task fine-tuning to advance WHOOP’s core model capabilities
  • Develop scalable, distributed training pipelines for large models on high-performance compute environments
  • Collaborate with MLOps, data engineering, and software engineering teams to operationalize models for production deployment, ensuring robustness, reproducibility, and observability
  • Partner with product and research teams to translate foundation model capabilities into downstream features that deliver meaningful member value
  • Contribute to the technical roadmap and architectural direction for foundation model development at WHOOP
  • Serve as a technical mentor for other data scientists, sharing best practices in deep learning, large-scale training, and multimodal data integration
  • Ensure models adhere to WHOOP’s standards for ethical, transparent, and privacy-preserving AI
What we offer:
  • Competitive base salaries
  • Meaningful equity
  • Benefits
  • Generous equity package
Employment Type:
Full-time

Staff Software Development Engineer

Design and develop software applications and platforms to support digital strate...
Location:
United States, Woonsocket
Salary:
151195.00 - 207000.00 USD / Year
CVS Health
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Information Systems, Engineering, Information Technology, or related field
  • 5 years of experience in the job offered or related occupation
  • 5 years of experience with CI/CD, Jenkins, GIT, or DevOps
  • 5 years of experience programming in Python, PowerShell, or JavaScript
  • 5 years of experience with cloud technologies: Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
  • 5 years of experience with cloud components including cluster management
  • 5 years of experience with Agile methodologies or SAFe Software Development Principles
  • 5 years of experience with Docker or Kubernetes
  • 5 years of experience with JIRA, Rally, or Confluence
  • 5 years of experience with software testing, quality assurance, and troubleshooting
Job Responsibility:
  • Design and develop software applications and platforms to support digital strategies and solutions
  • Analyze user needs and develop software solutions to meet business requirements
  • Determine feasibility of solutions design and prepare technical design documentation
  • Upgrade existing software applications and/or systems to improve functionality and features with a focus on performance, reliability, and maintainability
  • Write and review high quality code and perform unit and/or automation testing
  • Develop and deploy application components and support unit testing and bug fixes
  • Participate in Agile Scrum meetings and/or CI/CD
  • Support applications, systems, and databases used to process prescriptions, claims, and related healthcare activities
  • Collaborate with cross-functional teams on applications development, technical requirements, code review, project deliverables, quality assurance, and software development best practices
  • Mentor junior team members
What we offer:
  • Medical benefits
  • Dental benefits
  • Vision benefits
  • 401(k) retirement savings plan
  • Employee Stock Purchase Plan
  • Fully-paid term life insurance plan
  • Short-term disability benefits
  • Long-term disability benefits
  • Well-being programs
  • Education assistance
Employment Type:
Full-time