CrawlJobs Logo

Member of Technical Staff - Software Engineer (AI infra)

https://www.microsoft.com/ Logo

Microsoft Corporation

Location Icon

Location:
Switzerland , Zürich

Category Icon

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

Not provided

Job Description:

Microsoft AI is looking for a Member of Technical Staff - Software Engineer to help build the next wave of capabilities of our personalized AI assistant, Copilot. We’re looking for someone who will bring an abundance of positive energy, empathy, and kindness to the team every day, in addition to being highly effective. The right candidate enjoys building world-class consumer experiences and products in a fast-paced environment. You will actively contribute to the development of AI models that are powering our innovative products. You will wear multiple hats and work on engineering, research, and everything in between. Your contributions will span model architecture, data curation, training and inference infrastructures, evaluation protocols, alignment and reinforcement learning from human feedback (RLHF), and many other exciting topics at the cutting edge of AI. Microsoft AI is building foundational models to develop novel responsible and efficient artificial general intelligence. The foundational models require large compute-capacity, and as a Member of Technical Staff - Software Engineer you would be responsible to benchmark, profile, debug and tune the training and inference of generative AI running in the production Graphics Processing Unit (GPU) clusters. Sophisticated tools and techniques are needed to maintain the reliability, runtime performance, and health of the hundreds of nodes in a supercomputer consisting of thousands of GPUs. You must work closely with model scientists to instrument best known state-of-the-art and novel tools and techniques to achieve the smooth operation of the AI jobs. As a contributing member of the core group of engineers, you would also bring to the table best practices driving architectural changes and influence roadmap of relevant software and hardware components. Your work will directly impact the business goals of a wide range of users and facilitate the next wave of growth and innovation in AI. Our newly formed organization, Microsoft AI, is dedicated to advancing Copilot and other consumer AI products and research. The team is responsible for Copilot, Bing, Edge, and generative AI research. Come be a part of the team shaping the future personal computing. Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Job Responsibility:

  • Develop and tune the pretraining scalable software for Nvidia GB200 72NVL CX8 and AMD MIxxx architectures
  • Benchmark GB200 and AMD MIxxx GPU clusters
  • Gather data and insights to develop the pretraining compute roadmap
  • Care deeply about conversational AI and its deployment
  • Actively contribute to the development of AI models that are powering our innovative products
  • Find a path to get things done despite roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven, product development cycle
  • Embody our Culture and Values

Requirements:

  • Bachelor's Degree in Computer Science, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • OR equivalent experience
  • Experience with generative AI
  • Experience with distributed computing

Nice to have:

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • OR equivalent experience
  • Experience in leading technical projects and supporting architectural decisions with data

Additional Information:

Job Posted:
March 05, 2026

Employment Type:
Fulltime
Work Type:
On-site work
Job Link Share:
PREMIUM
More languages and countries
+ Unlock 31694 hidden job offers
Languages
English Čeština Deutsch Ελληνικά Español Français +15
Countries
United States United Kingdom India Canada Australia +
See plans
Plans from $2.99 / month

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Member of Technical Staff - Software Engineer (AI infra)

Member of Technical Staff - Data Infra - MAI Superintelligence Team

Help build the world’s most advanced multimodal dataset at Microsoft AI. We are ...
Location
Location
United States , Mountain View
Salary
Salary:
163000.00 - 296400.00 USD / Year
https://www.microsoft.com/ Logo
Microsoft Corporation
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years experience in business analytics, data science, software development, data modeling, or data engineering
  • OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8+ years experience in business analytics, data science, software development, data modeling, or data engineering
  • OR equivalent experience
  • Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 12+ years experience in business analytics, data science, software development, data modeling, or data engineering
  • OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 15+ years experience in business analytics, data science, software development, data modeling, or data engineering
  • OR equivalent experience
  • 4+ years experience with data governance, data compliance and/or data security
  • Passionate about the role of data in large-scale AI model training
  • Thrive in a highly collaborative, fast-paced environment
  • Have a high degree of expertise and pay close attention to details
Job Responsibility
Job Responsibility
  • Design and develop data pipelines that ingest enormous amounts of multi-modal training data (text, audio, images, video)
  • Own and maintain critical data infrastructures, including spark, ray, vector databases, and others
  • Build and maintain cutting-edge infrastructure that can store and process the petabytes of data needed to power models
  • Partner with the pretraining and post-training teams to improve our data recipe by rigorous and careful experimentation
  • Embody our culture and values
  • Fulltime
Read More
Arrow Right

Member of Technical Staff - Data Infra - MAI Superintelligence Team

Help build the world’s most advanced multimodal dataset at Microsoft AI. We are ...
Location
Location
United States , Mountain View
Salary
Salary:
139900.00 - 274800.00 USD / Year
https://www.microsoft.com/ Logo
Microsoft Corporation
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years experience in business analytics, data science, software development, data modeling or data engineering work
  • OR Master’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ year(s) experience in business analytics, data science, software development, or data engineering work
  • OR equivalent experience
  • Bachelor’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8+ years experience in business analytics, data science, software development, data modeling or data engineering work
  • OR Master’s Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6+ years of business analytics, data science, software development, data modeling or data engineering work experience
  • OR equivalent experience
Job Responsibility
Job Responsibility
  • Design and develop data pipelines that ingest enormous amounts of multi-modal training data (text, audio, images, video)
  • Own and maintain critical data infrastructures, including spark, ray, vector databases, and others
  • Build and maintain cutting-edge infrastructure that can store and process the petabytes of data needed to power models
  • Partner with the pretraining and post-training teams to improve our data recipe by rigorous and careful experimentation
  • Embody our culture and values
  • Fulltime
Read More
Arrow Right

Member of Technical Staff, Training Infra Engineer

Contribute in and provide strong support for model training pipelines, ship stat...
Location
Location
Salary
Salary:
Not provided
cohere.com Logo
Cohere
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Extremely strong software engineering skills
  • Proficiency in Python and related ML frameworks such as JAX, Pytorch and XLA/MLIR
  • Experience with distributed training infrastructures (Kubernetes, Slurm) and associated frameworks (Ray)
  • Experience using large-scale distributed training strategies
  • Hands on experience on training large model at scale and having contributed to the tooling and/or setup of the training infrastructure
Job Responsibility
Job Responsibility
  • Design and write high-performant and scalable software for training
  • Improve our training setup from an infrastructure and codebase performance standpoint
  • Craft and implement tools to speed up our training cycles and improve the overall efficacy of our training infrastructure
  • Research, implement, and experiment with ideas on our supercompute and data infrastructure
  • Learn from and work with the best researchers in the field
What we offer
What we offer
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
  • Fulltime
Read More
Arrow Right

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
Location
United States , Chevy Chase; New York City; Palo Alto
Salary
Salary:
115000.00 USD / Year
geico.com Logo
Geico
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in computer science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python
  • strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation
  • a 401K savings plan vested from day one that offers a 6% match
  • performance and recognition-based incentives
  • and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility- We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime
Read More
Arrow Right

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
Location
United States , Palo Alto
Salary
Salary:
90000.00 USD / Year
geico.com Logo
Geico
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in computer science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python
  • strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation
  • a 401K savings plan vested from day one that offers a 6% match
  • performance and recognition-based incentives
  • and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility- We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime
Read More
Arrow Right

Member of Technical Staff, Multimodal Infrastructure

Microsoft AI is looking for a Member of Technical Staff, Multimodal Infrastructu...
Location
Location
United States , Mountain View
Salary
Salary:
139900.00 - 274800.00 USD / Year
https://www.microsoft.com/ Logo
Microsoft Corporation
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor's Degree in Computer Science, or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Experience in multi-modal data processing: Strong proficiency in distributed data processing infra (resource utilization management, fault tolerance, ray & spark) and CPU/GPU batch processing optimizations
  • Experience with state-of-art model inference and serving frameworks
  • Experience with image/video/audio data processing
  • Experience with common data formats for efficient I/O
  • Experience in multi-modal pretraining and post-training: Strong proficiency in deep learning frameworks such as PyTorch, Megatron and Deepspeed
  • Knowledge of auto-regressive and diffusion transformer models
  • Experience with distributed training techniques such as data parallelism, model parallelism, and pipeline parallelism
  • Proven experiences in at least one of the following areas: image/video generation and editing
  • efficient architectures (e.g., MoE, window attention)
Job Responsibility
Job Responsibility
  • Design, develop and maintain large-scale multimodal data processing pipelines
  • Design, develop and maintain large-scale multimodal model pretraining and post-training frameworks
  • Design, develop and maintain large-scale multimodal model inference and serving frameworks
  • Work with research scientists and product engineers to solve infra-related problems
  • Find a path to get things done despite roadblocks to get your work into the hands of users quickly and iteratively
  • Enjoy working in a fast-paced, design-driven, product development cycle
  • Embody our Culture and Values
  • Fulltime
Read More
Arrow Right
New

Senior Lecturer/Associate Professor in Literacy

As a Senior Lecturer / Associate Professor in Literacy, you will play a key role...
Location
Location
Australia , Albury-Wodonga, Bathurst, Port Macquarie, Wagga Wagga
Salary
Salary:
Not provided
csu.edu.au Logo
Charles Sturt University
Expiration Date
June 08, 2026
Flip Icon
Requirements
Requirements
  • A doctoral qualification relevant to literacy or education, with a recognised teaching qualification
  • A strong record of high-quality teaching and student-centred learning
  • An established or emerging research profile aligned to literacy, curriculum or pedagogy
  • The ability to build productive partnerships and contribute to academic leadership
Job Responsibility
Job Responsibility
  • Lead impactful literacy teaching and research
  • Teach across online and on-campus environments
  • Shape future teachers and education practice
  • Contribute to curriculum innovation
  • Build strong relationships with students and partners
  • Provide academic leadership in literacy education
  • Contribute to the School's research profile
  • Supervise higher degree research students
  • Actively engage with professional, community and government stakeholders
  • At Associate Professor level: significant academic leadership, research impact, and contribution to the broader discipline at national/international level
What we offer
What we offer
  • 17% superannuation
  • Fulltime
Read More
Arrow Right
New

Program Manager - Controls and Avionics Solutions

This position is based in Endicott, New York. New York and on-site work will be ...
Location
Location
United States , Endicott
Salary
Salary:
120874.00 - 205486.00 USD / Year
baesystems.com Logo
Baesystems
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in engineering, engineering or manufacturing management, or other discipline
  • Demonstrated ability for building strong customer/ stakeholder relationships
  • Strong communication, negotiation, and presentation skills
  • Ability to interpret data and make data-driven decisions
  • Highly adaptable with strong initiative
  • Demonstrated ability to lead and motivate cross-functional teams
  • Knowledge of the global aviation market and regulatory requirements and/ or military aviation market
Job Responsibility
Job Responsibility
  • Maintaining strong customer relationships and leading a multidisciplinary team to execute complex development programs within schedule and budget
  • Leadership and management oversight of a project team assuring that project’s financials, schedule, and technical objectives are met and that the highest level of customer satisfaction is achieved while meeting all contractual commitments
  • Work effectively and collaboratively with Engineering, Operations, and all Program Office functional leadership to assure deliveries continue to exceed customer commitments and achievement of financial commitments to the company
  • Manages, coordinates, plans, organizes, controls, integrates, and executes projects within the Military Aircraft Systems portfolio
  • Participates in the support of new business and in the development of proposals
What we offer
What we offer
  • Health insurance
  • Dental insurance
  • Vision insurance
  • Health savings accounts
  • 401(k) savings plan
  • Disability coverage
  • Life and accident insurance
  • Employee assistance program
  • Legal plan
  • Discounts on home, auto, and pet insurance
  • Fulltime
Read More
Arrow Right