Member of Technical Staff, LLM Evaluation Job at Microsoft Corporation (Mountain View)

Member of Technical Staff, Data Analysis and Evaluation

As a Member of Technical Staff in Data Analysis and Evaluation, you will play a ...

Location

Salary:

Not provided

Cohere

Expiration Date

Until further notice

Requirements

Extremely strong software engineering skills
Strong expertise in designing and conducting data collection tasks, including working with human annotators
Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
Experience analysing datasets with respect to their quality, biases, and suitability for training ML models
Hands-on experience training large language models (LLMs) on distributed training infrastructures
Familiarity with evaluating and improving the generalisability and robustness of ML systems
Proficiency in programming languages such as Python and ML frameworks (e.g., PyTorch, TensorFlow, JAX)
Excellent communication skills to collaborate effectively with cross-functional teams and present findings
One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)

Job Responsibility

Design and oversee data collection tasks, including supporting human annotators and ensuring data quality
Develop and apply statistical methods to evaluate the quality and reliability of datasets
Analyse and assess the generalisability and robustness of ML systems across diverse use cases
Collaborate with teams to improve dataset quality and model performance
Train and fine-tune large language models (LLMs) on distributed training infrastructures
Conduct experiments to evaluate model performance and identify areas for improvement

What we offer

An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
6 weeks of vacation (30 working days!)

Fulltime

Member of Technical Staff

The Microsoft AI Super Intelligence Post-Training team is dedicated to advancing...

Location

India, Bangalore

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor’s or master’s degree in computer science, engineering, or a related field, or equivalent practical experience
5+ years of professional experience, including 2+ years with Python and ML frameworks such as PyTorch or TensorFlow
Hands-on experience with training or fine-tuning LLMs or multimodal models
Familiarity with production ML systems and concepts like model serving, caching, batching, and monitoring
Understanding of distributed systems and cloud-based infrastructure

Job Responsibility

Implement large-scale model training, especially with LLMs, SLMs, multimodal, or code-specific models
Develop robust evaluation frameworks to assess model performance, conduct systematic benchmarking, and address identified weaknesses while ensuring compliance with customer standards
Write efficient, production-quality code and debug complex distributed systems
Build and maintain internal tools to streamline training and evaluation workflows and automate repetitive tasks within secure development environments

Fulltime

Member of Technical Staff - Post Training - MAI Superintelligence Team

At Microsoft AI, we are on a mission to develop the most cutting-edge algorithms...

Location

United States, Mountain View

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Have experience with reward modeling, RL, or other post-training techniques

Job Responsibility

Develop data collection, evaluation, and post-training methods for models
Design hypotheses and experiment plans for rapidly iterating on model performance

Fulltime

Member of Technical Staff, Software Engineer

Help build the infrastructure that powers training, evaluation, and data platfor...

Location

Switzerland, Zürich

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Strong software engineering background building reliable, scalable production systems (Python preferred)
Hands‑on experience supporting large‑scale ML / LLM training, evaluation, or experimentation infrastructure
Operating GPU‑heavy workloads in cloud environments using Docker and Kubernetes (scheduling, utilization, isolation)
Designing and running data / compute pipelines and orchestration (e.g., Airflow, Argo) with object storage (Azure Blob / S3)
Platform reliability and operability: observability, metrics, logging, tracing, alerting (Prometheus, Grafana, OpenTelemetry)

Job Responsibility

Design and build core platform services for scalable training and evaluation, including cluster orchestration, job scheduling, data and compute pipelines, and artifact management
Standardize containerized workflows by maintaining Docker images, CI/CD, and runtime configurations
Advocate for best practices in security, reproducibility, and cost efficiency
Implement end-to-end observability and operations through metrics, tracing, logging, dashboard development, monitoring, and automated alerts for model training and platform health (using Prometheus, Grafana, OpenTelemetry)
Architect and operate services on Azure cloud platforms, managing infrastructure-as-code (Terraform/Helm), secrets, networking, and storage
Enhance developer experience by creating tools, CLIs, and portals that simplify job submission, metrics analysis, and experiment management for generalist software engineering and research teams
Enforce security and compliance policies for data access, container hardening, and supply-chain integrity, and partner with security and privacy teams to maintain robust practices in multi-tenant environments and secret management
Collaborate cross-functionally with data, model, and product teams to align infrastructure roadmaps with training needs, evaluation protocols, and Copilot product goals

Fulltime

Member of Technical Staff - Machine Learning

As a Member of Technical Staff - Machine Learning (AI Team), you will work to cr...

Location

United States, Mountain View

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Doctorate in Computer Science, Machine Learning, Human-Centered AI or related field AND 2+ year(s) experience (e.g., finetuning models with supervision or reinforcement learning, understanding and fixing data quality and curation, working with collaborators on creating new products)
OR Master's Degree in Computer Science, Machine Learning, or related field AND 5+ years experience (e.g., managing structured and unstructured data, developing and debugging models, creating infrastructure for AI-powered products)
OR Bachelor's Degree in Computer Science, Mathematics, Machine Learning, Physics, or related field AND 7+ years data-science experience (e.g., managing structured and unstructured data, applying machine learning techniques and driving product direction)
Demonstrated engineering experience or research experience (e.g. creating or leading the creation of a feature in a different company, complex graduate work, research papers, or other experience)
4+ years of data science experience (e.g., managing structured and unstructured data, applying machine learning techniques and driving product direction)
Experience prompting, evaluating, and working with large language models
Experience writing production-quality Python code

Job Responsibility

Leverage subject matter expertise to improve model quality for interactive and agentive experiences
Oversee data acquisition or generation efforts, ensuring that the data meets the model needs
Generalize machine learning (ML) solutions into repeatable frameworks
Lead evaluation efforts of models, including those deployed within Microsoft products and the Cloud API
Track advances in industry and academia, identify relevant state-of-the-art research, and adapt algorithms and/or techniques to drive innovation and develop new solutions
Independently write efficient, readable, extensible code and model pipelines
Commit to a customer-oriented focus by acknowledging customer needs and perspectives and building AI products that delight customers

Fulltime

Member of Technical Staff, Principal Engineering Manager

As Microsoft continues to push the boundaries of AI, we are on the lookout for s...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Demonstrated track record of building and scaling engineering organizations (hiring teams from scratch, structuring orgs, growing managers)
Experience delivering large-scale software systems in AI, machine learning, or related fields
Experience managing organizations of 30+ engineers across multiple teams and workstreams
Deep expertise in LLM evaluation, AI quality measurement, or ML infrastructure at scale
Track record of partnering with senior leadership (VP/CVP level) to set strategy and drive cross-organizational programs
Experience recruiting and developing senior engineering talent (principal engineers, engineering managers) in a competitive market
Proven ability to operate effectively in fast-paced, ambiguous environments — comfortable making decisions with incomplete information and course-correcting quickly
Strong technical judgment: ability to evaluate architectural tradeoffs, assess technical risk, and guide teams toward sound engineering decisions without needing to write the code yourself
Experience leading distributed or multi-site engineering teams

Job Responsibility

Build and lead a multi-team engineering organization (30+ engineers across multiple teams), including hiring and developing engineering managers who lead their own teams
Set the technical and organizational strategy for Copilot AI Evaluation and response quality, aligning with MAI's broader product and engineering vision
Partner with senior Eng and Product leadership (Partner+ level) to define priorities, influence roadmaps, and drive cross-organizational initiatives
Own end-to-end delivery of evaluation platforms, novel evaluation techniques, and agentic solutions for measuring and improving Copilot quality at scale
Recruit, develop, and retain world-class engineering talent — building a culture of technical excellence, accountability, and continuous learning
Drive operational rigor: establish engineering processes, quality bars, and delivery cadences that enable predictable, high-quality execution across multiple concurrent workstreams
Navigate ambiguity and make high-judgment tradeoff decisions on technology, staffing, and investment priorities in a fast-moving AI landscape
Foster a diverse, inclusive team culture where engineers at all levels can do their best work and grow their careers
Embody our Culture and Values

Fulltime

Member of Technical Staff

The Microsoft AI Superintelligence Post Training team is dedicated to advancing ...

Location

United States, Redmond

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Doctorate in relevant field AND 3+ years related research experience OR Master's Degree in relevant field AND 4+ years related research experience OR Bachelor's Degree in relevant field AND 6+ years related research experience OR equivalent experience
5+ years of coding experience in Python and experience with ML frameworks such as PyTorch and Triton
3+ years of experience in data curation and synthesis, creating and refining datasets to optimize training outcomes
3+ years of proven ability to design and scale training infrastructure and pipelines in production environments
3+ years of large-scale model training - especially with LLMs, SLMs, multimodal, or code-specific models
Prior research publication record with over 3000 citations
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Job Responsibility

Perform large-scale model training, especially with LLMs, SLMs, multimodal, or code-specific models
Perform data curation and synthesis, creating and refining datasets to optimize training outcomes
Hands-on coding: write efficient, production-quality code and debug complex training jobs
Work on both proprietary and open-source frameworks, with demonstrated proficiency in training pipelines and architecture
Full-stack modeling responsibility, from data ingestion and training to evaluation and inference management
Contribute to or build on existing innovations, such as the technical reports of well-known models
Develop novel AI solutions that bridge language, vision, and code understanding
Help develop models powering tools like GitHub Copilot, Cursor, and VS Code suggestions
Embody our Culture and Values

Fulltime

Member of Technical Staff - Machine Learning

As a Member of Technical Staff - Machine Learning, you will work to create LLM m...

Location

United States, Mountain View

Salary:

163000.00 - 296400.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Demonstrated engineering experience or research experience (e.g. creating or leading the creation of a feature in a different company, complex graduate work, research papers, or other experience)
Experience prompting, evaluating, and working with large language models
Experience writing production-quality Python code

Job Responsibility

Own and pursue a research agenda to improve model capability and performance for agentive applications
Collaborate closely with other research and product teams, from pretraining to model hosting, to unlock new model capabilities
Build robust evaluations for tracking modeling improvements
Design, implement, test, and debug code across our research stack

Fulltime
