Member of Technical Staff, LLM Evaluation

Microsoft Corporation

Location:
United States, Mountain View

Contract Type:
Not provided

Salary:

139900.00 - 274800.00 USD / Year

Job Description:

As a Member of Technical Staff, LLM Evaluation, you will develop and implement cutting-edge methodologies to help us evaluate how well Copilot performs in real-world usage scenarios. Users turn to Copilot for all types of endeavors, making it critical that we ensure our AI systems effectively help them meet their needs. Our vision for meeting user needs is expansive and includes not only task completion, but also affective aspects of the experience. You will be responsible for developing new methods to evaluate LLMs, training classifiers, experimenting with data collection techniques, and implementing methodologies to provide real-time signals on Copilot performance. We're looking for outstanding individuals with experience in the social sciences, machine learning, and analysis of natural language. The right candidate is a creative problem solver who will work closely with user researchers and product leaders to build automated evaluation frameworks that help us drive improvements in Copilot.

Job Responsibility:

  • Leverage expertise to measure the performance of Copilot and to identify failure modes and novel mitigation strategies, using techniques including data mining, prompt engineering, LLM-as-a-judge, and classifier training
  • Solve problems creatively, navigate complexity with clarity, and independently shape direction and deliver results even when the path isn’t obvious
  • Create and implement comprehensive evaluation frameworks across diverse scenarios, edge cases, and potential failure modes
  • Build automated testing systems, generalize solutions into repeatable frameworks, and write efficient code for model pipelines and intervention systems
  • Maintain a user-oriented perspective by understanding user needs, validating approaches through user research, and serving as a trusted advisor on AI matters
  • Track advances in research, identify relevant state-of-the-art techniques, and adapt algorithms to drive innovation in production systems serving millions of users

Requirements:

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience
  • OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 7+ years data-science experience
  • OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 10+ years data-science experience
  • OR equivalent experience
  • Experience prompting and working with large language models
  • Experience writing production-quality Python code
  • Demonstrated interest in Responsible AI

Nice to have:

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 8+ years data-science experience
  • OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 10+ years data-science experience
  • OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 12+ years data-science experience
  • OR equivalent experience

Additional Information:

Job Posted:
February 10, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Member of Technical Staff, LLM Evaluation

Member of Technical Staff, Research

As a Member of Technical Staff on the Research team, you’ll push the boundaries ...
Location:
United States, San Mateo
Salary:
175000.00 - 240000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Research background in Artificial Intelligence, Machine Learning, Physics, or similar field
  • Experience solving analytical problems using analytic and quantitative approaches
  • Experience communicating research to audiences with different backgrounds
  • Experience coding in C/C++, Python, or other similar languages
Job Responsibility:
  • Conduct foundational research to advance the capabilities, efficiency, and reliability of LLMs and multimodal systems
  • Design, implement, and evaluate novel model architectures, training methods, and optimization techniques
  • Collaborate with engineering teams to transition research prototypes into production-grade systems
  • Analyze empirical results, identify performance bottlenecks, and iterate quickly to improve model quality
  • Contribute to internal research strategy by identifying high-impact opportunities and emerging trends in AI
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
Employment Type:
Full-time

Member of Technical Staff – Model Training

At Inflection AI, our public benefit mission is to harness the power of AI to im...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • Have hands-on experience training and fine-tuning large transformer models on multi-GPU / multi-node clusters
  • Are fluent in PyTorch and its ecosystem tools (Torchtune, FSDP, DeepSpeed) and enjoy digging into distributed-training internals, mixed precision, and memory-efficiency tricks
  • Have shipped or published work in RLHF, DPO, GRPO, or RLAIF and understand their practical trade-offs
  • Care deeply about training tools, pipelines, and reproducibility—you automate the boring parts so you can iterate on the fun parts
  • Balance research curiosity with product pragmatism—you know when to run an ablation and when to ship
  • Communicate crisply with both technical and non-technical teammates
  • Have a bachelor’s degree or equivalent in a related field to the offered position requirements
Job Responsibility:
  • Contribute to end-to-end post-training workflows—dataset curation, hyper-parameter search, evaluation, and rollout—using PyTorch, Torchtune, FSDP/DeepSpeed, and our internal orchestration stack
  • Prototype and compare alignment techniques (e.g., curriculum RL, multi-objective reward modeling, tool-use fine-tuning) and push the best ideas into production
  • Automate training at scale: build robust pipeline components, tools, scripts, and dashboards so experiments are reproducible and easy to trace
  • Define the metrics that matter; run A/B tests and iterate quickly to meet aggressive quality targets
  • Collaborate with inference, safety, and product teams to land improvements in customer-facing systems
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of Technical Staff, Applied Research

The Applied Researcher role is designed for engineers who love working across ML...
Location:
United States, San Mateo
Salary:
Not provided
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • BS/MS in Computer Science, Electrical Engineering, Machine Learning, or a related field, or equivalent practical experience; open to all levels of experience
  • Strong experience with PyTorch and modern Transformer architectures
  • Solid computer science fundamentals: data structures, algorithms, concurrency, distributed systems, networking
  • Hands-on experience training, fine-tuning, or evaluating machine learning models, preferably LLMs
  • Familiarity with recent developments in the LLM research domain, including model architectures, training methods, and evaluation strategies
  • Passion for partnering with customers: understanding their constraints, co-designing solutions, and iterating based on real-world feedback
  • Curiosity and enthusiasm for exploring a wide range of problem domains and project types - from quick experiments to long-running, complex engagements
  • Ability to operate in a fast-paced, ambiguous environment and drive projects independently
Job Responsibility:
  • Sit at the intersection of ML research, systems engineering, and customer-facing problem solving
  • Work hands-on with customers and customer data to tune, evaluate and deploy models using various techniques such as SFT / DPO / RL
  • Help customers build competitive models using their unique data tailored to their unique products
  • Be the technical bridge between customer needs, customer data, and our tuning and serving infrastructure
What we offer:
  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation
Employment Type:
Full-time

Member of Technical Staff, AI Post-Training

At Microsoft AI, we are on a mission to develop the most cutting-edge algorithms...
Location:
Switzerland, Zürich
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science, or related technical discipline AND technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Expertise in post-training of AI models
  • Demonstrated experience in large-scale AI
  • Passionate about conversational AI and its deployment
  • Demonstrated written and verbal communication skills with the ability to work closely with cross-functional teams, including product managers, designers, and other engineers
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in AI
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team
  • Proven research track record in a domain-related field, supported by exceptional papers
Job Responsibility:
  • Develop data collection, evaluation, and finetuning methods for models
  • Design hypotheses and experiment plans for rapidly iterating on model performance
  • Prototype new model features and capabilities and collaborate with engineers and researchers across Microsoft AI to make them a reality
  • Collaborate with pretraining and product platform teams to establish good vertical integration and ship models that Copilot users love
  • Embody our culture and values
Employment Type:
Full-time

Staff LLM Engineer

Join GoFundMe as our next Staff LLM Engineer in the Applied Science organization...
Location:
Argentina, Buenos Aires
Salary:
Not provided
GoFundMe
Expiration Date:
Until further notice
Requirements:
  • 7+ years of hands-on experience in machine learning, data science, or related fields, with a strong emphasis on research, scientific methodology, and practical engineering applications
  • Extensive knowledge of machine learning algorithms and statistical modeling, coupled with proficiency in Python and ML libraries/frameworks such as TensorFlow, PyTorch, Scikit-learn, and others
  • Experience with LLM-based applications and development, including commercial solutions such as OpenAI API and Anthropic, as well as open-source models and frameworks like Hugging Face, LangChain, Llama, etc.
  • Experience designing, developing, and deploying end-to-end machine learning systems, including data pipelines, model training and serving, and monitoring solutions
  • Strong leadership skills, including guiding projects and mentoring team members, fostering a collaborative and high-performing work environment that values continuous improvement and knowledge sharing
  • Ability to break down complex projects, effectively scope and sequence work, and manage timelines to ensure the timely and successful delivery of machine learning initiatives
  • Excellent verbal and written communication skills, with the ability to convey complex technical concepts to both technical and non-technical stakeholders
  • Advanced degree (Master’s or Ph.D.) in Computer Science, Statistics, Data Science, or a related technical field is preferred
Job Responsibility:
  • Design and implement diverse AI and ML applications and workflows, including agents, batch inferencing processes, and real-time inferencing solutions
  • Design and contribute to the development of search and retrieval algorithms, recommender systems, personalization algorithms, and more
  • Lead the research and development of sophisticated machine learning models and model training and evaluation pipelines
  • Drive operational excellence by leading initiatives to streamline AI and machine learning workflows and establish standardized procedures to ensure consistent and high-quality outcomes across our AI/ML projects and systems
  • Continuously monitor and evaluate emerging AI and machine learning research and technologies, integrating relevant advancements to enhance our solutions
  • Collaborate with both technical and non-technical colleagues, including data scientists, software engineers, product managers, and business stakeholders, to drive the successful implementation of machine learning projects
  • Coach and mentor fellow engineers, contributing to our culture of collaboration, scientific inquiry, continuous learning, and engineering excellence
  • Employ a diverse set of tools and platforms, including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, and GitHub, to develop, deploy, and maintain scalable and robust machine learning systems
What we offer:
  • Make an Impact: Be part of a mission-driven organization making a positive difference in millions of lives every year
  • Innovative Environment: Work with a diverse, passionate, and talented team in a fast-paced, forward-thinking atmosphere
  • Collaborative Team: Join a fun and collaborative team that works hard and celebrates success together
  • Competitive Benefits: Enjoy competitive pay and comprehensive healthcare benefits
  • Holistic Support: Enjoy financial assistance for things like hybrid work and family planning, along with generous parental leave, flexible time-off policies, and mental health and wellness resources to support your overall well-being
  • Growth Opportunities: Participate in learning, development, and recognition programs to help you thrive and grow
  • Commitment to DEI: Contribute to diversity, equity, and inclusion through ongoing initiatives and employee resource groups
  • Community Engagement: Make a difference through our volunteering program

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...
Location:
United States, San Francisco
Salary:
216500.00 - 324500.00 USD / Year
GoFundMe
Expiration Date:
Until further notice
Requirements:
  • 9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
  • Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
  • Extensive experience designing, developing, and operating scalable backend systems
  • Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
  • Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
  • Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
  • Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
  • Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
  • Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)
Job Responsibility:
  • Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
  • Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
  • Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
  • Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
  • Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
  • Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
  • Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
  • Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
  • Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
  • Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure
What we offer:
  • Competitive pay
  • Comprehensive healthcare benefits
  • Financial assistance for things like hybrid work and family planning
  • Generous parental leave
  • Flexible time-off policies
  • Mental health and wellness resources
  • Learning, development, and recognition programs
Employment Type:
Full-time

Healthcare Assistant

Help us to deliver great primary care by improving access, outcomes and patient ...
Location:
United Kingdom, Nottingham
Salary:
26000.00 - 30000.00 GBP / Year
Operose Health
Expiration Date:
February 20, 2026
Requirements:
  • Experience in a Primary Care setting and phlebotomy is essential
  • Able to work within processes and procedures and maintain confidentiality and data security
  • Must be able to adapt to changing priorities and be personable, polite and patient with our patients
  • Must have basic PC skills such as Word, Excel and email
  • Ability to use own judgement and be aware of the professional boundaries within which they are working
Job Responsibility:
  • Assisting with patient duties as required and supporting other team members, such as the clinical and Nurse Lead, with patient care
  • Supporting the Practice with duties related to CQC outcomes and ensuring compliance is maintained
  • Working with patients with long-term conditions such as diabetes
  • Providing clinical procedures such as new patient health checks, BMI, blood pressure, pulse and simple wound care
  • Completing administrative tasks such as new patient registrations, providing appropriate leaflets, stock control and ordering
What we offer:
  • 27 days annual leave plus bank holidays pro rata
  • Access to our bespoke learning management system and annual formative clinical assessments to support competency development
  • Working with an at-scale provider of primary care means that we have lots of opportunities for our colleagues to specialise and develop
  • Car benefit scheme – specialising in electric vehicles
  • Cycle to work scheme
  • Travel season ticket loans
  • Discount cards
  • Employee wellbeing services including free yoga videos and employee wellbeing app
Employment Type:
Part-time

Trainee Dealer

As a Trainee Dealer, you will join our structured Dealer Training School, where ...
Location:
United Kingdom, London
Salary:
25932.00 - 25938.00 GBP / Year
360 Resourcing Solutions
Expiration Date:
Until further notice
Requirements:
  • Aged 18 or above
  • Right to work in the UK
  • No previous dealing experience required
  • Right attitude, commitment, and passion for customer service
  • Flexible scheduling for nights, evenings, weekends, and shifts in a 24/7 trading week
  • Ability to spend a large portion of shift on feet working directly with the public
Job Responsibility:
  • Participate in structured training sessions to learn the rules and dealing procedures for games such as Blackjack, Roulette and Three Card Poker
  • Develop proficiency in handling gaming equipment (cards, chips, chippers, shufflers etc.) accurately and efficiently
  • Learn and adhere to all gaming regulations, internal controls, and security procedures
  • Maintain a professional and friendly attitude toward guests to create a welcoming and enjoyable gaming environment
  • Ensure integrity and fairness in all game play, following house rules and casino standards
  • Learn to manage table inventory, exchange chips, and accurately calculate payouts
  • Deliver outstanding customer service and create memorable experiences for our guests
  • Work as part of a team to ensure smooth table operations and excellent customer service
  • Demonstrate key service behaviours: On It, Upbeat and Positive Attitude, Be Nice, Open and Close
What we offer:
  • 50% off food and beverages in all our UK venues
  • Extensive Rewards platform: discounts on travel, retail, hospitality, health and much more
  • Company Sick Pay
  • Company Pension
  • Life Assurance
  • Refer a friend incentive
  • Financial advice services
  • Employee health and wellbeing services
  • Virtual GP Services
  • Season Ticket Loans
Employment Type:
Full-time