Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI!
Job Responsibilities:
Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale
Develop and optimize runtime inference services for large-scale AI applications
Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
Conduct design and code reviews to ensure high standards of quality
Create services, tools, and developer documentation to support the inference engine
Implement robust and fault-tolerant systems for data ingestion and processing
Requirements:
3+ years of experience writing high-performance, well-tested, production-quality code
Proficiency with Python and PyTorch
Demonstrated experience building high-performance libraries and tooling
Excellent understanding of low-level operating-system concepts, including multi-threading, memory management, networking, storage, performance, and scale
Nice to have:
Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, and Optimum
Knowledge of AI inference techniques such as speculative decoding