Machine Learning Engineer - Inference

Together AI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

160000.00 - 230000.00 USD / Year

Job Description:

Together AI is seeking a Machine Learning Engineer to join our Inference Engine team, focusing on optimizing and enhancing the performance of our AI inference systems. This role involves working with state-of-the-art large language models and ensuring they run efficiently and effectively at scale. If you are passionate about AI inference, PyTorch, and developing high-performance systems, we want to hear from you. This position offers the chance to collaborate closely with AI researchers and engineers to create cutting-edge AI solutions. Join us in shaping the future at Together AI!

Job Responsibility:

  • Design and build the production systems that power the Together AI inference engine, enabling reliability and performance at scale
  • Develop and optimize runtime inference services for large-scale AI applications
  • Collaborate with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
  • Conduct design and code reviews to ensure high standards of quality
  • Create services, tools, and developer documentation to support the inference engine
  • Implement robust and fault-tolerant systems for data ingestion and processing

Requirements:

  • 3+ years of experience writing high-performance, well-tested, production-quality code
  • Proficiency with Python and PyTorch
  • Demonstrated experience in building high-performance libraries and tooling
  • Excellent understanding of low-level operating systems concepts including multi-threading, memory management, networking, storage, performance, and scale

Nice to have:

  • Knowledge of existing AI inference systems such as TGI, vLLM, TensorRT-LLM, Optimum
  • Knowledge of AI inference techniques such as speculative decoding
  • Knowledge of CUDA/Triton programming
  • Knowledge of Rust, Cython and compilers

What we offer:

  • competitive compensation
  • startup equity
  • health insurance
  • other competitive benefits

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Fulltime

Similar Jobs for Machine Learning Engineer - Inference

Senior Machine Learning Engineer, Personalization and Recommendations

As a Senior Machine Learning Engineer on the Personalization & Recommendations t...
Location:
United States, San Francisco
Salary:
183360.00 - 248000.00 USD / Year
EdTech Jobs
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in applied machine learning or ML-heavy software engineering, with a strong focus on personalization, ranking, or recommendation systems
  • Demonstrated impact improving key metrics such as CTR, retention, or engagement through recommender or search systems in production
  • Strong hands-on skills in Python and PyTorch, with expertise in data and feature engineering, distributed training and inference on GPUs, and familiarity with modern MLOps practices — including model registries, feature stores, monitoring, and drift detection
  • Deep understanding of retrieval and ranking architectures, such as Two-Tower models, deep cross networks, Transformers, or MMoE, and the ability to apply them to real-world problems
  • Experience with large-scale embedding models and vector search, including FAISS, ScaNN, or similar systems
  • Proficiency in experiment design and evaluation, connecting offline metrics (AUC, NDCG, calibration) with online A/B test outcomes to drive product decisions
  • Clear, effective communication, collaborating well with product managers, data scientists, engineers, and cross-functional partners
  • A growth and mentorship mindset, helping elevate team quality in modeling, experimentation, and reliability
  • Commitment to responsible and inclusive personalization, ensuring our systems respect learner privacy, fairness, and diverse goals
Job Responsibility
  • Design and implement personalization models across candidate retrieval, ranking, and post-ranking layers, leveraging user embeddings, contextual signals and content features
  • Develop scalable retrieval and serving systems using architectures such as Two-Tower models, deep ranking networks, and ANN-based vector search for real-time personalization
  • Build and maintain model training, evaluation, and deployment pipelines, ensuring reliability, training–serving consistency, observability, and robust monitoring
  • Partner with Product and Data Science to translate learner objectives (engagement, retention, mastery) into measurable modeling goals and experiment designs
  • Advance evaluation methodologies, contributing to offline metric design (e.g., NDCG, CTR, calibration) and supporting rigorous A/B testing to measure learner and business impact
  • Collaborate with platform and infrastructure teams to optimize distributed training, inference latency, and serving cost in production environments
  • Stay informed on industry and research trends, evaluating opportunities to meaningfully apply them within Quizlet’s ecosystem
  • Mentor junior and mid-level engineers, supporting technical growth, experimentation rigor, and responsible ML practices
  • Champion collaboration, inclusion, curiosity, and data-driven problem solving, contributing to a healthy and productive team culture
What we offer
  • 20 vacation days
  • Competitive health, dental, and vision insurance (100% employee and 75% dependent PPO, Dental, VSP Choice)
  • Employer-sponsored 401k plan with company match
  • Access to LinkedIn Learning and other resources to support professional growth
  • Paid Family Leave, FSA, HSA, Commuter benefits, and Wellness benefits
  • 40 hours of annual paid time off to participate in volunteer programs of choice
  • Fulltime

Engineering Manager - Machine Learning

We’re looking for an experienced Engineering Manager to lead the ML Soundtrack t...
Location:
Sweden, Stockholm
Salary:
Not provided
Epidemic Sound
Expiration Date
Until further notice
Requirements
  • Deep ML engineering background with hands-on experience in generative diffusion models for audio/music (including PyTorch and modern training stacks)
  • Proven experience deploying ML systems into production at scale, with a focus on latency, stability, and cost
  • Strong ML system design and architecture skills across the full machine learning lifecycle
  • Track record of managing engineering teams
  • Demonstrated ability to set clear goals, manage performance, and grow engineers through mentorship and feedback
Job Responsibility
  • Own the technical roadmap and model strategy for generative music, including diffusion and transformer-based approaches
  • Lead the full lifecycle from research to production, championing training, evaluation, and deployment for real-time inference
  • Drive the productionisation of inference through model optimisation (distillation, quantisation), caching, and cost controls
  • Build and maintain team health through effective rituals, 1:1s, and fostering a psychologically safe, high-ownership culture
  • Manage cross-team dependencies and delivery with data, MLOps, and product engineering teams
  • Fulltime

Software Engineer, Machine Learning

Figma is seeking a versatile and experienced Machine Learning / AI Engineer to j...
Location:
United States, San Francisco; New York
Salary:
149000.00 - 350000.00 USD / Year
Figma
Expiration Date
Until further notice
Requirements
  • 5+ years of industry experience in software engineering
  • 3+ years focused on applied machine learning or AI
  • Strong experience with end-to-end ML model development, including training, evaluation, deployment, and monitoring
  • Proficiency in Python and familiarity with ML libraries like PyTorch, TensorFlow, Scikit-learn, Spark MLlib, or XGBoost
  • Experience designing and building scalable data and annotation pipelines, as well as evaluation systems for AI model quality
  • Experience mentoring or leading others and contributing to a culture of technical excellence and innovation
Job Responsibility
  • Design, build, and productionize ML models for Search, Discovery, Ranking, Retrieval-Augmented Generation (RAG), and generative AI features
  • Build and maintain scalable data pipelines to collect high-quality training and evaluation datasets, including annotation systems and human-in-the-loop workflows
  • Collaborate with AI researchers to iterate on datasets, evaluation metrics, and model architectures to improve quality and relevance
  • Work with product engineers to define and deliver impactful AI features across Figma’s platform
  • Partner with infrastructure engineers to develop and optimize systems for training, inference, monitoring, and deployment
  • Explore new ideas at the edge of what’s technically possible and help shape the long-term AI vision at Figma
What we offer
  • equity
  • health, dental & vision benefits
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
  • Fulltime

Senior Staff Machine Learning Engineer

Help design our AI platform and develop our next generation of machine learning ...
Location:
United States, San Francisco
Salary:
216500.00 - 324500.00 USD / Year
GoFundMe
Expiration Date
Until further notice
Requirements
  • 9+ years of hands-on experience in machine learning engineering, AI development, software engineering, or related fields
  • Experience emphasizing secure, large-scale, distributed system design, AI/ML pipeline development, and implementation
  • Extensive experience designing, developing, and operating scalable backend systems
  • Experience applying software engineering best practices such as domain-driven design, event-driven architectures, and microservices
  • Deep expertise in agentic workflows, AI evaluation solutions, prompt management, and secure AI development and testing practices
  • Strong knowledge of relational and document-based databases, data storage paradigms, and efficient RESTful API design
  • Experience establishing robust CI/CD pipelines, automated testing (unit and integration), and deployment practices
  • Strong leadership skills, including effective planning and management of complex projects, mentoring of team members, and fostering a collaborative, high-performing engineering culture
  • Excellent communicator, able to articulate complex technical concepts clearly to both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field (preferred)
Job Responsibility
  • Design and implement AI platforms to enable scalable and secure access to LLMs from multiple model providers for diverse use cases
  • Design and implement agentic workflows, agentic tool ecosystems, and LLM prompt management solutions
  • Design, build, and optimize scalable model training, fine tuning, and inference pipelines, ensuring robust integration with production systems
  • Influence technical strategy and approach to developing embedding stores, vector databases, and other reusable assets
  • Lead initiatives to streamline ML and AI workflows, improve operational efficiency, and establish standardized procedures to achieve consistent, high-quality results across our AI systems
  • Design and develop backend services and RESTful APIs using Python and FastAPI, integrating seamlessly with ML pipelines and services
  • Take operational responsibility for team-owned services, including performance monitoring, optimization, troubleshooting, and participation in an on-call rotation
  • Collaborate with both technical and non-technical colleagues, including data and applied scientists, software engineers, product managers, and business stakeholders, to deliver reliable and scalable ML-driven products
  • Coach and mentor fellow ML engineers, promoting a culture of collaboration, continuous improvement, and engineering excellence within the team
  • Employ a diverse set of tools and platforms including Python, AWS, Databricks, Docker, Kubernetes, FastAPI, Terraform, Snowflake, Coralogix, and GitHub to build, deploy, and maintain scalable, highly available machine learning infrastructure
What we offer
  • Competitive pay
  • Comprehensive healthcare benefits
  • Financial assistance for things like hybrid work, family planning
  • Generous parental leave
  • Flexible time-off policies
  • Mental health and wellness resources
  • Learning, development, and recognition programs
  • Fulltime

Senior Machine Learning Engineer

A venture-backed startup at the intersection of AI and national security is buil...
Location:
United States, New York City Metropolitan Area
Salary:
Not provided
Orbis Consultants
Expiration Date
Until further notice
Requirements
  • Strong coding background in Python or Golang
  • Experience taking ML/LLM systems from prototype to production
  • Skills in Kubernetes, Docker, or cloud infrastructure (AWS preferred)
  • Someone who thrives in early-stage environments and enjoys solving hard technical problems
Job Responsibility
  • Build and scale production-grade ML services, including LLM applications
  • Develop APIs and infrastructure that integrate seamlessly into mission-critical workflows
  • Work hands-on across the stack: from systems and storage through to inference and application layers
  • Tackle complex data security and privacy challenges in real-world environments
What we offer
  • Significant equity
  • Strong health & wellness benefits
  • Work on technology that truly matters in national security
  • Join a small, sharp team where your work will have immediate impact
  • Fulltime

Engineering Manager - Machine Learning Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
241200.00 - 400000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • 8–10 years of experience in ML infrastructure, including direct hands-on expertise as an engineer, IC/TL
  • 2+ years of experience managing infrastructure or ML platform engineers
  • Proven experience delivering and operating ML or AI infrastructure at scale
  • Solid technical depth across ML/AI infrastructure domains (e.g., feature stores, pipelines, deployment, inference, observability)
  • Demonstrated ability to drive execution on complex technical projects with cross-team stakeholders
  • Strong communication and stakeholder management skills
Job Responsibility
  • Lead and support the ML Infra team, driving project execution and ensuring delivery on key commitments
  • Build and launch Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Define and drive adoption of an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines, deployment tooling, and inference systems
  • Partner with ML product teams to understand requirements and deliver solutions that accelerate model development and iteration
  • Recruit, mentor, and develop engineers, fostering a collaborative and high-performing team culture
What we offer
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
  • Fulltime

Sr. Machine Learning Engineer, AdTech

As a member of our Data Science Engineering team, the Sr. Machine Learning Engin...
Location:
United Kingdom
Salary:
Not provided
PulsePoint
Expiration Date
Until further notice
Requirements
  • 5 years minimum of experience in machine learning/data science
  • Key Skills: Python, Algorithms, Optimisation, NLP, Data Mining, Statistical Analysis, Neural Networks, Generalised Linear Regression, Multiclass Classification, Java, R
  • Advanced knowledge of Python using standard DS packages (numpy/pandas/scikit, etc.)
  • Ability to optimize and speed up code
  • 3+ years of experience with RTB auctions or similar online technologies
  • Algorithms and Data Structures (e.g., sorting, search tree, binary heap, trie; time & memory complexities of algorithms)
  • Probability and Statistics (e.g., hypothesis testing; Markov process and its stationary distributions, stochastic matrix and its properties; Bayesian inference)
Job Responsibility
  • Analyzing and optimizing real-time bidding strategies and online auction mechanics
  • Developing new or improving existing models of event predictions
  • New feature engineering for multiple machine learning models: user embeddings and clustering, fraud detection, etc.
  • Cross-device user identification, cookieless mechanisms development
  • Mining different data sources
  • Supporting existing codebase for data integration and production support for our core models.
  • Fulltime

Senior Machine Learning Engineer

Start.io, a leading mobile marketing and audience platform, empowers the app eco...
Location:
Salary:
Not provided
Start.io
Expiration Date
Until further notice
Requirements
  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related technical discipline
  • 5+ years of experience building high-performance backend or ML inference systems
  • Deep expertise in Python and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML)
  • Experience with scalable service architecture, message queues (Kafka, Pub/Sub), and async processing
  • Strong understanding of model deployment practices, online/offline feature parity, and real-time monitoring
  • Experience in cloud environments (AWS, GCP, or OCI) and container orchestration (Kubernetes)
  • Experience working with in-memory and NoSQL databases (e.g. Aerospike, Redis, Bigtable) to support ultra-fast data access in production-grade ML services
  • Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry) and best practices for alerting and diagnostics
  • A strong sense of ownership and the ability to drive solutions end-to-end
  • Passion for performance, clean architecture, and impactful systems
Job Responsibility
  • Own and lead the design and development of low-latency Algo inference services handling billions of requests per day
  • Build and scale robust real-time decision-making engines, integrating ML models with business logic under strict SLAs
  • Collaborate closely with DS to deploy models seamlessly and reliably in production
  • Design systems for model versioning, shadowing, and A/B testing at runtime
  • Ensure high availability, scalability, and observability of production systems
  • Continuously optimize latency, throughput, and cost-efficiency using modern tooling and techniques
  • Work independently while interfacing with cross-functional stakeholders from Algo, Infra, Product, Engineering, BA & Business
What we offer
  • Lead the mission-critical inference engine that drives our core product
  • Join a high-caliber Algo group solving real-time, large-scale, high-stakes problems
  • Work on systems where every millisecond matters, and every decision drives real value
  • Enjoy a fast-paced, collaborative, and empowered culture with full ownership of your domain