Member of Technical Staff, Research Tooling & Data Platform

Runway

Location:
United States

Contract Type:
Not provided

Salary:

240000.00 - 290000.00 USD / Year

Job Description:

We're looking for an engineer to own Runway's internal exploratory data analysis (EDA) and evaluation platform used daily by our ML research, design, product, and creative teams. This is a high-impact role where you'll directly accelerate research velocity and enable better decision-making across the company. This platform helps researchers query large-scale datasets, run evaluations on model outputs, and analyze results - all through an intuitive interface. As the owner, you'll be responsible for the full product experience: from database query optimization and infrastructure management to building user-facing features that make complex ML workflows accessible to non-engineers.

Job Responsibility:

  • Own the EDA platform end-to-end: Take full ownership of architecture, infrastructure, feature development, and operations
  • Optimize for scale: Improve query performance and write efficiency for vector search, integrate with new data warehouses, and optimize our custom query parsing/suggestion system
  • Build for researchers: Design and ship features that help ML researchers source data faster, run more effective evaluations, and iterate quickly
  • Enable cross-functional users: Work with design, product, and creative teams to build intuitive evaluation workflows
  • Manage infrastructure: Deploy and maintain services across ECS and Kubernetes, including embedding services and database integrations
  • Provide support: Be responsive to user needs, debug issues quickly, and gather feedback to prioritize improvements

Requirements:

  • 4+ years of industry experience in a backend-focused software engineering role
  • Strong experience in at least 2 of 3 areas (platform/infrastructure, ML domain knowledge, frontend/product engineering) with eagerness to learn the third
  • Platform/infrastructure: Experience with vector databases, cloud primitives (e.g., SQS, ECR, Kinesis), and container orchestration (Kubernetes, ECS)
  • ML domain knowledge: Understanding of ML workflows, model training, evaluation, testing, dataset management, feature engineering, or research tooling
  • Product engineering: Ability to build clean, intuitive user experiences with product thinking and user empathy. You care deeply about building tools people love to use (TypeScript/React experience is a plus)
  • Comfortable setting up and maintaining production infrastructure and services
  • Self-starter who can navigate ambiguity and make pragmatic technical decisions
  • Humility and open-mindedness

Nice to have:

TypeScript/React experience is a plus

Additional Information:

Job Posted:
January 20, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Member of Technical Staff, Research Tooling & Data Platform

Member of Technical Staff - Platform Engineer

Platform Engineer to join our team building backend infrastructure for new ML-po...
Location:
United States, Palo Alto
Salary:
175000.00 - 350000.00 USD / Year
Inflection AI
Expiration Date:
Until further notice
Requirements:
  • Backend engineering experience with Python, TypeScript, or Node.js
  • Hands-on experience working with production PyTorch models, model checkpoints, and inference logic
  • Strong knowledge of building APIs and services that are scalable, stable, and secure
  • Passion for bridging backend engineering and ML systems, especially at the infrastructure layer
  • Familiarity with tools such as FastAPI, Postgres, Redis, Kubernetes, and React
  • Desire to be hands-on and contribute to shaping the foundation of a new enterprise ML product
  • Bachelor’s degree or equivalent in a field related to the position
Job Responsibility:
  • Build and maintain backend services to support LLM integration, inference orchestration, and data flow
  • Write clean, reliable Python code for experimentation, model integration, and production systems
  • Collaborate closely with ML researchers to rapidly iterate on product ideas and deploy features
  • Design and implement infrastructure to handle scalable inference workloads and enterprise-level use cases
  • Own system components and ensure reliability, observability, and maintainability from day one
What we offer:
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options

Member of Technical Staff, AI Training Infrastructure

As a Training Infrastructure Engineer, you'll design, build, and optimize the in...
Location:
United States, San Mateo
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
  • 3+ years of experience with distributed systems and ML infrastructure
  • Experience with PyTorch
  • Proficiency in cloud platforms (AWS, GCP, Azure)
  • Experience with containerization, orchestration (Kubernetes, Docker)
  • Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)
Job Responsibility:
  • Design and implement scalable infrastructure for large-scale model training workloads
  • Develop and maintain distributed training pipelines for LLMs and multimodal models
  • Optimize training performance across multiple GPUs, nodes, and data centers
  • Implement monitoring, logging, and debugging tools for training operations
  • Architect and maintain data storage solutions for large-scale training datasets
  • Automate infrastructure provisioning, scaling, and orchestration for model training
  • Collaborate with researchers to implement and optimize training methodologies
  • Analyze and improve efficiency, scalability, and cost-effectiveness of training systems
  • Troubleshoot complex performance issues in distributed training environments
What we offer:
  • Meaningful equity in a fast-growing startup
  • Comprehensive benefits package
  • Fulltime

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location:
United States, New York, NY; San Mateo, CA; Redwood City, CA
Salary:
175000.00 - 220000.00 USD / Year
Fireworks AI
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages like Python or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility:
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLFlow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer:
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
  • Fulltime

Staff Software Engineer - AI/ML Platform

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Chevy Chase; New York City; Palo Alto
Salary:
115000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401K savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Palo Alto
Salary:
90000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401K savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location:
United States, Chevy Chase; New York City; Palo Alto
Salary:
115000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility:
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation, a 401K savings plan vested from day one that offers a 6% match, performance and recognition-based incentives, and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime

Feasibility Specialist

The Feasibility Specialist is responsible for supporting feasibility processes f...
Location:
Australia
Salary:
Not provided
Parexel
Expiration Date:
Until further notice
Requirements:
  • Korean/English: business-level proficiency is a must
  • Supporting the Feasibility, Strategy, & Analytics Lead (FSAL) to ensure knowledge of the goals, scope and requirements of a Site Feasibility project, and ensures that high quality insights are delivered
  • Clinical Systems Support: Administer and maintain clinical trial management systems (CTMS) and other relevant Feasibility tools
  • Data Analysis: Perform research and data analysis to identify suitable clinical trial sites.
  • Feasibility Activities: Assist with the setup of feasibility studies, including outreach to potential sites.
  • Stakeholder Coordination: Communicate and coordinate with FSAL, site staff, and other stakeholders to support feasibility efforts
  • Troubleshooting: Address and resolve any issues or discrepancies in the feasibility process as they arise or as directed
  • Acts as a supportive team member for Regional Intelligence
  • Performs tasks for multiple Site Intelligence and Feasibility projects.
  • Able to manage a high volume of complex studies and sites
Job Responsibility:
  • Supporting feasibility processes for clinical trials
  • Conducting research and data analysis to identify potential Clinical Trial Investigators and sites
  • Ensuring data accuracy in clinical systems
  • Collaborating with various internal stakeholders to assist in optimizing study design and execution
  • Engaging with clinical trial sites/Investigators to gather feasibility survey or additional requested data and ensure accurate and comprehensive responses.

Data Scientist

The Data Scientist plays a pivotal role in planning, executing, and delivering m...
Location:
United States, Camden
Salary:
Not provided
NTT DATA
Expiration Date:
Until further notice
Requirements:
  • Master’s or PhD in Computer Science, Data Science, Engineering, Statistics, Applied Mathematics, Operations Research, or a related quantitative field
  • Specialization in ML, AI, cognitive science, or data science is highly preferred
  • 3-5 years of hands-on experience planning and executing end-to-end data science projects with demonstrated impact on clinical or operational outcomes in business environments
  • Advanced programming proficiency in Python or R with strong expertise in machine learning frameworks (scikit-learn, TensorFlow, PyTorch) and statistical analysis tools
  • Expertise in machine learning and statistical techniques including supervised/unsupervised learning, deep learning, NLP, computer vision, regression models, ensemble methods, and experimental design (A/B testing)
  • Strong data engineering capabilities including SQL/NoSQL database programming, distributed computing tools (Hadoop, Spark, Kafka), data pipeline development, and experience with cloud platforms (AWS, Azure, GCP)
  • Production ML and MLOps experience including model deployment, monitoring, containerization (Docker, Kubernetes), version control, and applying DevOps principles to data science workflows
  • Data visualization and communication excellence with ability to create compelling dashboards (Tableau, Power BI), translate complex technical findings into actionable insights, and present to diverse audiences from executives to frontline staff
  • Cross-functional collaboration skills with proven ability to work in agile environments, partner with stakeholders to align technical solutions with business objectives, and mentor junior team members
  • Healthcare domain knowledge preferred, particularly experience with Epic EHR systems, clinical workflows, and healthcare data standards, along with relevant certifications (Clarity/Caboodle, Google Cloud ML Engineer, AWS ML Specialist)
Job Responsibility:
  • Collect, clean, and analyze datasets from diverse internal and external sources, applying advanced data wrangling techniques
  • Acquire access to various databases and source systems (SQL, NoSQL, graph databases) and create data pipelines
  • Apply statistical analysis and visualization techniques to explore and prepare data
  • Design, develop, and validate machine learning, statistical, and optimization models
  • Select appropriate algorithms and models for AI/ML and test them for accuracy, robustness, and fairness
  • Perform feature selection and engineering
  • Integrate domain knowledge into ML solutions
  • Conduct controlled experiments (A/B and multivariate testing)
  • Collaborate with MLOps, data engineers, and IT to evaluate deployment options
  • Continuously monitor execution and health of production ML models
  • Fulltime