CrawlJobs Logo

Member of Technical Staff, Research Tooling & Data Platform

runwayml.com Logo

Runway

Location Icon

Location:
United States

Category Icon

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

240000.00 - 290000.00 USD / Year
Save Job
Save Icon
Job offer has expired

Job Description:

We're looking for an engineer to own Runway's internal exploratory data analysis (EDA) and evaluation platform used daily by our ML research, design, product, and creative teams. This is a high-impact role where you'll directly accelerate research velocity and enable better decision-making across the company. This platform helps researchers query large-scale datasets, run evaluations on model outputs, and analyze results - all through an intuitive interface. As the owner, you'll be responsible for the full product experience: from database query optimization and infrastructure management to building user-facing features that make complex ML workflows accessible to non-engineers.

Job Responsibility:

  • Own the EDA platform end-to-end: Take full ownership of architecture, infrastructure, feature development, and operations
  • Optimize for scale: Improve query performance and write efficiency for vector search, integrate with new data warehouses, and optimize our custom query parsing/suggestion system
  • Build for researchers: Design and ship features that help ML researchers source data faster, run more effective evaluations, and iterate quickly
  • Enable cross-functional users: Work with design, product, and creative teams to build intuitive evaluation workflows
  • Manage infrastructure: Deploy and maintain services across ECS and Kubernetes, including embedding services and database integrations
  • Provide support: Be responsive to user needs, debug issues quickly, and gather feedback to prioritize improvements

Requirements:

  • 4+ years of industry experience in a backend focused software engineering role
  • Strong experience in at least 2 of 3 areas (platform/infrastructure, ML domain knowledge, frontend/product engineering) with eagerness to learn the third
  • Platform/infrastructure: experience with vector databases, cloud primitives (i.e. SQS, ECR, Kinesis) and container orchestration (Kubernetes, ECS)
  • ML domain knowledge: Understanding of ML workflows, model training, evaluation, testing, dataset management, feature engineering, or research tooling
  • Product engineering: Ability to build clean, intuitive user experiences with product thinking and user empathy. You care deeply about building tools people love to use (TypeScript/React experience is a plus)
  • Comfortable setting up and maintaining production infrastructure and services
  • Self-starter who can navigate ambiguity and make pragmatic technical decisions
  • Humility and open mindedness

Nice to have:

TypeScript/React experience is a plus

Additional Information:

Job Posted:
January 20, 2026

Employment Type:
Fulltime
Work Type:
Remote work
Job Link Share:
PREMIUM
More languages and countries
+ Unlock 30834 hidden job offers
Languages
English Čeština Deutsch Ελληνικά Español Français +15
Countries
United States United Kingdom India Canada Australia +
See plans
Plans from $2.99 / month

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Member of Technical Staff, Research Tooling & Data Platform

Member of Technical Staff - Platform Engineer

Platform Engineer to join our team building backend infrastructure for new ML-po...
Location
Location
United States , Palo Alto
Salary
Salary:
175000.00 - 350000.00 USD / Year
inflection.ai Logo
Inflection AI
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Backend engineering experience with Python, TypeScript, or Node.js
  • Hands-on experience working with production PyTorch models, model checkpoints, and inference logic
  • Strong knowledge of building APIs and services that are scalable, stable, and secure
  • Passion for bridging backend engineering and ML systems, especially at the infrastructure layer
  • Familiarity with tools such as FastAPI, Postgres, Redis, Kubernetes, and React
  • Desire to be hands-on and contribute to shaping the foundation of a new enterprise ML product
  • Have a bachelor’s degree or equivalent in a related field to the offered position requirements
Job Responsibility
Job Responsibility
  • Build and maintain backend services to support LLM integration, inference orchestration, and data flow
  • Write clean, reliable Python code for experimentation, model integration, and production systems
  • Collaborate closely with ML researchers to rapidly iterate on product ideas and deploy features
  • Design and implement infrastructure to handle scalable inference workloads and enterprise-level use cases
  • Own system components and ensure reliability, observability, and maintainability from day one
What we offer
What we offer
  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support of country-specific visa needs for international employees living in the Bay Area
  • Competitive stock options
Read More
Arrow Right

Member of Technical Staff, AI Training Infrastructure

As a Training Infrastructure Engineer, you'll design, build, and optimize the in...
Location
Location
United States , San Mateo
Salary
Salary:
175000.00 - 220000.00 USD / Year
fireworks.ai Logo
Fireworks AI
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience
  • 3+ years of experience with distributed systems and ML infrastructure
  • Experience with PyTorch
  • Proficiency in cloud platforms (AWS, GCP, Azure)
  • Experience with containerization, orchestration (Kubernetes, Docker)
  • Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for large-scale model training workloads
  • Develop and maintain distributed training pipelines for LLMs and multimodal models
  • Optimize training performance across multiple GPUs, nodes, and data centers
  • Implement monitoring, logging, and debugging tools for training operations
  • Architect and maintain data storage solutions for large-scale training datasets
  • Automate infrastructure provisioning, scaling, and orchestration for model training
  • Collaborate with researchers to implement and optimize training methodologies
  • Analyze and improve efficiency, scalability, and cost-effectiveness of training systems
  • Troubleshoot complex performance issues in distributed training environments
What we offer
What we offer
  • meaningful equity in a fast-growing startup
  • comprehensive benefits package
  • Fulltime
Read More
Arrow Right

Member of Technical Staff, Cloud Infrastructure

As a Software Engineer on our Cloud Infrastructure team, you'll be at the forefr...
Location
Location
United States , New York, NY; San Mateo, CA; Redwood City, CA
Salary
Salary:
175000.00 - 220000.00 USD / Year
fireworks.ai Logo
Fireworks AI
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
  • 5+ years of experience designing and building backend infrastructure in cloud environments (e.g., AWS, GCP, Azure)
  • Proven experience in ML infrastructure and tooling (e.g., PyTorch, TensorFlow, Vertex AI, SageMaker, Kubernetes, etc.)
  • Strong software development skills in languages like Python, or C++
  • Deep understanding of distributed systems fundamentals: scheduling, orchestration, storage, networking, and compute optimization
Job Responsibility
Job Responsibility
  • Architect and build scalable, resilient, and high-performance backend infrastructure to support distributed training, inference, and data processing pipelines
  • Lead technical design discussions, mentor other engineers, and establish best practices for building and operating large-scale ML infrastructure
  • Design and implement core backend services (e.g., job schedulers, resource managers, autoscalers, model serving layers) with a focus on efficiency and low latency
  • Drive infrastructure optimization initiatives, including compute cost reduction, storage lifecycle management, and network performance tuning
  • Collaborate cross-functionally with ML, DevOps, and product teams to translate research and product needs into robust infrastructure solutions
  • Continuously evaluate and integrate cloud-native and open-source technologies (e.g., Kubernetes, Ray, Kubeflow, MLFlow) to enhance our platform’s capabilities and reliability
  • Own end-to-end systems from design to deployment and observability, with a strong emphasis on reliability, fault tolerance, and operational excellence
What we offer
What we offer
  • Meaningful equity in a fast-growing startup
  • Competitive salary
  • Comprehensive benefits package
  • Fulltime
Read More
Arrow Right
New

Systems Analyst 3 - Data Engineer

Sammons Financial Group is seeking a Systems Analyst – Data Engineer to design, ...
Location
Location
United States , Sioux Falls; West Des Moines; Chicago
Salary
Salary:
82654.00 - 172197.00 USD / Year
sammonsfinancialgroup.com Logo
Sammons Financial Group
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • College Degree in the field of computer science, information science, management information systems Preferred
  • Minimum 8 years' IT development experience or equivalent Preferred
  • Effective verbal and written communications skills and the ability to communicate with business partners and other IT staff
  • Problem solving skills sufficient to perform research and recommend a proposed solution to problems
  • Able to work on multiple tasks and meet established deadlines
  • Able to effectively direct and coordinate the work of other team members on a project without having HR management responsibility for them
  • Knowledge of computer programming languages as required for the system
  • Criminal background check required
Job Responsibility
Job Responsibility
  • Design, develop, and implement scalable data ingestion, integration, and processing pipelines across cloud platforms (Azure, Snowflake/ and similar EDW/Lakehouse platforms , AWS)
  • Develop and manage data orchestration workflows using tools such as Azure Data Factory (ADF), Azure Data Lake (ADLS), dbt, and comparable technologies
  • Ingest and process large volumes of structured, semi-structured, and unstructured data, including compressed formats (e.g., .tar), and automate extraction, transformation, and loading processes
  • Design and implement modern data lakehouse architectures, including Iceberg (or similar table formats), to support scalable and high-performance analytics
  • Develop and maintain data models that accurately represent complex relationships within life insurance and policy administration domains
  • Integrate enterprise data platforms with internal and external systems (e.g., APIs, Kafka, MuleSoft) to enable real-time and batch data exchange
  • Collaborate with product owners, architects, analysts, and developers to translate business, functional, and non-functional requirements into scalable technical solutions
  • Establish and enforce data engineering standards, best practices, and governance controls across ingestion, transformation, and storage layers
  • Implement data quality validation, reconciliation processes, and error handling to ensure accuracy, consistency, and reliability of data pipelines
  • Monitor pipeline performance, reliability, scalability, and cost efficiency
What we offer
What we offer
  • Comprehensive health coverage for you and your family, including Medical, Dental, Vision, HSA & FSA options, and term life insurance
  • Competitive compensation with a performance-based incentive program tied to clear goals and individual and/or company success
  • Invest in your future with our 100% company-funded Employee Stock Ownership Plan (ESOP), plus automatic enrollment in our 401(k)
  • Work–life balance that means something. Friday afternoons off year-round, generous paid time off, and paid holidays
  • Commit to your growth with paid development time, tuition reimbursement, and professional development opportunities across industry, individual, and leadership programs
  • Make an impact beyond the workplace through volunteer time off, and our company nonprofit matching gift program, supporting the causes that matter most to you
  • An ownership culture that inspires
  • join a connected, values-driven workplace where employees take accountability, support one another, and are empowered to do their best work—together shaping our future shared success
  • Fulltime
Read More
Arrow Right

Staff Software Engineer - AI/ML Platform

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
Location
United States , Chevy Chase; New York City; Palo Alto
Salary
Salary:
115000.00 USD / Year
geico.com Logo
Geico
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in computer science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python
  • strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation
  • a 401K savings plan vested from day one that offers a 6% match
  • performance and recognition-based incentives
  • and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility- We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime
Read More
Arrow Right

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
Location
United States , Chevy Chase; New York City; Palo Alto
Salary
Salary:
115000.00 USD / Year
geico.com Logo
Geico
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in computer science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python
  • strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation
  • a 401K savings plan vested from day one that offers a 6% match
  • performance and recognition-based incentives
  • and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility- We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime
Read More
Arrow Right

Staff Software Engineer - AI/ML Infra

GEICO AI platform and Infrastructure team is seeking an exceptional Senior ML Pl...
Location
Location
United States , Palo Alto
Salary
Salary:
90000.00 USD / Year
geico.com Logo
Geico
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Bachelor’s degree in computer science, Engineering, or related technical field (or equivalent experience)
  • 8+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 3+ years of hands-on experience with machine learning infrastructure and deployment at scale
  • 2+ years of experience working with Large Language Models and transformer architectures
  • Proficient in Python
  • strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes including custom operators, helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
Job Responsibility
Job Responsibility
  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through robust monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or specialized use cases
What we offer
What we offer
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation
  • a 401K savings plan vested from day one that offers a 6% match
  • performance and recognition-based incentives
  • and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Supports flexibility- We provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
  • Fulltime
Read More
Arrow Right

Data Scientist

The Data Scientist plays a pivotal role in planning, executing, and delivering m...
Location
Location
United States , Camden
Salary
Salary:
Not provided
nttdata.com Logo
NTT DATA
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Master’s, or PhD in Computer Science, Data Science, Engineering, Statistics, Applied Mathematics, Operations Research, or a related quantitative field
  • Specialization in ML, AI, cognitive science, or data science is highly preferred
  • 3-5 years of hands-on experience planning and executing end-to-end data science projects with demonstrated impact on clinical or operational outcomes in business environments
  • Advanced programming proficiency in Python or R with strong expertise in machine learning frameworks (scikit-learn, TensorFlow, PyTorch) and statistical analysis tools
  • Expertise in machine learning and statistical techniques including supervised/unsupervised learning, deep learning, NLP, computer vision, regression models, ensemble methods, and experimental design (A/B testing)
  • Strong data engineering capabilities including SQL/NoSQL database programming, distributed computing tools (Hadoop, Spark, Kafka), data pipeline development, and experience with cloud platforms (AWS, Azure, GCP)
  • Production ML and MLOps experience including model deployment, monitoring, containerization (Docker, Kubernetes), version control, and applying DevOps principles to data science workflows
  • Data visualization and communication excellence with ability to create compelling dashboards (Tableau, Power BI), translate complex technical findings into actionable insights, and present to diverse audiences from executives to frontline staff
  • Cross-functional collaboration skills with proven ability to work in agile environments, partner with stakeholders to align technical solutions with business objectives, and mentor junior team members
  • Healthcare domain knowledge preferred, particularly experience with Epic EHR systems, clinical workflows, and healthcare data standards, along with relevant certifications (Clarity /Caboodle, Google Cloud ML Engineer, AWS ML Specialist)
Job Responsibility
Job Responsibility
  • Collect, clean, and analyze datasets from diverse internal and external sources, applying advanced data wrangling techniques
  • Acquire access to various databases and source systems (SQL, NoSQL, graph databases) and create data pipelines
  • Apply statistical analysis and visualization techniques to explore and prepare data
  • Design, develop, and validate machine learning, statistical, and optimization models
  • Select appropriate algorithms and models for AI/ML and test them for accuracy, robustness, and fairness
  • Perform feature selection and engineering
  • Integrate domain knowledge into ML solutions
  • Conduct controlled experiments (A/B and multivariate testing)
  • Collaborate with MLOps, data engineers, and IT to evaluate deployment options
  • Continuously monitor execution and health of production ML models
  • Fulltime
Read More
Arrow Right