Staff Observability Data Infrastructure Engineer

United States, Work at Home, Maryland · Employment contract · 130295.00 - 260590.00 USD / Year · Job Posted April 24, 2026

Job Description

CVS Health is seeking a highly skilled Observability Data Infrastructure Engineer to join our Enterprise Observability and Security Engineering team. This individual contributor role is responsible for building, scaling, and operationalizing an enterprise Observability Lakehouse that enables threat detection, incident response, and platform visibility across hybrid and multi-cloud environments.

Job Responsibility

  • Design, build, and operate high-volume log, metric, and trace pipelines using Databricks, cloud data lakes, and distributed processing engines
  • Architect and evolve an Observability Lakehouse aligned with OpenTelemetry (OTEL) data models and standards
  • Implement ingestion and transformation workflows using technologies such as Cribl, Vector, Jenkins, GitHub Actions, or equivalent tools
  • Normalize, model, and enrich telemetry data to support detection engineering, forensics, and operational analytics
  • Develop scalable ETL/ELT frameworks, Delta Lake architectures, and automated data quality validation for unstructured and semi-structured data
  • Partner with Security Engineering, SRE, Cloud, and SOC teams to improve enterprise visibility and detection accuracy
  • Build and maintain CI/CD pipelines and reusable Infrastructure-as-Code (IaC) patterns for observability platform deployment
  • Identify and resolve performance, latency, cost, and reliability issues across telemetry pipelines
  • Contribute to engineering standards, documentation, and knowledge sharing across observability and security platforms

Requirements

  • 7+ years of experience building and operating log, metric, and trace pipelines in Data, Security Data, or Observability Engineering roles
  • 5+ years of hands-on experience with Databricks, Apache Spark, or other large-scale distributed data platforms
  • 5+ years of experience working across cloud platforms (AWS, Azure, or GCP), including storage, compute, and event-driven services
  • 5+ years of production experience using SQL and Python in data-intensive environments
  • 3+ years of experience with enterprise observability platforms (Splunk, Datadog, Elastic, or equivalent)
  • 3+ years of experience with high-throughput ingestion and streaming technologies such as Cribl, Vector, or Kafka
  • 3+ years of experience designing telemetry systems aligned to OpenTelemetry (OTEL) or similar standards
  • Bachelor's degree from an accredited university or equivalent work experience (HS diploma + 4 years relevant experience)

Nice to have

  • Background supporting SIEM/SOAR platforms, detection engineering, or threat analytics
  • Familiarity with Delta Lake, Unity Catalog, metadata management, and data lineage
  • Understanding of security governance, auditing, access controls, and sensitive data handling
  • Hands-on experience with Infrastructure as Code (Terraform, ARM/Bicep, CloudFormation)
  • Familiarity with cloud-native compute and orchestration services (Azure Functions, AWS Lambda, GCP Cloud Functions, Kubernetes)
  • Strong communication skills with the ability to engage both engineering teams and senior stakeholders
  • Demonstrated passion for observability, security, reliability, and continuous learning

What we offer

  • Medical, dental, and vision coverage
  • Paid time off
  • Retirement savings options
  • Wellness programs
  • Bonus, commission or short-term incentive program
  • Equity award program

Similar Jobs for Staff Observability Data Infrastructure Engineer

8 matching positions

Staff Data Engineer - Vehicle Telemetry and Data Infrastructure

We are looking for a Staff Data Engineer to own the telemetry data platform for ...
Location: United States, Palo Alto
Salary: 230000.00 - 250000.00 USD / Year
ALSO (ridealso.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in data engineering and/or backend platform engineering operating production systems at scale
  • Deep hands-on experience with large-scale telemetry or IoT data, including high-throughput and low-latency ingestion
  • Strong expertise in AWS data and infrastructure services (S3, Kinesis/MSK, Glue, EMR, Lambda, Step Functions, EventBridge)
  • Proven experience owning end-to-end ETL/ELT infrastructure using Spark/PySpark (batch and streaming) on Databricks or EMR
  • Solid understanding of streaming architectures using Kafka or equivalent systems and time-series–optimized storage patterns
  • Strong backend engineering skills using Python and/or Java/Scala, including API design (REST/gRPC) and distributed systems fundamentals
  • Experience with data platform architectures such as data lakes and lakehouses, schema registries, and metadata systems
  • Hands-on experience with orchestration frameworks (Airflow, MWAA, Dagster) and production-grade observability (logging, metrics, tracing)
  • Infrastructure-as-code expertise using CloudFormation, Terraform, or CDK to manage scalable and reliable systems
  • A track record of building highly reliable, fault-tolerant systems with clear ownership, strong SLAs, and operational excellence
Job Responsibility
  • Design and own large-scale ingestion pipelines for vehicle telemetry data (events, metrics, time-series) with high throughput and low latency
  • Architect and operate end-to-end ETL/ELT systems from raw ingestion to warehouse/lake consumption
  • Define schema evolution, versioning, and backward-compatibility strategies for telemetry data at scale
  • Build safe and repeatable backfill, replay, and reprocessing mechanisms for historical and real-time data
  • Design data storage and lifecycle strategies across hot, warm, and cold paths to balance cost and performance
  • Develop fault-tolerant, observable, and debuggable pipelines with strong SLAs around freshness, completeness, and latency
  • Implement backend services and APIs for telemetry ingestion, configuration management, metadata, and orchestration
  • Apply strong software engineering practices including object-oriented design, automated testing, CI/CD, and code reviews
  • Establish automated data quality checks, anomaly detection, alerting, lineage, and auditability across the platform
  • Provide technical leadership by setting platform direction, reviewing designs, mentoring engineers, and influencing product and engineering roadmaps
What we offer
  • Robust health coverage. Excellent health, dental and vision insurance covered up to 100% by ALSO with FSA & HSA options
  • One Medical membership and dedicated insurance advocates
  • Rich fertility and family building benefits with Progyny
  • Flexible time off
  • 401(k) match
  • Fulltime

Senior Staff Data Engineer - Data Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Adevinta (adevinta.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of hands-on experience in Software Development or Data Engineering
  • 5+ years specifically focused on building Data Platforms
  • Deep understanding of how Platform infra supports Analytics workloads
  • Proven experience evolving complex platforms from legacy patterns to modern, cloud-native solutions
  • Deep knowledge of Spark internals, JVM tuning, and performance optimization for high-scale batch and streaming datasets
  • Deep expertise in Unity Catalog, Delta Lake internals, and optimizing high-volume workloads
  • Strict software engineering discipline (CI/CD, Testing, OOP) applied to data pipelines
  • Understanding of microservices architecture
  • Understanding of the needs of Analytics/DWH teams (Data Modeling, dbt)
  • Strong background in building automated pipelines using Terraform/Terragrunt and ensuring system observability
Job Responsibility
  • Lead the evolution of our Data Platform and architect the "Data Exchange" strategy
  • Define robust patterns for API-based ingestion, Event-Driven Architectures (Kafka), and Reverse ETL
  • Ensure architectures are optimized for cost and performance on AWS
  • Act as a catalyst for technical evolution
  • Constantly scan the horizon for next-generation technologies
  • Lead the implementation of new paradigms
  • Design the strategy for Unity Catalog implementation and Data Contracts
  • Champion FinOps, automating cost controls for our highest-volume workloads
  • Build the underlying infrastructure that allows Analytics/DWH teams to run efficient transformations
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
  • A collaborative environment with an opportunity to explore your potential and grow
  • A range of locally relevant benefits
  • Fulltime

Senior Staff Data Engineer - ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Adevinta (adevinta.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
  • Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
  • Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
  • Strong background in building automated pipelines and ensuring system observability
  • Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
  • Solid experience serving models at low latency and high concurrency using containerized solutions
  • Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
  • Expert level Python
  • Experience with PyTorch, Terraform, Terragrunt, Docker, Kubernetes, GitHub Actions, Datadog
  • Experience with Databricks AI Stack: MLflow, Mosaic AI, Unity Catalog, Feature Store, Databricks Model Serving, Vector Databases
Job Responsibility
  • Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
  • Act as a first mover: validate new Databricks features and integrate them into the platform
  • Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
  • Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
  • Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
  • Fulltime

Senior Staff Data Engineer - Agentic AI

As a Senior Staff Data Engineer – Agentic AI, you will operate as a senior indiv...
Location: India, Bengaluru Urban; Chennai
Salary: Not provided
Amex (americanexpress.com)
Expiration Date: Until further notice
Requirements
  • 12+ years of experience building large-scale distributed systems
  • Strong experience with LLM systems, agentic workflows or advanced ML infrastructure
  • Proven ownership of complex, cross-cutting agentic systems spanning multiple teams or products
  • Strong engineering fundamentals across backend systems, APIs, data pipelines, and cloud infrastructure
  • Deep experience across the agentic AI stack, including planning, tool use, memory, and evaluation
  • Fluency with AI-assisted and agentic development workflows
  • Comfort operating in ambiguous problem spaces and translating them into shipped, reliable autonomous systems
  • Ability to influence technical direction and align teams without formal authority
  • Experience in workflow engines, async processing, queues, and streaming systems
Job Responsibility
  • Drive technical direction for agentic AI initiatives, influencing architecture patterns, autonomy boundaries, and system design
  • Design, build, and operate production-grade agentic AI systems used across multiple products
  • Own and evolve shared agentic AI capabilities, including: agent frameworks and orchestration layers; planning, tool use, and memory strategies; retrieval and grounding (RAG) pipelines; LLM infrastructure, inference, and model gateways; and evaluation, observability, and safety tooling for autonomous systems
  • Lead technical design reviews and help teams navigate tradeoffs involving autonomy, safety, reliability, scalability, and cost
  • Partner across teams to deliver complex, cross-cutting agentic AI initiatives from concept to production
  • Evaluate emerging models, techniques, and agentic patterns and translate them into practical, enterprise-ready improvements
  • Mentor senior engineers and raise the technical bar for agentic AI development through example and influence
What we offer
  • Competitive base salaries
  • Bonus incentives
  • Support for financial well-being and retirement
  • Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
  • Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
  • Generous paid parental leave policies (depending on your location)
  • Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
  • Free and confidential counseling support through our Healthy Minds program
  • Career development and training opportunities
  • Fulltime

Staff Data Engineer - Agentic AI

As a Staff Data Engineer – Agentic AI, you will operate as a senior individual c...
Location: India, Bengaluru Urban; Chennai
Salary: Not provided
Amex (americanexpress.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of experience building large-scale distributed systems
  • Strong experience with LLM systems, agentic workflows or advanced ML infrastructure
  • Proven ownership of complex, cross-cutting agentic systems spanning multiple teams or products
  • Strong engineering fundamentals across backend systems, APIs, data pipelines, and cloud infrastructure
  • Deep experience across the agentic AI stack, including planning, tool use, memory, and evaluation
  • Fluency with AI-assisted and agentic development workflows
  • Comfort operating in ambiguous problem spaces and translating them into shipped, reliable autonomous systems
  • Ability to influence technical direction and align teams without formal authority
  • Experience in workflow engines, async processing, queues, and streaming systems
Job Responsibility
  • Drive technical direction for agentic AI initiatives, influencing architecture patterns, autonomy boundaries, and system design
  • Design, build, and operate production-grade agentic AI systems used across multiple products
  • Own and evolve shared agentic AI capabilities, including: agent frameworks and orchestration layers; planning, tool use, and memory strategies; retrieval and grounding (RAG) pipelines; LLM infrastructure, inference, and model gateways; and evaluation, observability, and safety tooling for autonomous systems
  • Lead technical design reviews and help teams navigate tradeoffs involving autonomy, safety, reliability, scalability, and cost
  • Partner across teams to deliver complex, cross-cutting agentic AI initiatives from concept to production
  • Evaluate emerging models, techniques, and agentic patterns and translate them into practical, enterprise-ready improvements
  • Mentor senior engineers and raise the technical bar for agentic AI development through example and influence
What we offer
  • Competitive base salaries
  • Bonus incentives
  • Support for financial well-being and retirement
  • Comprehensive medical, dental, vision, life insurance, and disability benefits (depending on location)
  • Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
  • Generous paid parental leave policies (depending on your location)
  • Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
  • Free and confidential counseling support through our Healthy Minds program
  • Career development and training opportunities

Staff Data Engineer

A VC-backed retail AI scale-up is expanding its engineering team and is looking ...
Location: United States
Salary: Not provided
Orbis Consultants (weareorbis.com)
Expiration Date: Until further notice
Requirements
  • 5+ years in software development and data engineering with ownership of production-grade systems
  • Proven expertise in Spark (Databricks, EMR, or similar) and scaling it in production
  • Strong knowledge of distributed computing and modern data modeling approaches
  • Solid programming skills in Python, with an emphasis on clean, maintainable code
  • Hands-on experience with SQL and NoSQL databases (e.g., PostgreSQL, DynamoDB, Cassandra)
  • Excellent communicator who can influence and partner across teams
Job Responsibility
  • Design and evolve distributed, cloud-based data infrastructure that supports both real-time and batch processing at scale
  • Build high-performance data pipelines that power analytics, AI/ML workloads, and integrations with third-party platforms
  • Champion data reliability, quality, and observability, introducing automation and monitoring across pipelines
  • Collaborate closely with engineering, product, and AI teams to deliver data solutions for business-critical initiatives
What we offer
  • Fully remote
  • Great equity

Staff Data Engineer

A VC-backed retail AI scale-up is expanding its engineering team and is looking ...
Location: United States
Salary: Not provided
Orbis Consultants (weareorbis.com)
Expiration Date: Until further notice
Requirements
  • 5+ years in software development and data engineering with ownership of production-grade systems
  • Proven expertise in Spark (Databricks, EMR, or similar) and scaling it in production
  • Strong knowledge of distributed computing and modern data modeling approaches
  • Solid programming skills in Python, with an emphasis on clean, maintainable code
  • Hands-on experience with SQL and NoSQL databases (e.g., PostgreSQL, DynamoDB, Cassandra)
  • Excellent communicator who can influence and partner across teams
Job Responsibility
  • Design and evolve distributed, cloud-based data infrastructure that supports both real-time and batch processing at scale
  • Build high-performance data pipelines that power analytics, AI/ML workloads, and integrations with third-party platforms
  • Champion data reliability, quality, and observability, introducing automation and monitoring across pipelines
  • Collaborate closely with engineering, product, and AI teams to deliver data solutions for business-critical initiatives
What we offer
  • Fully remote
  • Great equity

Staff Data Engineer

We are seeking a Staff Data Engineer to architect and lead our entire data infra...
Location: United States, New York; San Francisco
Salary: 170000.00 - 210000.00 USD / Year
Taskrabbit (taskrabbit.com)
Expiration Date: Until further notice
Requirements
  • 7-10 years of experience in Data Engineering
  • Expertise in building and maintaining ELT data pipelines using modern tools such as dbt, Airflow, and Fivetran
  • Deep experience with cloud data warehouses such as Snowflake, BigQuery, or Redshift
  • Strong data modeling skills (e.g., dimensional modeling, star/snowflake schemas) to support both operational and analytical workloads
  • Proficient in SQL and at least one general-purpose programming language (e.g., Python, Java, or Scala)
  • Experience with streaming data platforms (e.g., Kafka, Kinesis, or equivalent) and real-time data processing patterns
  • Familiarity with infrastructure-as-code tools like Terraform and DevOps practices for managing data platform components
  • Hands-on experience with BI and semantic layer tools such as Looker, Mode, Tableau, or equivalent
Job Responsibility
  • Design, build, and maintain scalable, reliable data pipelines and infrastructure to support analytics, operations, and product use cases
  • Develop and evolve dbt models, semantic layers, and data marts that enable trustworthy, self-serve analytics across the business
  • Collaborate with non-technical stakeholders to deeply understand their business needs and translate them into well-defined metrics and analytical tools
  • Lead architectural decisions for our data platform, ensuring it is performant, maintainable, and aligned with future growth
  • Build and maintain data orchestration and transformation workflows using tools like Airflow, dbt, and Snowflake (or equivalent)
  • Champion data quality, documentation, and observability to ensure high trust in data across the organization
  • Mentor and guide other engineers and analysts, promoting best practices in both data engineering and analytics engineering disciplines
What we offer
  • Employer-paid health insurance
  • 401k match with immediate vesting
  • Generous and flexible time off with 2 company-wide closure weeks
  • Taskrabbit product stipends
  • Wellness + productivity + education stipends
  • IKEA discounts
  • Reproductive health support
  • Fulltime