Senior Network Observability Engineer

Marriott Bonvoy

Location:
United States, Bethesda

Contract Type:
Not provided

Salary:

48.26 - 82.54 USD / Hour
Job offer has expired

Job Description:

The Senior Network Observability Engineer, Network Reliability Engineering (NRE), is the subject matter expert responsible for designing and implementing the network observability strategy and platforms for next-generation operations and engineering. The role covers all Marriott International (MI) networks, including property, data center, corporate, client, and multi-cloud environments, and aims to turn them into a proactive, telemetry-driven ecosystem.

Job Responsibility:

  • Design and implement the network observability strategy and platforms for next-generation operations and engineering across all Marriott International (MI) networks
  • Architect solutions that leverage AI/ML-driven insights, real-time telemetry, and automation frameworks to predict, prevent, and resolve network issues
  • Collaborate with network architects, product owners, and global operations teams to define and enforce observability standards, build automation pipelines, and deliver actionable intelligence
  • Bridge traditional NOC practices with modern AI, ML, and SRE methodologies

Requirements:

  • Bachelor’s degree in Computer Science, Network Engineering, or a related discipline
  • Advanced certifications (CCNP, AWS and Azure Networking Specialty) strongly preferred
  • 8+ years of progressive experience in network observability, telemetry engineering, and performance optimization for large-scale, mission-critical environments
  • Proven expertise in collecting, processing, and correlating telemetry data (NetFlow, IPFIX, SNMP, streaming telemetry) to enable predictive analytics and proactive incident prevention
  • Hands-on experience with enterprise-grade observability, SaaS, and security platforms, including Selector.ai, NetScout, NetBrain, ThousandEyes, BigPanda, and other AI/ML-driven monitoring solutions
  • Demonstrated ability to install, configure, and optimize observability tools, integrate APIs, and build automation workflows for anomaly detection and remediation
  • Strong proficiency in administration of network tools and policy enforcement, including role-based access control and compliance frameworks
  • Expertise in developing observability requirements, architecture designs, and implementation roadmaps, ensuring alignment with SRE principles and Agile delivery models
  • Deep understanding of foundational networking protocols and technologies (ARP, TCP/IP, UDP, DHCP, DNS, NAT) and advanced routing protocols (OSPF, BGP)
  • Hands-on experience with Palo Alto Networks, Prisma, and SD-WAN Strata Cloud Manager, as well as routing and switching platforms (Cisco, Juniper, HP/Aruba)
  • Demonstrated experience in delivering written documents, including detailed network solutions and architecture diagrams
  • Experience with one or more Cloud Computing platforms (Amazon AWS, Microsoft Azure, Google Cloud Platform)
  • Experience in Agile and DevOps practices, including sprint planning, backlog grooming, and embedding observability into CI/CD pipelines
  • Ability to design custom dashboards, KPIs, and alerting strategies for real-time visibility and executive reporting

Nice to have:

  • Advanced Degree (MS, PhD) in Computer Science, Network Engineering, or MBA with a technology focus
  • Experience managing network observability tools in hospitality or global enterprise environments
  • Proficiency in leveraging public APIs for automation and integration with observability platforms
  • Strong ability to collaborate across cross-functional teams in multiple time zones, driving alignment and execution
  • Demonstrated experience in researching emerging technologies, standards, and trends and translating them into actionable roadmaps
  • Deep knowledge of next-generation observability tools and frameworks, including Selector.ai, NetScout, NetBrain, ThousandEyes, and AI Ops platforms
  • Proven ability to design and implement automation for network instrumentation and monitoring, using scripting languages and APIs (Python, REST APIs)
  • Excellent problem-solving skills, capable of working independently and leading outcomes for distributed teams
  • Strong understanding of change management, testing methodologies, and high-availability strategies for critical platforms
  • Ability to manage multiple priorities effectively, with exceptional attention to detail
  • Track record of driving transformation in network technologies and observability practices through data-driven continuous improvement
  • Experience improving reliability, performance, and agility of complex enterprise networks
  • Expertise in network infrastructure automation, instrumentation, and emerging observability technologies
  • Strong influencing and leadership skills to overcome barriers and drive organizational change
  • Exceptional verbal and written communication skills, including executive-level presentations and technical documentation
What we offer:
  • 401(k) plan
  • stock purchase plan
  • discounts at Marriott properties
  • commuter benefits
  • employee assistance plan
  • childcare discounts
  • medical, dental, vision coverage
  • health care flexible spending account
  • dependent care flexible spending account
  • life insurance
  • disability insurance
  • accident insurance
  • adoption expense reimbursements
  • paid parental leave
  • paid sick leave
  • PTO
  • holidays

Additional Information:

Job Posted:
April 17, 2026

Expiration:
April 21, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Senior Network Observability Engineer

Senior DevOps Engineer (Observability)

You will enable our machine learning team, data engineers, and applications team...
Location:
United States, New York
Salary:
180000.00 - 225000.00 USD / Year
EvolutionIQ
Expiration Date
Until further notice
Requirements
  • 7+ years of DevOps experience
  • Extensive experience designing and running production systems on GCP
  • Deep exposure to and familiarity with networking concepts, Kubernetes clusters, Docker, containerized development, Terraform, Helm, Dagster (DE), and ArgoCD
  • Experience with production operations and working with product engineering teams
  • Experience integrating with SIEM and security software, such as vulnerability scanners
  • You know the critical questions to ask in order to understand a client’s business problem and can show the business impact of your technical solutions
  • Team player who is solutions-oriented
  • You have crisp written and verbal communication skills
Job Responsibility
  • Improve and further our observability stack across GCP infrastructure and applications
  • Drive consistency and operational excellence across all teams
  • Enable the data engineering team to use Dagster efficiently
  • Leverage tools like Terraform, GitHub Actions, Helm, and ArgoCD to build efficient infrastructure-as-code pipelines
  • Ensure industry standard security controls in our cloud environments
  • Institute a culture of reliability in a federated ownership environment
What we offer
  • Medical, dental, vision, short & long-term disability, life insurance and AD&D, and 401k matching
  • Additional family, wellness, and pet benefits
  • Paid time off and sick leave, 100% paid parental leave (16 weeks for primary caregivers and 12 weeks for secondary caregivers)
  • We offer a flexible schedule for new parents returning to work
  • Catered lunches, happy hours, pet-friendly spaces, and monthly technology stipend
  • $1,000/year for each employee for professional development, as well as opportunities for tuition reimbursement
  • An annual bonus plan and company equity plan (RSUs) are also included in our compensation package
Employment Type:
Full-time

Senior Information Technology Engineer

The IT Systems Engineer is responsible for architecting, securing, and scaling L...
Location:
India, Pune
Salary:
Not provided
LogicMonitor
Expiration Date
Until further notice
Requirements
  • 5+ years of IT experience in a global high-tech environment
  • 5+ years of hands-on networking experience (enterprise/global scale)
  • Strong experience managing Cisco switches, Fortinet firewalls, VPNs, and wireless infrastructure
  • Demonstrated experience with Zero-Trust Network Architecture (ZTNA), Secure Web Gateways, and CASB (preferred: Cloudflare)
  • Proficiency with Terraform for Infrastructure-as-Code and familiarity with GitOps practices
  • Strong understanding of networking in cloud environments (AWS, GCP, Azure)
  • Familiarity with FedRAMP/GovCloud requirements preferred
  • Experience using AI tools to enhance productivity, innovation, or problem-solving
  • Solid Linux systems experience
  • macOS networking and certificate compatibility knowledge required
Job Responsibility
  • Own Cloudflare ZTNA and Secure Web Gateway end-to-end: design, policy enforcement, monitoring, troubleshooting, and Terraform-based configuration
  • Handle multiple instances of Cloudflare ZTNA, covering commercial and government infrastructure
  • Ensure compatibility and reliability of certificates and macOS networking with SWG/Zero-Trust controls
  • Architect and administer global networking across offices, data centers, and multi-cloud (AWS, GCP, Azure) environments
  • Manage Cisco switches, Fortinet firewalls, VPNs, Wi-Fi, and global remote access infrastructure
  • Implement Infrastructure-as-Code practices with Terraform and support GitOps workflows
  • Deliver and maintain network observability dashboards, SLAs, and uptime reporting using LogicMonitor
  • Partner with Security and Technical Operations to maintain compliance in both commercial and FedRAMP environments
  • Ability to work within an on-call rotation schedule and be available after hours for specialized support
  • Proactively identify opportunities for AI-driven automation within IT operations and quietly deliver solutions that reduce manual workloads

Senior Principal Backend Engineer

As an Observability Architect for the Platform Engineering team, you will collab...
Salary:
Not provided
Atlassian
Expiration Date
Until further notice
Requirements
  • Previous experience building and managing large-scale telemetry systems, OpenTelemetry (OTel), and time-series databases (TSDB)
  • Previous experience building large scale data ingestion pipelines
  • Software development in Java, Python
  • Serious analytical skills across different levels of the stack: network, .NET/Java, operating system
Job Responsibility
  • Regularly tackle the largest and most complex problems on the team, from technical design to solution
  • Deliver solutions that are used by other teams and products
  • Determine plans-of-attack on large projects
  • Routinely tackle complex architecture challenges and apply architectural standards and start using them on new projects
  • Lead code reviews and documentation, as well as take on complex bug fixes, especially on high-risk problems
  • Set the standard for thorough, meaningful code reviews
  • Partner across Engineering teams to take on company-wide initiatives spanning multiple projects
  • Transfer your depth of knowledge from your current language to excel as a Software Engineer
  • Mentor more junior members of the team
What we offer
  • Health and wellbeing resources
  • Paid volunteer days

Senior Performance Engineer

Be part of a team where your work takes center stage, shaping the future of soft...
Location:
India, Bangalore
Salary:
Not provided
JFrog
Expiration Date
Until further notice
Requirements
  • 6+ years in software quality, test automation, and performance engineering
  • Strong coding skills in Python or Java and proficiency in Bash
  • Hands-on expertise with Kubernetes and a deep understanding of the Linux operating system
  • A solid background in performance testing tools such as JMeter, k6, Locust, Gatling, and BlazeMeter
  • A deep understanding of distributed systems, networking principles, and database architecture
  • Excellent written and verbal communication skills, with a specific ability to translate raw performance numbers into narratives that articulate customer value and inform executive strategy
  • Full Professional English Proficiency is paramount for this position
Job Responsibility
  • Lead Performance Engineering Initiatives: Design and execute comprehensive performance tests (Load, Stress, etc.) using tools like BlazeMeter, Gatling, Locust, and k6
  • Troubleshoot and Optimize: Investigate performance issues and bugs by analyzing system bottlenecks with observability tools and logs, then collaborate with engineering teams to address them
  • Automate and Integrate: Build and maintain performance test infrastructure, integrating all automation and tests into our CI/CD pipelines using GitHub Actions, Jenkins, or equivalent tools
  • Enhance Observability: Build observability into the entire testing lifecycle using tools like Grafana, Prometheus, and Datadog to create performance reports and establish clear benchmarks
  • Communicate and Evangelize: Publish test results, architectural recommendations, and best practices through internal reports, presentations, external blogs, and official documentation

Senior Platform/DevOps Engineer

Koddi is looking for a Senior Platform/DevOps engineer focused on delivery. You'...
Location:
United States, Fort Worth, Texas
Salary:
Not provided
Koddi
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in a DevOps/Platform role
  • Strong experience with cloud technologies: cloud computing (EC2, VMs, etc.), cloud storage (S3, BLOB, etc.), container services (ECS, etc.), Kubernetes services (EKS, etc.), IAM, VPCs
  • Exceptional communicator
  • Proven habit of turning ambiguous work into milestone plans
  • Experience running daily stand-ups or async status
  • Coaching/coordination without authority
  • Understanding of system administration in Linux (and possibly Windows) environments
  • Proficiency with monitoring and observability tools (e.g., Datadog, PagerDuty, CloudWatch)
  • Proficiency with Bash and Python
  • Proficiency with infrastructure-as-code (e.g., Terraform, CloudFormation)
Job Responsibility
  • Design, implement, and maintain scalable, secure, and reliable cloud infrastructure for our SaaS platform
  • Create and maintain daily/weekly milestones with partners; drive progress and surface risks with concise written updates
  • Run lightweight standups or async check-ins; track status in Jira with clear acceptance criteria
  • Collaborate with software engineering teams to ensure smooth deployment and operation of services
  • Maintain critical applications on cloud-native microservices architecture
  • Implement automation, effective monitoring, and infrastructure-as-code
  • Manage our continuous integration and delivery pipeline to maximize efficiency
  • Iterate on best practices to increase the quality and velocity of deployments

Senior Logging & Detection Engineer

We are currently seeking a Senior Logging & Detection Engineer to lead the techn...
Location:
Canada, Vancouver; Calgary; Toronto
Salary:
146200.00 - 197800.00 CAD / Year
Clio
Expiration Date
Until further notice
Requirements
  • Senior-level expertise building and scaling enterprise-grade detection capabilities and security monitoring systems
  • Expert-level query language proficiency in at least two of the following: Elasticsearch/Lucene, SQL, KQL (Kusto), or SPL (Splunk), demonstrating advanced optimization techniques
  • Extensive Detection Engineering experience owning the full lifecycle of rules, alerts, and automated response workflows within a SIEM/SOAR environment
  • Advanced log analysis skills across diverse, large-scale data sources, including multi-cloud logs (AWS, Azure, GCP), network flows, and advanced security tool outputs
  • Deep dashboard and visualization expertise with tools like Kibana, Grafana, or Tableau, specifically for security metrics and executive reporting
  • Proven expertise in leading threat hunting efforts using log data to proactively identify and track sophisticated threats and anomalous behavior across the environment
  • Senior-level scripting and automation abilities (Python/Go/PowerShell), used to build custom tools, manage APIs, and drive detection automation at scale
  • Architectural experience integrating and optimizing SIEM platforms, SOAR tools, and security orchestration systems
  • Expert performance optimization skills covering query tuning, index design, data partitioning, and overall resource-efficient analytics on big data
  • Significant incident response experience providing expert-level technical analysis and forensic support during major security incidents
Job Responsibility
  • Lead the design and implementation of sophisticated, production-ready detection rules and queries across the ELK stack, security data lakes, and multi-cloud logging platforms
  • Architect and optimize complex search queries, aggregations, and analytics dashboards for high-velocity security monitoring, focusing on performance and cost efficiency
  • Design and build automated detection and response workflows (SOAR), ensuring seamless and reliable integration with critical incident response systems
  • Serve as the primary liaison with the threat intelligence team, developing and owning the framework to translate intelligence into scalable, actionable detection capabilities (e.g., MITRE ATT&CK coverage)
  • Establish and maintain a robust detection rule library, query templates, and lead the creation of security analytics playbooks for the wider team
  • Drive performance optimization and resource utilization strategies across petabyte-scale log datasets, including index design and data tiering
  • Develop and standardize custom visualizations, dashboards, and executive reporting capabilities for security stakeholders
  • Lead complex threat hunting operations, mentor junior team members on investigative techniques, and proactively refine detection logic to achieve near-zero false positive rates
  • Collaborate closely with the platform team to define the logging architecture roadmap based on future detection requirements and security observability goals
  • Proactively research emerging threats and attack patterns, translating novel techniques into strategic, forward-looking detection logic and advising security leadership
What we offer
  • Top-tier health benefits, dental, and vision insurance
  • Hybrid work environment
  • Flexible time off policy, with an encouraged 20 days off per year
  • $2000 annual counseling benefit
  • RRSP matching and RESP contribution
  • Clioversary recognition program with special acknowledgement at 3, 5, 7, and 10 years
Employment Type:
Full-time

Senior Staff Software Engineer, Cloud Proxy

We are seeking a Senior Staff Engineer in Temporal's Cloud Global Services team ...
Location:
United States
Salary:
230000.00 - 290000.00 USD / Year
Temporal
Expiration Date
Until further notice
Requirements
  • Proven experience architecting and delivering high-availability, security-critical networking or proxy systems
  • Deep understanding of authentication/authorization patterns: OIDC (OpenID Connect on top of OAuth), mTLS, JWT (JSON Web Token), and custom identity integrations
  • Expertise in data encryption at rest and in transit, including envelope encryption and key management
  • Strong proficiency in Go or a comparable systems programming language
  • Familiarity with distributed systems, RPC frameworks (gRPC), and cloud networking patterns
  • Track record of leading complex, multi-team technical initiatives to successful delivery
  • Ability to navigate ambiguity, define vision, and create alignment
  • Experience influencing technical direction across organizational boundaries
Job Responsibility
  • Define and drive the architecture for a unified, pluggable proxy framework
  • Establish technical standards for authentication, authorization, encryption, and observability across proxy implementations
  • Evaluate and integrate existing customer-built, S2S, and Cloud Auth proxies into a single supported solution
  • Translate high-level business and security requirements into technical designs
  • Ensure proxy meets Tier 0 workload reliability, security, and performance standards
  • Partner with Product, Security, and Customer Success to align roadmap with customer needs
  • Work closely with Infra Foundations, Security, OSS Server, and CGS teams
  • Engage directly with strategic customers to understand and incorporate their requirements
  • Mentor other engineers on distributed systems architecture, networking, and security
  • Drive the open-source development model, ensuring code quality, documentation, and extensibility
What we offer
  • Unlimited PTO, 12 Holidays + 2 Floating Holidays
  • 100% Premiums Coverage for Medical, Dental, and Vision
  • AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
  • Empower 401K Plan
  • Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more
  • $3,600 / Year Work from Home Meals
  • $1,500 / Year Career Development & Learning
  • $1,200 / Year Lifestyle Spending Account
  • $1,000 / Year In-Home Office Setup (In addition to Temporal issued equipment)
  • $500 / Year Professional Memberships
Employment Type:
Full-time

Senior Software Engineer - UltraGrid

This is an excellent opportunity for god-tier engineers to join a very experienc...
Location:
United Kingdom, London
Salary:
Not provided
Hypervolt Limited
Expiration Date
Until further notice
Requirements
  • 10+ years of professional software development experience, with a strong focus on architecting, optimizing, and delivering performance-critical systems
  • Deep expertise in Java, Scala and the JDK, leveraging the Java ecosystem for high-performance applications. Proficiency in Rust is a bonus
  • Experience with NixOS is considered a huge plus
  • Proven ability to diagnose, profile, and optimize complex systems using advanced performance analysis tools and methodologies
  • Demonstrated experience in tuning multi-threaded and parallel computing environments, managing concurrency, and applying lock-free designs for efficient resource utilization
  • Familiarity with performance engineering technologies and low-cost, always-on profiling, metrics, and observability
  • Extensive understanding of foundational computer science principles, data structures, and algorithms
  • Extensive understanding of networking and fundamental building blocks of the Internet
  • Firm grasp of distributed consensus algorithms and their practical applications in building scalable, reliable systems
  • Exceptional analytical skills to identify and resolve intricate performance bottlenecks in production-level systems
Job Responsibility
  • Working on streaming, networking, storage, and other facets of the system, with an extreme focus on cost and performance
What we offer
  • Competitive Compensation
  • Stock options
  • Comprehensive Coverage: Health, dental, and vision plans, plus wellness and mental health support
  • Work-Life Flexibility
  • Additional Perks: We'll buy you a laptop. Whichever one you want
  • Innovative Environment: A culture like no other. Work with peers and people who truly value exceptionally good software