Senior Infrastructure & Application Observability Architect

SITA

Location:
Australia, Sydney

Contract Type:
Not provided

Salary:

Not provided

Requirements:

  • At least 5 years of experience in infrastructure and application observability
  • In-depth knowledge of networking protocols
  • Industry certifications in network-related technologies
  • Experience configuring and maintaining switches – Cisco Nexus
  • Experience configuring and maintaining firewalls – Palo Alto, FortiGate, VMware NSX (or similar)
  • Experience configuring and maintaining F5 LTM and WAF (or similar)
  • Comfortable with Linux CLI commands
  • University degree or equivalent, preferably in Computer Science, Engineering, Mathematics, or similar
  • Where applicable, a recognized professional qualification is desirable
What we offer:
  • Flex Week: Work from home up to 2 days/week (depending on your team's needs)
  • Flex Day: Make your workday suit your life and plans
  • Flex-Location: Take up to 30 days a year to work from any location in the world
  • Employee Wellbeing: Employee Assistance Program (EAP), for you and your dependents 24/7, 365 days/year
  • Champion Health - a personalized platform that supports a range of wellbeing needs
  • Professional Development: Access to world-class platforms and programs
  • Competitive Benefits: Competitive benefits that make sense with both your local market and employment status

Additional Information:

Job Posted:
March 21, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Senior Infrastructure & Application Observability Architect

Senior Frontend Infrastructure Engineer

Coralogix is a modern, full-stack observability platform transforming how busine...
Location:
Israel, Ramat Gan
Salary:
Not provided
Coralogix
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in frontend development with one of the major frameworks
  • Strong expertise in architecting frontend infrastructure
  • Proficiency with tools and workflows like GitHub Actions, lint tools etc.
  • Experience with monorepo workspaces (e.g., Lerna, Nx)
  • Experience with optimizing build times and creating efficient workflows
  • Strong understanding of HTML5, CSS3, SCSS, JavaScript (ES6+), and TypeScript
  • Experience with RESTful APIs and WebSockets
  • Strong understanding of UI/UX best practices
  • Experience with unit testing and frontend testing frameworks (e.g., Jest, Playwright)
  • Strong problem-solving skills and attention to detail
Job Responsibility
  • Design and maintain scalable web applications
  • Implement and optimize design systems
  • Improve build processes for efficiency
  • Create automated workflows
  • Write clean and testable code
  • Collaborate with cross-functional teams to deliver features
  • Debug and upgrade software
  • Stay current with emerging frontend technologies
Employment Type:
Full-time

FLEX Senior Solutions Architect

Accountable for the research, analysis, design, creation and implementation of P...
Location:
United States, Bethesda
Salary:
83.17 - 101.11 USD / Hour
Marriott Bonvoy
Expiration Date
Until further notice
Requirements
  • 8+ years in an IT operational role supporting mission critical solutions or applications with 5+ years leading an infrastructure organization
  • Bachelor's Degree in IT-related field with five (5)+ years of equivalent combination of education and experience and training
  • 3+ years of experience providing operations and sustainment support for cloud infrastructure services on Amazon, Azure, or Ali cloud
  • 5+ years’ experience in any of the following: Public Clouds/Virtual Deployment using ESXi, Amazon Web Services (AWS) / EC2/EKS, Microsoft Azure, Oracle Cloud, Ali cloud, SaaS
  • Graduate degree in technical discipline
  • Strong diagnostic skills with regard to identification and classification of malicious bot traffic
  • SAFe agile delivery framework
  • Experience supporting modern operating models (Site Reliability Engineering)
  • Experience in System Engineering of servers, storage, network, etc.
  • Familiarity with large scale cloud infrastructure, including network architectures, routing, DNS, TCP/IP protocols, and SSL/TLS ciphers
Job Responsibility
Job Responsibility
  • Provides leadership, oversight, governance, and strategic direction related to Infrastructure services to enable the delivery of IT services
  • Defines the Marriott infrastructure architecture and governance model
  • Provides technical leadership, oversight, standardization, and validation of the effectiveness for the Enterprise Infrastructure environment
  • Researches, designs, and implements high-performing software components that are standards-based, highly available and secured, delivering the required business functionality
  • Educates internal and external users of the technologies to continually improve the knowledge and skill-base of the organization on how best to operate and support the infrastructure services
  • Develops documents with a focus on how services will be leveraged in the solution architecture
  • Participates in the evaluation and selection of Infrastructure based products
  • Works closely with the EA team to facilitate alignment of plans with what is being delivered
  • Institutes governance based on best practices and ensures proper alignment to projects and major initiatives
  • Leads the analysis of the current environment to detect critical deficiencies and recommends solutions for improvement
What we offer
  • bonus program
  • comprehensive health care benefits
  • 401(k) plan with up to 5% company match
  • employee stock purchase plan at 15% discount
  • accrued paid time off
  • life insurance
  • group disability insurance
  • travel discounts
  • adoption assistance
  • paid parental leave
Employment Type:
Full-time

Applications Development Senior Programmer Analyst

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • Minimum of 8 to 12 years of experience
  • Strong hands-on experience in coding (Java, Python, or any modern programming language)
  • Deep expertise in system design and microservices architecture
  • Experience with trunk-based development, feature flags, and progressive delivery strategies
  • Proficiency in TDD, BDD, and automation-first mindset to ensure high test coverage and reliability
  • Strong understanding of CI/CD pipelines and DevOps practices
  • Experience conducting code reviews, vulnerability assessments, and secure coding
  • Familiarity with modern cloud-native technologies (AWS, Kubernetes, Docker)
  • Excellent problem-solving skills and ability to work in fast-paced, agile environments
  • Strong communication and collaboration skills
Job Responsibility
  • Design, develop, and maintain robust, scalable, and high-performance applications
  • Implement trunk-based development practices to enable continuous integration and rapid delivery
  • Develop clean, maintainable, and testable code following SOLID principles and software design best practices
  • Ensure high levels of unit test coverage, test-driven development (TDD), and behavior-driven development (BDD)
  • Actively contribute to hands-on coding, code reviews, and refactoring to maintain high engineering standards
  • Drive the adoption of modern engineering ways of working, including Agile, DevOps, and CI/CD
  • Advocate for automated testing, infrastructure as code, and continuous monitoring to enhance software reliability
  • Apply Behavior-Driven Development (BDD), Test-Driven Development (TDD), and unit testing to ensure code quality and functionality
  • Conduct thorough code reviews, ensuring adherence to best practices in readability, performance, and security
  • Implement and enforce secure coding practices, performing vulnerability assessments and ensuring compliance with security standards
Employment Type:
Full-time

Lead Observability Engineer

Lead Observability Engineer role focusing on the Elastic Observability Platform,...
Location:
India, Hyderabad
Salary:
Not provided
Blue Yonder
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
  • 7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
  • Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
  • Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
  • Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
  • Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
  • Advanced scripting/automation experience (Python, PowerShell, Bash)
  • Strong knowledge of API integrations, data pipelines, and log-flow engineering
  • Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
  • Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset
Job Responsibility
  • Receives work assignments through the ticketing system or from senior leadership
  • Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
  • Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
  • Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
  • Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
  • Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
  • Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
  • Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
  • Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
  • Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis
Employment Type:
Full-time

Oracle OCI DBA Senior System Analyst

Sopra Steria is seeking an Oracle OCI DBA Senior System Analyst with 6-8 years o...
Location:
India, Noida
Salary:
Not provided
Sopra Steria
Expiration Date
Until further notice
Requirements
  • Database (DBCS, ATP/ADW and Exadata)
  • WebLogic
  • Oracle Enterprise Manager
  • Oracle Cloud Infrastructure (IaaS/PaaS, FaaS)
  • Certification in OCI (preferably Architect or Professional)
  • Oracle Cloud Observability and Monitoring
  • Oracle Cloud WAF
  • Oracle Security Services (Cloud Guard)
  • Good knowledge of and certification in AWS/Azure will be an added advantage
  • Oracle GoldenGate
What we offer
  • Inclusive and respectful work environment
  • Opportunities for creativity
  • Commitment to fighting discrimination
Employment Type:
Full-time

Senior+ Software Engineer - Cloud Availability Platform Engineering (Observability)

We are looking for a highly skilled engineer with deep expertise in building and...
Location:
United States, San Francisco
Salary:
166000.00 - 201000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in infrastructure or platform engineering, with a focus on observability and monitoring systems
  • Deep expertise with metrics systems (Prometheus, Thanos, Mimir, Cortex), logging pipelines (Fluent Bit, Vector, Loki, ELK/OpenSearch), and tracing platforms (Jaeger, Tempo, OpenTelemetry)
  • Strong programming skills in Go or Python for automation, operators, and custom integrations
  • Experience running observability platforms on Kubernetes and operating them at scale across multi-datacenter environments
  • Proven ability to design, optimize, and scale telemetry pipelines handling high cardinality and high throughput data
  • Solid understanding of distributed systems, performance engineering, and debugging complex workloads
  • Strong collaboration skills and the ability to influence engineering teams to adopt observability best practices
Job Responsibility
  • Designing and operating scalable observability systems (metrics, logging, tracing) across multi-datacenter Kubernetes environments
  • Architecting end-to-end telemetry pipelines, including ingestion, storage, querying, and visualization
  • Extending monitoring and alerting with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry
  • Building scalable log collection and processing pipelines with Fluent Bit, Vector, Loki, or ELK/OpenSearch stacks
  • Implementing distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrating with service meshes, load balancers, and APIs
  • Defining and driving adoption of SLOs, SLIs, and error budgets across services and teams
  • Automating provisioning and scaling of observability infrastructure with Kubernetes, Terraform, and custom tooling (Go, Python)
  • Ensuring reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure)
  • Embedding security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls
  • Partnering with engineering teams to embed observability into applications, services, and infrastructure
What we offer
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
Employment Type:
Full-time

Senior Software Engineer - Search

Truveta is the world’s first health provider led data platform with a vision of ...
Location:
United States, Seattle
Salary:
155000.00 - 190000.00 USD / Year
Truveta
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Software Engineering, Computer Engineering, Information Systems, or a related field (advanced degree a plus)
  • 5+ years of professional software engineering experience
  • Designing, building, and operating distributed systems at scale
  • Writing production-quality, efficient, multi-threaded code that runs reliably in cloud environments
  • Architecting and implementing search system features (indexing, querying, optimization), including building robust test frameworks
  • Reviewing data specifications and handling large-scale data storage and distribution using specialized protocols
  • Debugging and resolving complex production issues in distributed systems
  • Proven experience with cloud-native architectures and DevOps practices (preferably Azure, though AWS/GCP experience is relevant)
Job Responsibility
  • Design, build, and maintain index, query, and search system features utilized to aggregate and analyze health data
  • Architecting, implementing, and testing new index and query features
  • Optimizing end-to-end index performance
  • Planning, architecting, and deploying highly scalable and highly reliable search systems
  • Implement relevant compliance controls and conduct thorough security reviews
  • Drive observability, reliability, and automation across the infrastructure and platform
  • Monitor emerging technology in the search and infrastructure domains, evaluate applicability, and champion adoption where appropriate
  • Contribute to knowledge sharing and best practices within the team
What we offer
  • Comprehensive benefits with strong medical, dental and vision insurance plans
  • 401K plan
  • Professional development & training opportunities for continuous learning
  • Work/life autonomy via flexible work hours and flexible paid time off
  • Generous parental leave
  • Regular team activities (virtual and in-person)
  • Additional compensation such as incentive pay and stock options
Employment Type:
Full-time

Principal Software Engineer, Trusted Data Platform

As a Principal Software Engineer, you will be a technical leader and hands-on co...
Location:
India, Bangalore
Salary:
Not provided
Atlassian
Expiration Date
Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related technical field
  • 10+ years of experience in backend software development, focusing on distributed systems and storage solutions
  • 5+ years of experience working with AWS storage services (S3, DynamoDB, EBS, EFS, FSx, Glacier)
  • Strong expertise in system design, architecture, and scalability for large-scale storage solutions
  • Proficiency in at least one major backend programming language (Kotlin, Java, Go, Rust, or Python)
  • Experience designing and implementing highly available, fault-tolerant, and cost-efficient storage architectures
  • Deep understanding of distributed systems, replication strategies, sharding, and caching
  • Knowledge of data security, encryption best practices, and compliance requirements (SOC2, GDPR, HIPAA)
  • Experience leading engineering teams, mentoring senior engineers, and driving technical roadmaps
  • Proficiency with observability tools, performance monitoring, and troubleshooting at scale
Job Responsibility
  • Designing and optimizing high-scale, distributed storage systems built on AWS storage technologies
  • Shaping the architecture, performance, and reliability of backend storage solutions that power critical applications at scale
  • Designing, implementing, and optimizing backend storage services that support high throughput, low latency, and fault tolerance
  • Working closely with senior engineers, architects, and cross-functional teams to drive scalability, availability, and efficiency improvements in large-scale storage solutions
  • Leading technical deep dives, architecture reviews, and root cause analyses to resolve complex production issues related to storage performance, consistency, and durability
  • Driving best practices in distributed system design, security, and cloud cost optimization
  • Mentoring senior engineers, contributing to technical roadmaps, and helping shape the long-term storage strategy
  • Collaborating with Site Reliability Engineers (SREs) to implement observability, monitoring, and disaster recovery strategies, ensuring high availability and compliance with industry standards
  • Advocating for automation, Infrastructure-as-Code (IaC), and DevOps best practices, leveraging tools like Terraform, AWS CloudFormation, Kubernetes (EKS), and CI/CD pipelines to enable scalable deployments and operational excellence
What we offer
  • Atlassians can choose where they work – whether in an office, from home, or a combination of the two
  • Atlassians have more control over supporting their family, personal goals, and other priorities
  • We can hire people in any country where we have a legal entity
  • Interviews and onboarding are conducted virtually
  • Whatever your preference - working from home, an office, or in between - you can choose the place that's best for your work and your lifestyle