Sr Engineers, Systems Reliability

United States, Frisco · 156998.00 - 165000.00 USD / Year · Job Posted March 14, 2026

Job Description

At T-Mobile, we invest in YOU! Our Total Rewards Package ensures that employees get the same big love we give our customers. All team members receive a competitive base salary and compensation package - this is Total Rewards. Employees enjoy multiple wealth-building opportunities through our annual stock grant, employee stock purchase plan, 401(k), and access to free, year-round money coaches. That’s how we’re UNSTOPPABLE for our employees! T-Mobile is America’s supercharged Un-carrier, delivering an advanced 4G LTE and transformative nationwide 5G network that will offer reliable connectivity for all.

The Sr Engineers, Systems Reliability position is located in Frisco, TX and will apply proficient knowledge of and skill in emerging DevOps-centric automation tools and technologies for CI/CD, configuration management, etc. in production environments.

Job Responsibility

  • Perform environment management, automated server provisioning, pipeline configuration (VMs)
  • Deliver software to improve the availability, scalability, latency, and efficiency of T-Mobile’s services
  • Create, manage, and use dashboards for continuous monitoring and health checks of applications and the underlying infrastructure; improve the quality of services using monitoring feedback for the production environment
  • Contribute to future improvement of software delivery processes and operations, e.g., cloud enablement, use of microservices with containerization
  • Relationship and People Management: Mentor and guide other Systems Reliability Engineers, Software Engineers and vendor resources as needed

Requirements

  • Master’s degree in Computer and information technology, Electrical and Computer Engineering, or related, and 6 years of relevant work experience
  • Bachelor’s degree in Computer and information technology, Electrical and Communication Engineering, or related, and 8 years of relevant work experience
  • Design, develop, and deliver complex GitLab CI/CD pipelines for enterprise billing platforms
  • Build and administer Kubernetes clusters using Conductor for application lifecycle management, packaging with Helm and duck templates for infrastructure automation
  • Develop custom tools in Shell, Perl, YAML, Jython and Python (including Boto3) to support zero-downtime deployments and operations
  • Implement Infrastructure as Code with Terraform and AWS CloudFormation to provision infrastructure across AWS, PCF, Google and Azure cloud platforms
  • Develop an AWS Lambda function to migrate historical billing information from RDS to S3
  • Support and administer Skava-based ecommerce platforms, Java/J2EE and REST APIs, including deployment, scaling, and operational troubleshooting in production
  • Provision and manage relational and NoSQL databases, including PostgreSQL, MySQL, Oracle, and MongoDB (Atlas), and develop and optimize SQL scripts for billing workflows and for generating monthly consumer and business reports
  • Develop scripts and controls to enforce access management using Azure AD and prevent public exposure of secrets using GitGuardian, T-Vault and CyberArk, ensuring compliance with cybersecurity standards
  • Automate Windows system administration and deployment processes using PowerShell; create and maintain Power BI reports and dashboards
  • Expert-level experience in implementing and managing observability platforms like Splunk, AppDynamics, and Grafana, with a focus on developing real-time dashboards and actionable alerts for microservice health, API latency, and system fault detection
  • At least 18 years of age
  • Legally authorized to work in the United States

What we offer

  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Annual bonus or periodic sales incentive or bonus based on role
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off and up to 12 paid holidays
  • Paid parental and family leave
  • Family building benefits
  • Back-up care
  • Enhanced family support
  • Childcare subsidy
  • Tuition assistance
  • College coaching
  • Short- and long-term disability
  • Voluntary AD&D coverage
  • Voluntary accident coverage
  • Voluntary life insurance
  • Voluntary disability insurance
  • Voluntary long-term care insurance
  • Mobile service & home internet discounts
  • Pet insurance
  • Access to commuter and transit programs

Similar Jobs for Sr Engineers, Systems Reliability

8 matching positions

Sr Engineers, Systems Reliability

At T-Mobile, we invest in YOU! Our Total Rewards Package ensures that employees...
Location: United States, Atlanta
Salary: 152131.00 - 165000.00 USD / Year
T-Mobile
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree or foreign equivalent in Computer Science, Electronics Engineering, Computer Engineering, Electric Engineering, or related, and 5 years of relevant work experience
  • Master’s degree or foreign equivalent in Computer Science, Electronics Engineering, Computer Engineering, Electric Engineering, or related, and 3 years of relevant work experience
  • Maintain and enhance source code repositories: Bitbucket and GitLab, following GitOps principles to implement branching strategies and SonarQube quality checks and FortifyScan security checks
  • C, C#, Java, JavaScript, Perl, Python, Go, or scripting experience in Shell and Perl for CI/CD pipeline scripts creation and automation of repetitive tasks
  • Continuous Integration and Continuous Delivery tools: Jenkins, Cloudbees, and GitLab CI/CD, to design, implement and manage complex CI/CD pipelines for multi-environment deployments, integrating testing, security scans, and automated Helm chart deployments to on-prem Kubernetes clusters
  • Own and execute PROD deployments to Kubernetes using GitLab CI/CD pipelines and helm upgrades, through a structured change request and approval process, ensuring audit compliance and minimizing risk during releases
  • DevOps tools: Ansible, Chef, Puppet, including Docker, and Kubernetes to build CI/CD pipelines and microservices containerization and orchestration
  • APM tools: AppDynamics, Grafana and logging tools: Splunk to build and maintain observability dashboards, implementing effective alerting for real-time monitoring of system health and user impact
  • Working in a cloud environment public or private: Pivotal Cloud Foundry, Kubernetes, AWS, and Microsoft Azure, to host the containerized microservices in non-PROD and PROD environments
  • At least 18 years of age
Job Responsibility
  • Perform environment management, automated server provisioning, pipeline configuration (VMs)
  • Deliver software to improve the availability, scalability, latency, and efficiency of T-Mobile’s services
  • Create, manage, and use dashboards for continuous monitoring and health checks of applications and the underlying infrastructure; improve the quality of services using monitoring feedback for non-production and production environments
  • Contribute to future improvement of software delivery processes and operations, e.g., cloud enablement, use of microservices with containerization
  • Relationship and People Management: Mentor and guide other Systems Reliability Engineers and vendor resources as needed
What we offer
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Annual bonus or periodic sales incentive or bonus based on role
  • Medical, dental and vision insurance
  • Flexible spending account
  • Paid time off
  • Up to 12 paid holidays
  • Fulltime

Sr Systems Reliability Engineer - Legal Technology

The System Reliability Engineer (SRE) guides and mentors other SREs and improves...
Location: United States, Frisco; Overland Park
Salary: 98500.00 - 177700.00 USD / Year
T-Mobile
Expiration Date: Until further notice
Requirements
  • Bachelor’s Degree plus 5 years of related work experience OR Advanced degree with 3 years of related experience
  • 4–7+ years relevant experience (Required)
  • Experience in Agile/DevOps environments (Required)
  • Proficiency in one or more: Java, Python, Go, C/C#, or scripting (Shell/Perl) (Required)
  • Experience with DBMS (Postgres or Oracle) (Required)
  • Experience with CI/CD tools (e.g., Jenkins) and DevOps tools (GitHub/GitLab, Chef/Puppet) (Required)
  • Experience with Docker, Kubernetes (Required)
  • Experience with APM/observability tools (e.g., Splunk, Grafana, AppDynamics) (Required)
  • Experience troubleshooting distributed systems using logs/metrics/traces (Required)
  • DevOps (Required)
Job Responsibility
  • Apply DevOps automation for CI/CD, configuration management, and environment management (non-prod and prod)
  • Provision and manage environments; configure pipelines and infrastructure (VMs/containers)
  • Improve availability, scalability, latency, and efficiency of services, with emphasis on Legal Technology platforms
  • Own reliability and performance of critical applications (LRS, E-Core, LEEP)
  • Participate in on-call rotation (~1 week every 2 months); respond to alerts/incidents
  • Lead incident response, root cause analysis, and post-incident improvements
  • Build and enhance observability (dashboards, alerts), runbooks, and automation
  • Partner with engineering to design for reliability and eliminate recurring issues in distributed systems
What we offer
  • Competitive base salary and compensation package
  • Annual stock grant
  • Employee stock purchase plan
  • 401(k)
  • Access to free, year-round money coaches
  • Medical, dental and vision insurance
  • Flexible spending account
  • Employee stock grants
  • Paid time off
  • Fulltime

Information Systems Sr. Manager – Technology Regulatory Compliance Lead

Join Amgen's Mission of Serving Patients. At Amgen, if you feel like you're part...
Location: United States, Holly Springs
Salary: 135441.55 - 183244.45 USD / Year
Amgen
Expiration Date: Until further notice
Requirements
  • Doctorate degree and 2 years of Information Systems/Technology and/or Engineering experience OR Master's degree and 4 years of Information Systems/Technology and/or Engineering experience OR Bachelor's degree and 6 years of Information Systems/Technology and/or Engineering experience OR Associate's degree and 10 years of Information Systems/Technology and/or Engineering experience OR High school diploma / GED and 12 years of Information Systems/Technology and/or Engineering experience
Job Responsibility
  • Lead the development and execution of the strategy for computer system validation, regulatory compliance, and sustained inspection readiness for the Technology organization
  • Serve as the Technology Regulatory Inspection Lead, ensuring alignment with global quality and compliance strategies
  • Drive inspection readiness efforts, including playbooks, SME coaching, system documentation, and readiness dashboards
  • Oversee digital platform implementations, ensuring strong planning, resource management, risk mitigation, and financial oversight
  • Manage digital technology services, including validation, compliance, data integrity, and system reliability
  • Design and implement effective support models, including vendor and third-party management with clear accountability
  • Lead cross-functional teams, run key meetings, and present updates to senior leadership
  • Deliver technology solutions that improve efficiency, quality, and business alignment
  • Proactively manage complex, simultaneous projects and resolve challenges effectively
  • Establish performance metrics to drive continuous improvement
What we offer
  • Comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program
  • Stock-based long-term incentives
  • Award-winning time-off plans and bi-annual company-wide shutdowns
  • Flexible work models, including remote work arrangements, where possible
  • Fulltime

Sr Engineers, Software

At T-Mobile, we invest in YOU! Our Total Rewards Package ensures that employees...
Location: United States, Frisco
Salary: 156998.00 - 165000.00 USD / Year
T-Mobile
Expiration Date: Until further notice
Requirements
  • Master’s degree or foreign equivalent in Computer Science, Computer Programming, Computer Engineering, or related, and 5 years of relevant work experience
  • OR Bachelor’s degree or foreign equivalent in Computer Science, Computer Programming, Computer Engineering, or related, and 7 years of relevant work experience
  • Experience in each of: 1. Performing microservice design, development, deployment, and maintenance of enterprise applications and applying domain knowledge of Telecommunications (OSS/BSS) including Billing Domain, Payment Services, Order Management, Credit Management, Digital Telecommunications Commerce and Product Catalog systems using Java, GoLang: Spring Boot, Spring Cloud, Spring MVC, Spring Data JPA, Spring Security, Spring GraphQL, and implementing scalable RESTful and federated GraphQL APIs using Apollo Federation 1.x/2.x, MuleSoft 3.x/4.x, Kubernetes, Kafka, IntelliJ IDE, BitBucket, GitLab, Eclipse, Lucid, Figma, Postman, Splunk, AppDynamics, Oracle, MySQL and Redis
  • 2. Designing and developing microservices using Java and GoLang with deep expertise in the Spring ecosystem: Spring Boot, Spring Cloud, Spring MVC, Spring Data JPA, Spring Security, Spring GraphQL, and implementing scalable RESTful and federated GraphQL APIs using Apollo Federation 1.x/2.x, MuleSoft 3.x/4.x
  • 3. Building and integrating APIs and web services using REST, SOAP, Swagger: OpenAPI, XML, JSON, and asynchronous messaging platforms such as Apache Kafka and RabbitMQ
  • 4. Working with databases and data stores, including SQL: Oracle, MySQL, and PostgreSQL, and NoSQL: MongoDB, Cassandra, and Couchbase, and in-memory caching with Redis
  • 5. Implementing CI/CD pipelines and DevOps practices using Jenkins, GitLab, Docker, Kubernetes (K8s), and applying basic Linux command-line proficiency
  • 6. Utilizing testing, monitoring, and development tools such as JUnit, JMeter, Mockito, Robot, WireMock, Maven, Gradle, AppDynamics, Splunk, Git, Bitbucket, IntelliJ, Eclipse, Velocity Studio, Postman, and SoapUI
  • At least 18 years of age
  • Legally authorized to work in the United States
Job Responsibility
  • Work as a Full Stack Developer, managing multiple applications including Metro Web, Backend Spring Boot APIs, MuleSoft APIs (v3.8), and AEM for content management
  • Design and develop RESTful APIs using Spring Boot, leveraging Spring Cloud Eureka for service discovery and Spring Cloud OpenFeign for inter-service communication
  • Build and maintain Spring Boot APIs integrated with Angular applications composed in a Monorepo structure using Hapi plugins
  • Develop robust data access layers using Spring Data JPA
  • Collaborate with cross-functional team members to deliver high-quality solutions
  • Design and implement visually appealing and responsive user interfaces
  • Diagnose and resolve issues in front-end code to enhance performance and eliminate bugs
  • Create and execute unit tests to ensure code reliability and functionality
  • Architect and maintain RESTful API solutions using Spring Boot
  • Demonstrate a strong understanding of common API technologies, including OAuth, SAML, Spring Boot, and Microservices architecture
What we offer
  • competitive base salary and compensation package
  • annual stock grant
  • employee stock purchase plan
  • 401(k)
  • free year-round money coaches
  • medical insurance
  • dental insurance
  • vision insurance
  • flexible spending account
  • paid time off
  • Fulltime

Sr. Systems Development Engineer

At Ford, you’ll work on ideas that matter, alongside passionate people who want ...
Location: United States, Dearborn
Salary: 99100.00 - 166200.00 USD / Year
Ford Motor Company
Expiration Date: Until further notice
Requirements
  • Minimum of a Bachelor of Science Degree in Computer/Software/Electrical Engineering or equivalent
  • 5+ years of experience with electrical architecture, electrical systems, and automotive network topologies, with a strong background in Systems Engineering and reliability principles
  • 5+ years of experience in robust requirement development, verification, and validation, with a focus on designing for reliability and testability, and experience in component/vehicle level testing for failure modes
  • Proven experience in systematically analyzing and resolving complex systems issues, employing root cause analysis, corrective action methodologies, and implementing design solutions for failure avoidance
  • In-depth knowledge of CAN, LIN, and Ethernet-based In-Vehicle Network protocols and processes, including understanding of potential failure modes and diagnostic strategies
  • Knowledge of Service Oriented Architecture (SOA) based API design, with an understanding of fault tolerance and error handling in distributed systems
  • Knowledge of Human Factors and Human-Machine Interfaces (HMI), including considerations for usability, error prevention, and graceful degradation in failure scenarios
  • Strong knowledge of Requirements Engineering (INCOSE, EARS, BDD/Gherkin), with an ability to translate reliability and safety requirements into actionable specifications
  • Demonstrated in-depth knowledge and practical application of Design Failure Mode Avoidance (DFMA), Failure Modes and Effects Analysis (FMEA), Fault Tree Analysis (FTA), and other reliability engineering techniques
  • Knowledge of Unified Modeling Language (UML) or System Modeling Language (SysML) for modeling system behavior, structure, and potential failure modes
Job Responsibility
  • Interact with feature owners and software development teams to scope and define robust systems interface and software requirements, with a strong emphasis on proactive Design Failure Mode and Effects Analysis (DFMEA)
  • Executing functional partitioning of features, focusing on identifying and isolating potential failure points and ensuring fault tolerance
  • Developing robust and fault-tolerant interfaces between allocable classes, considering potential failure propagation and implementing safeguards
  • Specifying comprehensive system-level and ECU software requirements, explicitly defining expected behavior, error handling, and failure recovery mechanisms based on DFMEA principles
  • Adopting model-based systems engineering (MBSE) methodologies for specification development, leveraging modeling and simulation for early failure mode identification and mitigation
  • Using Requirement Management tools for requirement authoring, change management, and specification generation, ensuring traceability to DFMEA activities and reliability targets
  • Using Behavior Driven Development (BDD) with Gherkin syntax to develop scenarios and associated test cases, incorporating negative test cases and failure scenarios derived from DFMEA analyses
  • Leading and performing comprehensive Design Failure Mode and Effects Analysis (DFMEA) activities throughout the product lifecycle to ensure the thoroughness, robustness, and reliability of software and system designs, proactively identifying and addressing potential failure modes
  • Managing cross-functional working level meetings with Feature/Function Owners, module owners, HMI teams, and the IVI Platform team to develop functional integration strategies that incorporate reliability and failure avoidance considerations
  • Performing rigorous system compatibility analysis and impact assessment for IVI feature deployment, specifically evaluating and mitigating potential failure modes introduced by new architectures, programs, or market variations
What we offer
  • Immediate medical, dental, vision and prescription drug coverage
  • Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care and more
  • Family building benefits including adoption and surrogacy expense reimbursement, fertility treatments, and more
  • Vehicle discount program for employees and family members and management leases
  • Tuition assistance
  • Established and active employee resource groups
  • Paid time off for individual and team community service
  • A generous schedule of paid holidays, including the week between Christmas and New Year’s Day
  • Paid time off and the option to purchase additional vacation time
  • Fulltime

Sr. Systems Engineer

As a System Engineer at 10x, you will be involved in all aspects of the product ...
Location: United States, Pleasanton
Salary: 168900.00 - 228500.00 USD / Year
10x Genomics
Expiration Date: Until further notice
Requirements
  • Bachelor's in an engineering discipline (Comp Sci, Mechanical, Electrical, etc.); Master's preferred
  • 5+ years working with the design, integration, or testing of hardware systems or instruments
  • Direct experience in a cross-functional role (e.g., systems engineer, project lead)
  • Experience in system modeling for performance/cost tradeoffs, including prior experience with product cost modeling
  • Proficiency in Python; comfort implementing moderate complexity in silico models and associated visualizations (e.g., in Jupyter notebooks)
  • Fluency with modern AI tools (chat platforms and software development tools)
  • Strong analytical, problem-solving, and organizational skills
  • Excellent visual & oral communication and presentation skills
Job Responsibility
  • Develop, model, document, and refine platform and instrument architectures
  • Support architecture implementation with subject matter experts and engineers in sub-system teams
  • Develop models that allow for data-driven decisions on products, simulating tradeoffs between performance of instruments and consumables, and product COGS.
  • Conceive and implement system-level tests for new instruments, and work with discrete design teams to ensure that sub-system and component level testing is appropriate to verify new designs against performance and reliability requirements
  • Engage with technical team members to identify and resolve key technical challenges
  • Facilitate cross-functional interactions throughout the product development life cycle, including the management of relationships with internal and external stakeholders
  • Collaborate with stakeholders and technical teams to develop end user workflows
  • Conduct routine risk assessments (e.g., FMEAs) and work with leadership to define and execute risk mitigation plans
  • Provide technical leadership throughout the product development process; run design reviews and lead technical efforts at project and platform levels
What we offer
  • equity grants
  • comprehensive health and retirement benefit programs
  • annual bonus program or sales incentive program
  • competitive and comprehensive health benefits package
  • competitive easy-to-use benefits that promote wellbeing
  • family friendly policies like parental leave
  • generous time off
  • Fulltime

Sr Systems Safety Engineer

Archer is an aerospace company based in San Jose, California building an all-ele...
Location: United States, San Jose
Salary: 144000.00 - 198000.00 USD / Year
Archer Aviation
Expiration Date: Until further notice
Requirements
  • BS, MS, or PhD in engineering or a related technical degree
  • 5+ years experience as System Safety Engineer
  • Experience with all types of systems commonly used, in particular Flight Controls and Powertrain systems
  • Experience with all phases of aircraft development from concept to entry into service
  • Experience performing all types of safety analyses commonly used in civil and military aviation
  • Experience creating, validating, and verifying aircraft and system-level safety requirements
  • Experience being personally accountable for one or more safety activities on a civil and military aircraft certification program
  • Experience developing and executing test plans in a simulation environment
  • Detailed understanding of civil and military aviation standards, practices, and regulations
Job Responsibility
  • Collaborate with various engineering disciplines to design high-reliability, safety-critical systems for a hybrid/electric vertical takeoff and landing (eVTOL) aircraft
  • Take ownership of the safety requirements and safety assessments of one or more vehicle systems, including Flight Controls and Powertrain systems
  • Participate in or perform one or more vehicle-level installation safety or other common cause analyses
  • Participate in the definition of safety requirements and safety assessments at the vehicle level
  • Represent System Safety in multi-disciplinary decision making and regulatory compliance activities under supervision of leadership
  • Contribute to the System Safety processes and work methods
  • Participate in regulatory compliance planning and demonstration activities under supervision of team leadership and the Certification and Test organization
  • Develop and execute test plans in a simulation lab environment
  • Perform Zonal Safety Inspections in the aircraft
  • Fulltime

Sr Reliability Engineer

This position is part of the Kingsport Packaging Mill Maintenance Team and plays...
Location: United States, Kingsport
Salary: Not provided
Domtar
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Mechanical Engineering, or degree in a maintenance and/or reliability related field
  • Minimum of three (3) years of hands-on experience with Mechanical/reliability maintenance
  • Analytical, critical thinking, and proven knowledge of reliability systems, methods, processes and problem-solving skills
  • Proficient planning, organization, and time-management skills
Job Responsibility
  • Identify and implement process and equipment reliability improvement by working closely with operations and maintenance personnel
  • Maintain key reliability processes and workstreams in accordance with the One Domtar Standards
  • Utilize predictive and preventative maintenance program to identify and correct equipment issues
  • Identify and implement improvements in process and equipment reliability across the Mill by working closely with Operations, Maintenance, and Engineering personnel
  • Solve problems utilizing the mill's RCPE program
  • Support the mill's goal for safety, environmental, and other areas of need
  • Support the maintenance team by assisting with resolving technical issues and repair procedures / troubleshooting
What we offer
  • competitive compensation
  • a supportive working environment
  • rewarding career paths
  • plenty of opportunities for learning and growth
  • Fulltime