This is a senior-level position responsible for designing, implementing, and managing the firm's engineering platforms, with a focus on CI/CD, container orchestration, and observability. The overall objective of this role is to drive the automation and streamlining of our operations and processes by building and maintaining tools for deployment, monitoring, and operations, and by troubleshooting and resolving issues in our dev, test, and production environments.
Job Responsibilities
Design, build, and maintain the CI/CD infrastructure and tools, with a focus on Tekton and Harness
Manage, scale, and secure OpenShift container platforms, ensuring high availability and reliability
Develop and manage infrastructure as code (IaC) to automate provisioning and configuration of environments
Implement and manage a comprehensive observability stack using tools like Prometheus, Grafana, and others to monitor system health, performance, and reliability
Collaborate with development teams to create a seamless developer experience and ensure applications are built with scalability, reliability, and security in mind
Utilize in-depth knowledge and skills across multiple infrastructure and development areas to provide technical oversight for the platform
Contribute to the formulation of strategies for platform engineering and DevOps functional areas
Provide evaluative judgment based on the analysis of factual data in complicated and unique situations, including root cause analysis and problem resolution
Impact the DevOps and Platform Engineering area by monitoring delivery of end results, ensuring essential procedures are followed, and contributing to the definition of standards
Appropriately assess risk when technical decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients, and assets, by driving compliance with applicable laws, rules, and regulations, adhering to Policy, and applying sound ethical judgment
Requirements
8+ years of relevant experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering
Hands-on working experience with container orchestration using OpenShift and Kubernetes
Strong, demonstrable experience with CI/CD tools, specifically Tekton and Harness
Extensive experience with observability and monitoring stacks, including Prometheus and Grafana
Proficiency in Infrastructure as Code (IaC) and configuration management tools
Experience with scripting and automation
Ability to work proactively and independently to address project requirements, and articulate issues/challenges with enough lead time to mitigate project delivery risks
A history of conducting code reviews and ensuring high standards for infrastructure and automation code
Basic knowledge of industry practices and standards in the DevOps and SRE space
Consistently demonstrates clear and concise written and verbal communication
Bachelor's degree/University degree or equivalent experience