This is a Non-Production Management Technical Lead (L2 SRE) position in North America DevOps, supporting Global Consumer Group (GCG) applications. GCG Production Management is in the midst of a transformation, expanding its support model to incorporate Service Reliability Engineering (SRE) principles. In support of this transformation, the role blends traditional ITIL-based Production Management with Service Reliability Engineering. The ideal candidate will have experience with and broad knowledge of North America Consumer applications, along with an interest in learning new technologies, including the use of automation and artificial intelligence to avoid system problems, automate manual activities, and drive improved system and application service levels. The work is supported by offshore and onshore contractors, who provide 7x24 service for North America. This is a technical leadership position that requires strong organizational and communication skills in addition to analytical and troubleshooting talent. Partnership with Development teams, Technology teams in CTI, and other Production Management teams is a critical component of this position and is required daily.
Job Responsibility:
Provides expertise related to various Distributed Consumer Applications across multiple Lines of Business in North America
Primary point of contact for the assigned LOB domain
Enable Production Management processes in non-production environments to provide environment stability
Execute robust service readiness
Facilitate standard toolset adoption for all services in the domain
Works as an L2 expert supporting Incident Management, Problem Management, Risk Management, Change Management, and the CI/CD enablement pipeline for the identified SRE function
Has overall accountability for non-production stability in the assigned area/domain
Partners with Level 3 support teams to improve resolution rates, efficiency targets, and organizational Service Level Agreements
Performs SRE analysis, remediates identified issues with stakeholders, and holds them accountable during release sign-offs
Partners with SRE enablement, and eventually works as an SRE, to identify key areas and provide SRE recommendations from UAT through PERF and PROD for the key business transactions supported
Identifies and leads the implementation of Service Automation to reduce cost, reduce risk, improve efficiency, and enable Service Management to keep up with the ever-increasing volume and fast pace of newer technologies
Continually evolves the working practices within, and the services provided by, Production Management to improve efficiency and productivity
Conduct the blameless problem management/post-mortem phase of major incidents, develop executive briefings, assess major incident impacts, and drive service improvements to prevent repeat incidents
Create PMRs for P1/P2 incidents and close out the resulting actions
Identify and classify risks in the non-production estate, work with peers and team members to create Service Improvement Plans, and drive them to closure
Create operational readiness documents for major initiatives and provide a seamless handover to the production team
Work with the SRE team to create a proactive analysis of the UAT and PERF view before handing over to Production Management
Accountable for end-to-end service health of the NAM Core space
Has overall accountability for patching, changes, infrastructure changes, certificates, and other KTLO activities in the assigned domain
Has overall accountability for monitoring and its usage by stakeholders
Work with the monitoring team for setup and overall accountability
Represent the DevOps team in various digital forums and facilitate the generation of reports and presentations
Be proficient in technologies such as OSE, Apigee, AWS, and other new-age technologies
Adopt automation laid down by Production Management Automation and AIOps
Support and achieve successful internal audits
Requirements:
8+ years development or production support experience with North America Consumer applications
Solid ITIL Foundation understanding
Engineering Background in system admin, development, DevOps or equivalent field, preferably with experience in Distributed Consumer applications
Experience or familiarity with automation technologies, advanced analytics, and predictive modelling
Experience with databases, e.g., Oracle, DB2
Programming experience in at least one language, e.g., Unix shell scripting or Java
Competent with cloud concepts, e.g., APIs, web services, and microservices
Strong analytical, algorithmic, and problem-solving skills
Fluent English
Ability to logically break down tasks into smaller, manageable parts
Solid understanding of systems and application design
Systematic problem-solving approach
Strong communication skills and sense of ownership and drive
Adaptable and able to work with large, complex services owned by multiple teams
Extremely organized, detail-oriented, and thorough in every aspect
Able to balance multiple tasks and projects effectively while adapting to new variables
Uses creative and innovative thinking while maintaining a strong sense of ownership, customer service, and integrity, demonstrated through clear communication
Driven, self-motivated, and eager to learn
Bachelor’s/University degree or equivalent experience
Nice to have:
Experience or familiarity with cloud technology is a plus
Certification in Site Reliability Engineering, Salesforce, or a cloud platform such as AWS or Google Cloud is a plus
What we offer:
medical, dental & vision coverage
401(k)
life, accident, and disability insurance
wellness programs
paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays
discretionary and formulaic incentive and retention awards