Kafka Operations Administrator

Realign

Location:
United States, Seattle

Contract Type:
Not provided

Salary:

157500.00 USD / Year

Job Responsibility:

  • Deploy, configure and manage Kafka clusters and related services to meet SLA requirements
  • Participate in 24x7 on-call rotation to respond to incidents, alerts, and escalations
  • Triage, diagnose, and remediate production incidents; coordinate with stakeholders, developers and infrastructure teams
  • Implement automation for provisioning, scaling, server/data backups, and disaster recovery
  • Maintain monitoring, alerting thresholds, dashboards, and Kafka ecosystem health
  • Harden Kafka deployments: configure TLS, ACLs, RBAC, encryption, and vulnerability remediation
  • Perform routine maintenance: Kafka ecosystem upgrades (controllers, brokers, Connect, and Schema Registry), rolling restarts, etc.
  • Create and maintain runbooks, runbook automation, and post-incident reports
  • Optimize performance and resource utilization; benchmark and tune clusters
  • Support Kafka Connect/Schema Registry service and troubleshoot connector issues
  • Contribute to CI/CD pipeline improvements for infrastructure and deployment automation

Requirements:

  • Production-grade Apache Kafka operations experience, managing, maintaining and upgrading Kafka clusters in production environments with a focus on high availability, disaster recovery, fail-over and overall reliability
  • Proficiency in installing and configuring monitoring systems using Grafana (building dashboards), Prometheus, Splunk, and JMX metrics
  • Automation and orchestration experience: Terraform, Ansible, Helm, Kubernetes (EKS/AKS/GKE)
  • Strong Linux system administration experience, including troubleshooting, automation and scripting for efficient infrastructure management
  • Experience in Production Support (following ITIL processes), participating in 24x7 on-call rotations, and documenting incidents/postmortems
  • Experience supporting JVM tuning, GC analysis, and network and disk I/O diagnostics
  • Experience in TCP/IP, routing, switching and firewall configurations relevant to Kafka operations

Nice to have:

  • Deep Kafka performance tuning and capacity planning experience
  • Knowledge of message delivery semantics and guarantees (at-least-once, exactly-once)
  • Cloud-native security/compliance experience (IAM, VPC, KMS, Security Groups)
  • Certifications: Confluent Certified Administrator, AWS/Azure/GCP certifications
  • Experience with Apache Kafka in KRaft mode, including setup, configuration, troubleshooting and cluster management
  • Containerization and Container Orchestration Tools experience: Docker, Kubernetes
  • Experience with CI/CD pipelines and Git-based workflows
  • Experience building custom Kafka Connect libraries and understanding of data serialization formats (e.g., Avro, JSON)
  • Knowledge of networking concepts across on-prem VMs and cloud environments, ensuring seamless integration and communication between services
  • Strong understanding of topic management and security best practices for streaming platforms: TLS, ACLs, RBAC, encryption at rest/in transit
  • Kafka ecosystem tooling experience: Kafka Connect, Schema Registry

Additional Information:

Job Posted:
March 21, 2026

Employment Type:
Full-time

Similar Jobs for Kafka Operations Administrator

Kafka Operations Administrator

Location:
United States, Seattle; St. Louis; TX
Salary:
157500.00 USD / Year
Realign
Expiration Date:
Until further notice
Requirements:
  • Production-grade Apache Kafka operations experience, managing, maintaining and upgrading Kafka clusters in production environments with a focus on high availability, disaster recovery, fail-over and overall reliability
  • Proficiency in installing and configuring monitoring systems using Grafana (building dashboards), Prometheus, Splunk, and JMX metrics
  • Automation and orchestration experience: Terraform, Ansible, Helm, Kubernetes (EKS/AKS/GKE)
  • Strong Linux system administration experience, including troubleshooting, automation and scripting for efficient infrastructure management
  • Experience in Production Support (following ITIL processes), participating in 24x7 on-call rotations, and documenting incidents/postmortems
  • Experience supporting JVM tuning, GC analysis, and network and disk I/O diagnostics
  • Experience in TCP/IP, routing, switching and firewall configurations relevant to Kafka operations
Job Responsibility:
  • Deploy, configure and manage Kafka clusters and related services to meet SLA requirements
  • Participate in 24x7 on-call rotation to respond to incidents, alerts, and escalations
  • Triage, diagnose, and remediate production incidents; coordinate with stakeholders, developers and infrastructure teams
  • Implement automation for provisioning, scaling, server/data backups, and disaster recovery
  • Maintain monitoring, alerting thresholds, dashboards, and Kafka ecosystem health
  • Harden Kafka deployments: configure TLS, ACLs, RBAC, encryption, and vulnerability remediation
  • Perform routine maintenance: Kafka ecosystem upgrades (controllers, brokers, Connect, and Schema Registry), rolling restarts, etc.
  • Create and maintain runbooks, runbook automation, and post-incident reports
  • Optimize performance and resource utilization
Employment Type:
Full-time

Kafka Operations Administrator

Location:
United States, Seattle, WA / St. Louis, MO / TX
Salary:
157500.00 USD / Year
Realign
Expiration Date:
Until further notice
Requirements:
  • Production-grade Apache Kafka operations experience, managing, maintaining and upgrading Kafka clusters in production environments with a focus on high availability, disaster recovery, fail-over and overall reliability
  • Proficiency in installing and configuring monitoring systems using Grafana (building dashboards), Prometheus, Splunk, and JMX metrics
  • Automation and orchestration experience: Terraform, Ansible, Helm, Kubernetes (EKS/AKS/GKE)
  • Strong Linux system administration experience, including troubleshooting, automation and scripting for efficient infrastructure management
  • Experience in Production Support (following ITIL processes), participating in 24x7 on-call rotations, and documenting incidents/postmortems
  • Experience supporting JVM tuning, GC analysis, and network and disk I/O diagnostics
  • Experience in TCP/IP, routing, switching and firewall configurations relevant to Kafka operations
Job Responsibility:
  • Deploy, configure and manage Kafka clusters and related services to meet SLA requirements
  • Participate in 24x7 on-call rotation to respond to incidents, alerts, and escalations
  • Triage, diagnose, and remediate production incidents; coordinate with stakeholders, developers and infrastructure teams
  • Implement automation for provisioning, scaling, server/data backups, and disaster recovery
  • Maintain monitoring, alerting thresholds, dashboards, and Kafka ecosystem health
  • Harden Kafka deployments: configure TLS, ACLs, RBAC, encryption, and vulnerability remediation
  • Perform routine maintenance: Kafka ecosystem upgrades (controllers, brokers, Connect, and Schema Registry), rolling restarts, etc.
  • Create and maintain runbooks, runbook automation, and post-incident reports
  • Optimize performance and resource utilization
Employment Type:
Full-time

DevOps Engineer – Kafka Service

We are looking for a highly skilled DevOps Engineer to take ownership of the Kaf...
Location:
Luxembourg, Leudelange
Salary:
Not provided
Sopra Steria
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in DevOps, Site Reliability Engineering (SRE), or Kafka administration
  • Strong hands-on experience with Apache Kafka (setup, tuning, and troubleshooting)
  • Proficiency in scripting (Python, Bash) and automation tools (Terraform, Ansible)
  • Experience with cloud environments (AWS, Azure, or GCP) and Kubernetes-based Kafka deployments
  • Familiarity with Kafka Connect, KSQL, Schema Registry, Zookeeper
  • Knowledge of logging and monitoring tools (Dynatrace, ELK, Splunk)
  • Understanding of networking, security, and access control for Kafka clusters
  • Experience with CI/CD tools (Jenkins, GitLab, ArgoCD)
  • Ability to analyze logs, debug issues, and propose proactive improvements
  • Excellent problem-solving and communication skills
Job Responsibility:
  • Kafka Administration & Operations: Deploy, configure, monitor, and maintain Kafka clusters in a high-availability production environment
  • Performance Optimization: Tune Kafka configurations, partitions, replication, and producers/consumers to ensure efficient message streaming
  • Infrastructure as Code (IaC): Automate Kafka infrastructure deployment and management using Terraform, Ansible, or similar tools
  • Monitoring & Incident Management: Implement robust monitoring solutions (e.g., Dynatrace) and troubleshoot performance bottlenecks, latency issues, and failures
  • Security & Compliance: Ensure secure data transmission, access control, and compliance with security best practices (SSL/TLS, RBAC, Kerberos)
  • CI/CD & Automation: Integrate Kafka with CI/CD pipelines and automate deployment processes to improve efficiency and reliability
  • Capacity Planning & Scalability: Analyze workloads and plan for horizontal scaling, resource optimization, and failover strategies
What we offer:
  • Work among high-level professionals at the forefront of corporate software solutions and innovation at Europe’s Leading Digital Service Provider
Employment Type:
Full-time

Java Developer-Applications Development Programmer Analyst

The Applications Development Programmer Analyst is an intermediate level positio...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 1-4 years of relevant experience
  • Solid foundational understanding of Java programming, including Object-Oriented Programming (OOP) principles and SOLID principles
  • Familiarity with key features and enhancements in JDK versions 8, 11, 17, and 21
  • Exposure to Apache Kafka, including basic concepts and how to interact with it
  • Ability to write and understand basic to intermediate SQL queries for Oracle databases
  • Experience or understanding of developing and working with microservices architectures
  • Awareness of Continuous Integration/Continuous Delivery principles and tools
  • Proven ability to debug complex issues and perform Root Cause Analysis (RCA)
  • Excellent verbal and written communication skills, with the ability to articulate technical concepts clearly
  • A positive, proactive, and collaborative attitude with a strong eagerness to learn and grow
Job Responsibility:
  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements
  • Identify and analyze issues, make recommendations, and implement solutions
  • Utilize knowledge of business processes, system processes, and industry standards to solve complex issues
  • Analyze information and make evaluative judgements to recommend solutions and improvements
  • Conduct testing and debugging, utilize script tools, and write basic code for design specifications
  • Assess applicability of similar experiences and evaluate options under circumstances not covered by procedures
  • Develop working knowledge of Citi’s information systems, procedures, standards, client server application development, network operations, database administration, systems administration, data center operations, and PC-based applications
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment Type:
Full-time

Applications Development Intermediate Programmer Analyst

The Applications Development Intermediate Programmer Analyst is an intermediate ...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 5 - 7 years of relevant experience in the Financial Service industry
  • Intermediate level experience in Applications Development role as a Full Stack Developer
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Hands-on coding experience in Java and Spring Boot application development
  • Hands-on experience with a JavaScript framework such as Angular or React
  • Experience with databases such as Oracle and MongoDB
  • Experience with build tools such as Lightspeed and Jenkins
  • Experience with middleware such as MQ and Kafka
Job Responsibility:
  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements
  • Identify and analyze issues, make recommendations, and implement solutions
  • Utilize knowledge of business processes, system processes, and industry standards to solve complex issues
  • Analyze information and make evaluative judgements to recommend solutions and improvements
  • Conduct testing and debugging, utilize script tools, and write basic code for design specifications
  • Assess applicability of similar experiences and evaluate options under circumstances not covered by procedures
  • Develop working knowledge of Citi’s information systems, procedures, standards, client server application development, network operations, database administration, systems administration, data center operations, and PC-based applications
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment Type:
Full-time

Automation Test Engineer

The Applications IT Quality Programmer Analyst is an intermediate level position...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 2-5 years of relevant experience in the Financial Service industry
  • Intermediate-level experience in an Applications Development or test automation role
  • Hands-on coding experience with Java backend technologies such as Core Java, Spring Boot, JPA, and Hibernate, or hands-on experience with test automation tools such as Selenium
  • Hands-on scripting experience with at least a few database technologies such as Oracle, SQL, and MySQL
  • Hands-on experience with MongoDB and Kafka
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Must be able to understand requirements and convert them into technical design and code
Job Responsibility:
  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements
  • Identify and analyze issues, make recommendations, and implement solutions
  • Utilize knowledge of business processes, system processes, and industry standards to solve complex issues
  • Analyze information and make evaluative judgements to recommend solutions and improvements
  • Conduct testing and debugging, utilize script tools, and write basic code for design specifications
  • Assess applicability of similar experiences and evaluate options under circumstances not covered by procedures
  • Develop working knowledge of Citi’s information systems, procedures, standards, client server application development, network operations, database administration, systems administration, data center operations, and PC-based applications
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment Type:
Full-time

Site Reliability Engineer III

Zuora’s Cloud Engineering teams are responsible for Cloud infrastructures, monit...
Location:
India, Chennai
Salary:
Not provided
Zuora
Expiration Date:
Until further notice
Requirements:
  • 6-8 years of relevant experience in SRE/DevOps
  • Proven hands-on working experience with core AWS services (e.g., EC2, VPC, S3, RDS, IAM, CloudWatch, EKS/ECS)
  • Deep expertise in infrastructure-as-code principles using Terraform for provisioning and state management
  • Expert-level knowledge and practical experience with configuration management tools such as Puppet and/or Ansible
  • Strong experience setting up, maintaining, and enhancing Continuous Integration/Continuous Deployment pipelines using Jenkins
  • Proficiency in scripting languages, particularly Python and/or Shell scripting, for developing automation tools and performing system administration tasks
  • Advanced knowledge of Linux operating systems, including performance tuning, troubleshooting, security, and networking fundamentals
  • Working knowledge and operational experience with distributed messaging queues, specifically Kafka
Job Responsibility:
  • Maintain and improve the reliability, scalability, and performance of our production systems, targeting a high-availability environment
  • Design, implement, and maintain automation solutions for infrastructure provisioning, deployment, configuration management, and monitoring using Terraform and Jenkins
  • Administer, manage, and optimize our cloud infrastructure primarily hosted on AWS, focusing on cost efficiency and secure operations
  • Develop and maintain infrastructure-as-code using Puppet and/or Ansible to ensure consistent and reproducible environments
  • Participate in on-call rotation, troubleshoot and resolve critical production incidents, and conduct comprehensive post-mortems to prevent recurrence
  • Apply strong Linux administration skills to manage, patch, and secure operating systems and underlying infrastructure
  • Manage and optimize distributed messaging systems, specifically Kafka, ensuring high throughput and data integrity
What we offer:
  • Competitive compensation, variable bonus and performance reward opportunities, and retirement programs
  • Medical Insurance
  • Generous, flexible time off
  • Paid holidays, “wellness” days and a company-wide end-of-year break
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing

Kafka DevOps

As the central solution provider for information and telecommunications systems ...
Location:
Spain, Málaga
Salary:
Not provided
REWE digital
Expiration Date:
Until further notice
Requirements:
  • Solid understanding of Kafka architecture (broker, topic, partition, replication, KRaft)
  • Experience installing, configuring, and scaling Kafka clusters
  • Knowledge of topic management, partitioning, and replication strategies
  • Familiarity with Kafka CLI tools like kafka-topics.sh, kafka-configs.sh, kafka-consumer-groups.sh
Job Responsibility:
  • Implement and document user requirements for our Kafka platform
  • Set up, operate, optimize, and monitor modern integration systems—primarily Apache Kafka
  • Work in an agile environment alongside product teams and stakeholders
  • Handle second- and third-level support and conduct performance analyses
  • Advise users on system usage and administration
  • Participate in on-call rotation
What we offer:
  • Hybrid work and flexible working time
  • Conditions for a private health insurance
  • Ticket Restaurant
  • Professional development opportunities: English/German courses available and further IT education/trainings
  • Birthday day off
  • 25 days paid vacation
Employment Type:
Full-time