We are seeking a Senior Infrastructure - Kafka Engineer to join a high-performing data engineering team supporting large-scale, event-driven data platforms. This role is ideal for a seasoned engineer with deep experience in Apache Kafka / Confluent Kafka, messaging platforms, SQL/NoSQL databases, and cloud infrastructure, who can lead engineering, operations, and automation efforts across complex enterprise environments. This is a 6-month contract-to-hire opportunity with a hybrid work model in Phoenix, AZ. The ideal candidate is a hands-on infrastructure engineer with strong experience designing resilient Kafka environments, building real-time data pipelines, and supporting production systems in fast-paced enterprise settings.
Job Responsibilities:
Administer, configure, and troubleshoot Kafka clusters across on-prem and cloud environments, including broker and cluster configuration, partitioning, and performance tuning
Design and implement scalable, highly available Kafka infrastructure, including disaster recovery and multi-environment strategies
Integrate Kafka with upstream and downstream systems using Kafka Connect and related connectors, including MQ, MongoDB, Oracle, SQL Server, PostgreSQL, and MySQL
Build and support real-time data pipelines using Kafka producers and streaming consumers such as Spark Streaming and Kafka Streams
Automate infrastructure provisioning and configuration across environments using Terraform and modern DevOps practices
Deploy and manage Kafka components and clients in production and disaster recovery environments, ensuring resilience and recoverability
Lead a small team of engineers and technicians in monitoring, diagnosis, and remediation of infrastructure issues
Implement and maintain comprehensive monitoring, logging, and alerting using tools such as Splunk, Datadog, and Grafana
Perform proactive health checks and capacity planning to identify and resolve issues before they impact service
Serve as a primary point of contact for daily operations, major incidents, and escalations related to Kafka and associated infrastructure
Develop, maintain, and continuously improve runbooks and playbooks for incident response, maintenance, and recurring operational tasks
Analyze support trends and incident patterns to reduce downtime and drive root-cause resolution
Ensure infrastructure and platform changes comply with internal standards, security policies, and applicable regulatory requirements
Partner with security, networking, application, and data engineering teams to design and operate secure, compliant, event-driven architectures
Contribute to standards, best practices, and technical documentation for Kafka, messaging, and integration patterns
Participate in agile ceremonies and help influence technical direction for streaming and integration platforms
Requirements:
7+ years of experience in infrastructure engineering with a strong focus on Kafka administration across on-prem and cloud environments; Kafka ecosystem components including brokers, topics, consumer groups, replication, and failover; messaging systems such as MQ; and SQL and NoSQL database integration
Proven experience designing, deploying, and scaling Kafka clusters and connector infrastructure in production and DR environments
Hands-on experience building real-time data pipelines using Kafka producers and streaming consumers such as Spark Streaming
Strong proficiency with at least one major cloud platform: AWS, GCP, or Azure
Experience with event-driven architectures, containerization, and DevOps practices
Experience with observability and monitoring tools such as Splunk, Datadog, and Grafana
Solid understanding of networking, Linux/Windows operating systems, and core diagnostic tools
Proficiency with source control tools such as SVN and Git
Scripting and programming experience with tools such as PowerShell, Bash, Python, or Perl
Demonstrated ability to analyze complex issues, make sound decisions with limited information, and drive issues through resolution
Strong communication, customer service, and collaboration skills with the ability to work effectively across cross-functional technical teams
Nice to have:
Experience with additional enterprise monitoring and infrastructure support tools
Experience working in highly regulated enterprise environments
Prior exposure to large-scale data engineering or integration platforms