As a Data Engineer on the Databricks platform at Booz Allen, you'll support mission-critical data initiatives for defense, intelligence, and civil clients. You'll help design, build, and maintain scalable data pipelines in modern cloud environments, enabling analytics, AI, ML, and decision advantage at scale. Join a team that accelerates national security outcomes through data engineering excellence. Work with cutting-edge tools like Databricks, Delta Lake, and PySpark to transform raw data into actionable insights for America's toughest challenges. Work with us to use data for good.
Requirements
2+ years of experience with the Databricks platform, including compute and resource management and data engineering capabilities
2+ years of experience programming in Python, PySpark, and SQL
Experience creating and enforcing cluster usage policies
Experience managing cluster-scoped or global init scripts and handling library dependencies such as PyPI, Maven, or CRAN
Experience with tiered namespaces such as catalog, schema, and table, and with configuring Role-Based Access Control (RBAC) for users and service principals
Knowledge of identity management, including managing users and groups on Databricks or in a cloud-native data environment
Ability to implement and enforce workspace policies and organizational standards
Ability to respond to support requests, such as login issues and user inquiries, in a timely manner
Secret clearance
HS diploma or GED
Nice to have
Experience with an Infrastructure as Code (IaC) platform such as Terraform to automate provisioning and configuration of workspaces
Experience using the Databricks CLI and REST APIs for bulk operations
Experience in regulated environments such as DoW or the Intelligence Community
Experience implementing data engineering concepts such as ETL development, medallion architecture, and pipelines
Experience tuning clusters and optimizing Spark SQL logic to improve performance and cost
Knowledge of Zero Trust Architecture (ZTA) principles
Knowledge of Spark execution
Databricks Certified Platform Administrator or Databricks Certified Data Engineer certification