We are looking for a Database Engineer to support the performance, stability, and scalability of a modern data platform in Irvine, California. This role focuses on building dependable data pipelines, improving platform operations, and ensuring data assets remain governed, accessible, and cost-efficient in a production environment. The ideal candidate brings strong hands-on experience with Databricks, Spark, Python, SQL, and SQL Server, along with the ability to troubleshoot issues across orchestration, storage, and source system connectivity.
Job Responsibilities
Oversee daily health and uptime of Databricks pipelines, job schedules, compute resources, Delta Lake datasets, and warehouse environments to maintain consistent platform performance
Design, build, and refine Spark-based data processing workflows in Databricks notebooks across bronze, silver, and gold layers, including incremental ingestion patterns using Auto Loader and streaming frameworks when needed
Support and enhance workflow orchestration through Databricks Workflows and Apache Airflow, with attention to data delivery timelines, reliability targets, and operational continuity
Administer data governance controls in Unity Catalog by managing schemas, permissions, lineage visibility, and catalog structures within a regulated environment
Maintain and optimize Microsoft SQL Server databases used as source and downstream data stores, improving query execution, indexing strategy, schema organization, and overall system efficiency
Track compute utilization and platform spend by implementing monitoring, alerting, and resource tuning practices that improve predictability and cost control
Investigate and resolve end-to-end production issues involving failed jobs, data validation concerns, evolving schemas, and connectivity or authentication problems with upstream systems
Contribute to shared operational ownership by creating documentation, runbooks, and support procedures that strengthen team coverage and reduce dependency on any one individual
Produce clear technical documentation for pipeline design, platform operations, and maintenance practices to support long-term scalability and team collaboration
Requirements
3+ years of practical data engineering experience supporting production-grade data platforms
Strong hands-on expertise with Databricks, including notebooks, jobs, workflows, cluster configuration, SQL warehouses, and Unity Catalog administration
Proven ability to develop and troubleshoot distributed data transformations using Python, PySpark, and SQL
Solid experience working with Microsoft SQL Server, including database tuning, indexing, schema management, and query performance analysis
Background in Apache Airflow for building, scheduling, and debugging DAG-driven workflows in live environments
Experience integrating data from SaaS or operational source systems through batch, bulk, or streaming APIs
Ability to monitor, support, and troubleshoot active data pipelines with a strong focus on reliability and issue resolution
Strong written and verbal communication skills, including the ability to create clear and maintainable technical documentation
What we offer
Medical, vision, dental, life, and disability insurance