Lead Data Engineer

Senior Data & AI/ML Engineer - GCP Specialization Lead

We are on a bold mission to create the best software services offering in the wo...

Location

United States, Menlo Park

Salary:

Not provided

techjays

Expiration Date

Until further notice

Requirements

GCP Services: BigQuery, Dataflow, Pub/Sub, Vertex AI
ML Engineering: End-to-end ML pipelines using Vertex AI / Kubeflow
Programming: Python & SQL
MLOps: CI/CD for ML, Model deployment & monitoring
Infrastructure-as-Code: Terraform
Data Engineering: ETL/ELT, real-time & batch pipelines
AI/ML Tools: TensorFlow, scikit-learn, XGBoost
Min Experience: 10+ Years

Job Responsibility

Design and implement data architectures for real-time and batch pipelines, leveraging GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Vertex AI, and Cloud Storage
Lead the development of ML pipelines, from feature engineering to model training and deployment using Vertex AI, AI Platform, and Kubeflow Pipelines
Collaborate with data scientists to operationalize ML models and support MLOps practices using Cloud Functions, CI/CD, and Model Registry
Define and implement data governance, lineage, monitoring, and quality frameworks
Build and document GCP-native solutions and architectures that can be used for case studies and specialization submissions
Lead client-facing PoCs or MVPs to showcase AI/ML capabilities using GCP
Contribute to building repeatable solution accelerators in Data & AI/ML
Work with the leadership team to align with Google Cloud Partner Program metrics
Mentor engineers and data scientists toward achieving GCP certifications, especially in Data Engineering and Machine Learning
Organize and lead internal GCP AI/ML enablement sessions

What we offer

Best in class packages
Paid holidays and flexible paid time away
Casual dress code & flexible working environment
Medical Insurance covering self & family up to 4 lakhs per person

Within COO Technology, Wells Fargo is seeking a Lead Data Engineer to help shape...

Location

United States, Iselin; Charlotte; Irving

Salary:

Not provided

Wells Fargo

Expiration Date

June 08, 2026

Requirements

5+ years of Database Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years of data management experience within Public Cloud (GCP, AWS, Azure)
5+ years of hands-on experience with Python or Java, plus Spark SQL, for building data pipelines, libraries, and automation tooling
5+ years with orchestration tools (Cloud Composer/Airflow) and CI/CD (Cloud Build, Git‑based workflows) for data workloads

Job Responsibility

Design and implement scalable, secure data platforms on Google Cloud using managed services (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Composer).
Build reusable frameworks and tooling (ingestion, transformation, quality, orchestration) that can be adopted by multiple product and domain teams.
Enable self‑service data consumption and governance by standardizing patterns, templates, and platform capabilities rather than one‑off pipelines.
Design logical and physical data platform architectures leveraging BigQuery, Dataflow/Apache Beam, Dataproc/Spark, Pub/Sub, and Cloud Storage.
Define and implement standardized ingestion, transformation, and serving patterns (batch and streaming) as reusable blueprints.
Optimize cost, performance, and reliability of GCP data workloads (partitioning, clustering, storage classes, autoscaling strategies).
Build opinionated data ingestion frameworks (e.g., config‑driven pipelines, connectors, schema handling, error handling) on top of Dataflow, Dataproc, or Composer.
Develop shared transformation libraries in Python/SQL/Beam (e.g., common SCD patterns, data quality checks, masking/tokenization routines).
Provide orchestration capabilities via Cloud Composer or Cloud Workflows with reusable DAGs/templates and CI/CD integration.
Implement robust data modeling (dimensional, data vault, or canonical models) and semantic layers in BigQuery and related tools.

What we offer

Health benefits
401(k) Plan
Paid time off
Disability benefits
Life insurance, critical illness insurance, and accident insurance
Parental leave
Critical caregiving leave
Discounts and savings
Commuter benefits
Tuition reimbursement

Fulltime

Sr Data Platform Lead

At Amgen, if you feel like you’re part of something bigger, it’s because you are...

Location

India, Hyderabad

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Master's degree or Bachelor's degree in a computer science or engineering field and 8 to 13 years of relevant experience
Strong hands‑on experience with various capabilities of Databricks, from Compute to Storage and from Unity Catalog to Data Engineering to BI and AI/ML capabilities, with a focus on governance and enterprise enablement
Proven hands‑on experience with cloud platforms, with strong preference for AWS (experience with Azure or GCP also acceptable)
Experience leading Data Quality platform initiatives (e.g., Ataccama, Monte Carlo), including tool evaluation, implementation, enterprise-wide adoption, and integration with enterprise DQ solutions
Experience owning and managing Databricks platform environments, including workspace architecture, environment strategy (dev/test/prod), and lifecycle management at scale
Proven ability to establish and enforce platform standards and operating models, including cluster policies, cost management, and workload orchestration frameworks
Strong focus on platform enablement and developer experience, including building reusable frameworks, defining best practices, and supporting engineering teams in adopting the platform effectively
Exposure to AI/ML capabilities on Databricks, including enabling AI‑driven features or accelerating adoption of AI‑assisted engineering practices
Solid knowledge of SQL and relational / dimensional data modelling, sufficient to support platform integrations, governance, and observability use cases
Experience working with core AWS services such as EKS, EC2, S3, Lambda, Glue, EMR, RDS, and Redshift/Spectrum, particularly in platform or shared‑services contexts

Job Responsibility

Act as a platform lead for delivery of data platform capabilities that enable next-gen data platform architecture, with a strong focus on Databricks platform and DQ platform features and services
Evaluate and enable Databricks platform capabilities through technical assessments and proof‑of‑concepts (PoCs), ensuring alignment with next-gen data platform architectural patterns and enterprise standards
Design, build, and productionize reusable platform frameworks, accelerators, and reference implementations that can be leveraged by next-gen data platform delivery teams (excluding ownership of data pipeline architecture or implementation)
Enable data governance, metadata layer, and data bundle capabilities by designing and implementing platform‑level integrations between Databricks and Collibra, Amgen’s enterprise data governance platform
Build platform‑level tooling and automation to support proactive governance, cost optimization, and best‑practice enforcement across Databricks and related data platform services
Define and enable platform observability capabilities, including KPIs, metrics, and telemetry for monitoring performance, usage, reliability, and cost of Databricks services
Identify and implement governed self‑service platform capabilities for data engineers through a self-service portal, using Python‑based microservices deployed on Docker and Kubernetes
Lead user enablement and adoption initiatives, including onboarding content, guided learning experiences, workshops, and best‑practice sharing for the Databricks user community
Drive engineering excellence and adoption of AI across platform capabilities and solutions built, promoting modern engineering practices, automation, and responsible use of AI‑driven features
Enable key business programs and strategic initiatives by translating initiative‑driven requirements into scalable, reusable data platform capabilities, in alignment with next-gen data platform principles

Fulltime