Senior Data Engineer

Citi
https://www.citi.com/

Location:
India, Chennai

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:
Not provided

Job Description:

The Senior Data Engineer will be responsible for the architecture, design, development, and maintenance of data platforms, focusing on Python and PySpark for data processing and transformation. This role demands a strong technical leader with deep data engineering expertise and the ability to collaborate effectively across teams.

Job Responsibility:

  • Design, develop, and optimize data architectures, pipelines, and data models
  • Build, test, and deploy scalable ETL/ELT processes using Python and PySpark (see the sketch after this list)
  • Implement practices for data quality, governance, and security
  • Monitor, troubleshoot, and optimize data pipeline performance
  • Collaborate with DevOps and MLOps teams to manage data infrastructure
  • Provide guidance and mentorship to junior data engineers
  • Work closely with stakeholders to understand data requirements
  • Research and evaluate new data technologies, tools, and methodologies
  • Create and maintain comprehensive documentation for data pipelines and models
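
For context, the ETL/ELT responsibility above typically maps to PySpark jobs along the lines of the minimal sketch below; the input path, column names, and aggregation are illustrative assumptions, not details from this posting.

```python
# Illustrative sketch only: paths, column names, and logic are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-transactions-etl").getOrCreate()

# Extract: read raw CSV data (schema handling kept simple for the sketch).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/transactions/")

# Transform: basic cleansing, typing, and a daily aggregate.
cleaned = (
    raw.dropDuplicates(["transaction_id"])
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("txn_date", F.to_date("txn_timestamp"))
       .filter(F.col("amount").isNotNull())
)

daily_totals = cleaned.groupBy("txn_date", "account_id").agg(
    F.sum("amount").alias("total_amount"),
    F.count("*").alias("txn_count"),
)

# Load: write partitioned Parquet for downstream consumers.
daily_totals.write.mode("overwrite").partitionBy("txn_date").parquet(
    "s3://example-bucket/curated/daily_totals/"
)

spark.stop()
```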

Requirements:

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related quantitative field
  • 5+ years of professional experience in data engineering
  • Extensive hands-on experience with Python for data engineering tasks
  • Proven experience with PySpark for big data processing and transformation
  • Proven experience with cloud data platforms (e.g., AWS Redshift, S3, EMR, Glue; Azure Data Lake, Databricks, Synapse; Google BigQuery, Dataflow)
  • Strong experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra)
  • Extensive experience with distributed data processing frameworks, especially Apache Spark
  • Expert proficiency in Python is mandatory
  • Strong SQL mastery is essential
  • Familiarity with Scala or Java is a plus
  • In-depth knowledge of data warehousing concepts, dimensional modeling, and ETL/ELT processes (a short example follows this list)
  • Hands-on experience with at least one major cloud provider (AWS, Azure, GCP) and their data services
  • Familiarity with Docker and Kubernetes is a plus
  • Proficient with Git and CI/CD pipelines
  • Excellent problem-solving and analytical abilities
  • Strong communication and interpersonal skills
  • Ability to work effectively in a fast-paced, agile environment
  • Proactive and self-motivated with a strong sense of ownership
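
As a short illustration of the dimensional-modeling and SQL skills listed above, the sketch below rolls up a toy star schema through spark.sql; every table and column name here is hypothetical.

```python
# Illustrative sketch only: a tiny star schema (one fact, one dimension)
# queried with Spark SQL. All names are hypothetical, not from the posting.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star-schema-example").getOrCreate()

# Toy fact and dimension tables registered as temp views.
spark.createDataFrame(
    [(1, 101, "2024-01-15", 250.0), (2, 102, "2024-02-03", 400.0)],
    ["sale_id", "customer_key", "order_date", "net_amount"],
).createOrReplaceTempView("fact_sales")

spark.createDataFrame(
    [(101, "EMEA"), (102, "APAC")],
    ["customer_key", "region"],
).createOrReplaceTempView("dim_customer")

# Classic star-schema rollup: join the fact to a dimension and aggregate.
spark.sql("""
    SELECT d.region,
           date_trunc('month', to_date(f.order_date)) AS order_month,
           SUM(f.net_amount)                          AS revenue
    FROM fact_sales f
    JOIN dim_customer d ON f.customer_key = d.customer_key
    GROUP BY d.region, date_trunc('month', to_date(f.order_date))
    ORDER BY order_month, d.region
""").show()

spark.stop()
```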

Nice to have:

  • Experience with real-time data streaming and processing using PySpark Structured Streaming (a minimal sketch follows this list)
  • Knowledge of machine learning concepts and MLOps practices
  • Familiarity with data visualization tools (e.g., Tableau, Power BI)
  • Contributions to open-source data projects
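
The Structured Streaming item above refers to PySpark's streaming API; a minimal sketch follows, with the Kafka broker, topic, and checkpoint location as hypothetical placeholders (the Kafka source also assumes the Spark-Kafka connector package is available).

```python
# Illustrative sketch only: broker, topic, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("streaming-events-sketch").getOrCreate()

# Read a stream of events from Kafka, keeping the raw value as a string.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .selectExpr("CAST(value AS STRING) AS json_value", "timestamp")
)

# Simple stateful transformation: count events per 5-minute window,
# with a watermark to bound late data.
counts = (
    events.withWatermark("timestamp", "10 minutes")
          .groupBy(F.window("timestamp", "5 minutes"))
          .count()
)

# Write the running aggregates to the console; awaitTermination blocks.
query = (
    counts.writeStream.outputMode("update")
          .format("console")
          .option("checkpointLocation", "/tmp/checkpoints/events")
          .start()
)
query.awaitTermination()
```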

What we offer:

  • Global workforce benefits
  • Support for well-being, career growth, and work-life balance

Additional Information:

Job Posted:
November 12, 2025

Employment Type:
Full-time

Work Type:
Hybrid work