Data Analytics Intermediate Engineer

Citi (https://www.citi.com/)

Location:
Irving, United States

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:
76,230.00 - 106,370.00 USD / year

Job Description:

The Data Analytics Intermediate Engineer is a hands-on technical contributor who designs, builds, and maintains scalable data pipelines and infrastructure within large enterprise data environments. This role supports various analytical and operational needs, collaborating with cross-functional teams and applying solid knowledge of big data technologies, programming, and data governance practices.

Job Responsibility:

  • design and implement data ingestion, transformation, and cleansing pipelines using PySpark, SQL, and Python/Java (a minimal PySpark sketch follows this list)
  • work on structured and unstructured datasets stored in HDFS, Hive, Parquet, or cloud-based storage
  • optimize existing data workflows and jobs for performance, scalability, and reliability
  • support batch and streaming data processing frameworks across Big Data platforms (e.g., Hadoop, Spark, Hive, Kafka)
  • integrate and process data from multiple sources including APIs, flat files, relational databases, and cloud-native services
  • apply data modeling, partitioning, and file format best practices for efficient storage and querying
  • implement monitoring, logging, and alerting for production pipelines and participate in on-call rotation if required
  • document pipeline logic, data lineage, and schema changes to ensure data transparency and auditability
  • collaborate with data analysts, data scientists, and product owners to translate business needs into scalable data solutions
  • assist in proof-of-concept efforts for new technologies and data integration strategies
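
By way of illustration for the first responsibility, here is a minimal PySpark sketch of an ingest-transform-cleanse-store flow. The paths, schema, and column names (raw_events, event_id, event_ts, amount) are hypothetical placeholders, not details taken from this posting.

    # Minimal PySpark sketch: ingest raw CSV, transform types, cleanse,
    # and store as partitioned Parquet. All names and paths are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("ingest-cleanse-demo").getOrCreate()

    # Ingest: read raw CSV from a (hypothetical) landing zone
    raw = spark.read.option("header", True).csv("hdfs:///landing/raw_events/")

    # Transform: cast types and derive a partition column
    events = (
        raw.withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("amount", F.col("amount").cast("double"))
           .withColumn("event_date", F.to_date("event_ts"))
    )

    # Cleanse: drop rows missing required fields, then de-duplicate on a key
    clean = events.dropna(subset=["event_ts", "amount"]).dropDuplicates(["event_id"])

    # Store: partitioned Parquet for efficient downstream querying
    clean.write.mode("overwrite").partitionBy("event_date").parquet("hdfs:///curated/events/")

Partitioning the Parquet output by a date column is one common way to satisfy the partitioning and file-format practices the list above calls for.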

Requirements:

  • 2–5 years of experience in a data engineering, ETL development, or big data role
  • strong programming experience in Python (or Java) for data manipulation and automation
  • advanced proficiency in SQL (window functions, joins, CTEs, optimization techniques); a short example follows this list
  • experience working with Apache Spark (PySpark) in a distributed environment
  • hands-on with Hadoop ecosystem tools (Hive, HDFS, Oozie, etc.)
  • familiarity with Git, Jenkins, Airflow, or other CI/CD and orchestration tools
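
As a pointer to the SQL proficiency listed above, here is a small example combining a CTE with a window function, written as Spark SQL so it stays within the PySpark stack this posting names. It assumes a transactions table or temporary view is already registered; all table and column names are illustrative.

    # Illustrative Spark SQL: a CTE plus a window function that keeps the
    # latest transaction per account. Table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    latest = spark.sql("""
        WITH ranked AS (
            SELECT account_id,
                   amount,
                   txn_ts,
                   ROW_NUMBER() OVER (
                       PARTITION BY account_id ORDER BY txn_ts DESC
                   ) AS rn
            FROM transactions
        )
        SELECT account_id, amount, txn_ts
        FROM ranked
        WHERE rn = 1  -- newest row per account
    """)
    latest.show()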

Nice to have:

  • exposure to cloud platforms (AWS Glue/EMR, Azure Data Factory, GCP Dataflow)
  • knowledge of basic ML workflows (feature engineering, model inputs/outputs); a minimal sketch follows below
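
For the ML-workflows item, here is a minimal feature-engineering sketch using pyspark.ml: raw numeric columns are assembled into the single vector column that Spark ML models take as input. The toy data and column names are assumptions for illustration only.

    # Minimal feature-engineering sketch with pyspark.ml: numeric columns
    # are assembled into one "features" vector. Toy data is hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(12.5, 3), (40.0, 1)], ["amount", "txn_count"])

    assembler = VectorAssembler(inputCols=["amount", "txn_count"], outputCol="features")
    assembler.transform(df).select("features").show()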

What we offer:
  • medical, dental & vision coverage
  • 401(k)
  • life, accident, and disability insurance
  • wellness programs
  • paid time off packages including planned time off (vacation), unplanned time off (sick leave), and paid holidays
  • discretionary and formulaic incentive and retention awards

Additional Information:

Job Posted:
April 29, 2025

Expiration:
May 05, 2025

Employment Type:
Full-time

Work Type:
Hybrid work