Python PySpark Engineer

Poland, Wroclaw 100.00 - 160.00 PLN / Hour · Job Posted January 29, 2026

Job Responsibility

You will play a key role in migrating and building ETL/ELT processes in the Client’s Palantir Foundry infrastructure under the Data Sphere Program, establishing Foundry as the primary Data Lake platform for the Healthcare Commercial

Requirements

  • Excellent knowledge of PySpark / Python
  • Great knowledge of ETL/ELT processes
  • Experience working with data lake systems (preferably Palantir Foundry) for data ingestion
  • Experience creating documentation on the Confluence platform
  • Ability to use ticketing systems such as JIRA and/or Azure DevOps
  • Familiarity with Snowflake infrastructure is an advantage
  • Ability to work in an agile BI team (DevOps) and to share skills and experience
  • Fluency in English

Nice to have

Data Lake - Palantir Foundry

Similar Jobs for Python PySpark Engineer

8 matching positions

Senior Data Software Engineer (Python & PySpark) - Vice President

The Senior Data Software Engineer is a senior level position responsible for est...
Location: Singapore, Singapore
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field
  • 7+ years of experience in data engineering, with a strong focus on Python and big data technologies
  • Proven expertise in designing and implementing large-scale data processing solutions using PySpark
  • Extensive experience with distributed computing frameworks like Apache Spark
  • Strong understanding of data warehousing concepts, dimensional modeling, and ETL/ELT principles
  • Proficiency in SQL and experience with various relational and NoSQL databases
  • Experience with cloud platforms (AWS, Azure, GCP) and their data services (e.g., S3, ADLS, Google Cloud Storage, Redshift, Snowflake, BigQuery, Databricks)
  • Familiarity with workflow orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Step Functions)
  • Experience with version control systems (e.g., Git)
  • Excellent problem-solving, analytical, and communication skills.
Job Responsibility
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
  • Fulltime

Senior Python Pyspark Engineer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 8 - 10 years of relevant experience
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Programming Languages: Python, PySpark
  • Data Lake Table Format: Apache Iceberg
  • Data Orchestration: Apache Airflow
  • Data Visualization: Tableau
  • Big Data Processing: Apache Spark
Job Responsibility
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision.
  • Can exercise independence of judgement and autonomy.
  • Acts as SME to senior stakeholders and/or other team members.
  • Fulltime

Data Engineer - Python, AI

We are looking for a mid-level Python Developer with combined experience in Data...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 10–12 years of hands-on Python programming experience
  • Strong fundamentals in Python, OOP, and design patterns
  • Experience with NLP libraries such as Flair, BERT, HuggingFace Transformers, or similar
  • Solid experience with PySpark, Pandas, PyArrow, and distributed data pipelines
  • Experience building APIs using Flask (FastAPI is a plus)
  • Experience with MLflow for model tracking and deployment
  • Good understanding of CI/CD practices and Git workflows
  • Experience working with Redis or similar in-memory stores
  • Experience with Autosys JILs for job scheduling
  • Comfortable with Linux command line and shell scripting
Job Responsibility
  • Develop and optimize ETL/data processing jobs using PySpark, Pandas, PyArrow, and related libraries
  • Build and maintain NLP pipelines using Flair, BERT, and LLM-based models
  • Develop scalable ingestion and data transformation pipelines for AI and analytics use cases
  • Build and maintain Flask-based APIs for model inference and service integrations
  • Use regular expressions for text cleaning, parsing, and NLP preprocessing
  • Integrate caching and fast lookups using Redis
  • Manage and deploy ML models using MLflow for tracking and versioning
  • Support CI/CD workflows using GitHub, LightSpeed Enterprise, and deployment pipelines
  • Create and maintain Autosys JILs for job scheduling and automation
  • Use basic Linux commands for troubleshooting, operations, and deployment tasks
  • Fulltime

Senior Python Engineer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 8+ Years of Experience in Python Development
  • Experience working with high-volume data pipeline development
  • Experience working with PyIceberg, PySpark and Polars data frames
  • Fully hands-on development
  • Needs to be able to participate in design and architecture discussions
  • Should be able to work with distributed geographical teams
  • Bachelor's degree/University degree or equivalent experience
Job Responsibility
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
  • Fulltime

Pyspark Engineer

Job Title: Pyspark Engineer Location: Irving, TX (ONSITE) F2F Interview Full Ti...
Location: United States, Irving
Salary: 135000.00 USD / Year
Company: Realign (realign-llc.com)
Expiration Date: Until further notice
Requirements
  • PySpark Developer with 5-10 years’ experience in data engineering practice
  • Strong experience in Apache Spark framework including good understanding of core concepts, performance optimization and industry best practices
  • Proficient in PySpark with hands-on coding experience and ability to implement complex business level transformations
  • Familiarity with unit testing, object-oriented programming (OOP) concepts and interpreting test results
  • Proficient in writing complex and efficient SQL queries to extract business-critical insights from large-scale data
  • Experience scheduling transformation jobs according to business requirements
  • Perform root-cause analysis and troubleshoot errors on data pipelines, evaluating data quality issues, and implementing corrective fixes
Job Responsibility
  • Designing, developing and maintaining scalable data pipelines
  • Optimizing data workflows and ensuring the integrity and availability of data for business intelligence
  • Collaborate with stakeholders and analysts to understand data requirements and deliver robust, creative and innovative solutions
  • Fulltime

Python PySpark Developer

Key Responsibilities: • Designing and developing robust PySpark applications for...
Location: India, Bangalore
Salary: Not provided
Company: Overture Rede (overturerede.in)
Expiration Date: Until further notice
Requirements
  • Bachelor’s/Master’s degree in Computer Science or Engineering
  • Proven experience (3-10 years) in Python development with a focus on PySpark
  • Strong understanding of distributed computing principles and experience with Apache Spark
  • Proficiency in SQL and experience with relational databases (MySQL, PostgreSQL, etc.)
  • Experience with data serialization formats such as JSON, Parquet, Avro
  • Excellent problem-solving skills and ability to work independently or as part of a team
  • Good communication skills with the ability to effectively collaborate with stakeholders
Job Responsibility
  • Designing and developing robust PySpark applications for large-scale data processing
  • Building and optimizing data ingestion, transformation, and storage processes
  • Implementing efficient algorithms and data structures for distributed computing
  • Collaborating with cross-functional teams to integrate data-driven solutions into business processes
  • Troubleshooting performance bottlenecks and ensuring high availability and reliability of data pipelines
  • Writing and optimizing SQL queries for data extraction and manipulation
  • Fulltime

Hadoop PySpark, Python, Apache Kafka

Role: Hadoop PySpark, Python, Apache Kafka. FTE only. Architectural Leadership, ...
Location: United States, Charlotte, NC / New York, NY / Dallas, TX / Jersey City, NJ
Salary: 160000.00 USD / Year
Company: Realign (realign-llc.com)
Expiration Date: Until further notice
Requirements
  • Minimum 9 years of experience in software development
  • Strong experience with Hadoop ecosystem (HDFS, Hive, Spark)
  • Proficiency in PySpark for distributed data processing
  • Advanced programming skills in Python
  • Hands-on experience with Apache Kafka for real-time streaming
  • Frontend development using Angular (TypeScript, HTML, CSS)
  • Expertise in designing scalable, secure, and high-performance systems
  • Familiarity with microservices, API design, and cloud-native architectures
  • Knowledge of CI/CD pipelines, containerization (Docker/Kubernetes)
  • Exposure to cloud platforms (AWS, Azure, GCP)
Job Responsibility
  • Define end-to-end architecture for data platforms, streaming systems, and web applications
  • Ensure alignment with enterprise standards, security, and compliance requirements
  • Evaluate emerging technologies and recommend adoption strategies
  • Design and implement data ingestion, transformation, and processing pipelines using Hadoop, PySpark, and related tools
  • Optimize ETL workflows for large-scale datasets and real-time streaming
  • Integrate Apache Kafka for event-driven architectures and messaging
  • Build and maintain backend services using Python and microservices architecture
  • Develop responsive, dynamic front-end applications using Angular
  • Implement RESTful APIs and ensure seamless integration between components
  • Work closely with product owners, business analysts, and DevOps teams
  • Fulltime

Hadoop PySpark, Python, Apache Kafka

Architectural Leadership: Define end-to-end architecture for data platforms, str...
Location: United States, Charlotte, NC / New York, NY / Dallas, TX / Jersey City, NJ
Salary: 160000.00 USD / Year
Company: Realign (realign-llc.com)
Expiration Date: Until further notice
Requirements
  • Minimum 9 years of experience
  • Strong experience with Hadoop ecosystem (HDFS, Hive, Spark)
  • Proficiency in PySpark for distributed data processing
  • Advanced programming skills in Python
  • Hands-on experience with Apache Kafka for real-time streaming
  • Frontend development using Angular (TypeScript, HTML, CSS)
  • Expertise in designing scalable, secure, and high-performance systems
  • Familiarity with microservices, API design, and cloud-native architectures
  • Knowledge of CI/CD pipelines, containerization (Docker/Kubernetes)
  • Exposure to cloud platforms (AWS, Azure, GCP)
Job Responsibility
  • Define end-to-end architecture for data platforms, streaming systems, and web applications
  • Ensure alignment with enterprise standards, security, and compliance requirements
  • Evaluate emerging technologies and recommend adoption strategies
  • Design and implement data ingestion, transformation, and processing pipelines using Hadoop, PySpark, and related tools
  • Optimize ETL workflows for large-scale datasets and real-time streaming
  • Integrate Apache Kafka for event-driven architectures and messaging
  • Build and maintain backend services using Python and microservices architecture
  • Develop responsive, dynamic front-end applications using Angular
  • Implement RESTful APIs and ensure seamless integration between components
  • Work closely with product owners, business analysts, and DevOps teams
  • Fulltime