Cloud Big-data Engineer

PhasorSoft Group

Location:
United States, Starkville

Contract Type:
Not provided

Salary:

45.00 USD / Hour

Job Description:

An expert with 4-5 years of experience in the Hadoop ecosystem and cloud platforms (AWS/Azure), relational data stores, data integration techniques, XML, Python, Spark, and ETL techniques.

Requirements:

  • 4-5 years of experience in the Hadoop ecosystem and cloud (AWS/Azure)
  • Experience working with in-memory computing using R, Python, Spark, PySpark, Kafka, and Scala
  • Experience in parsing and shredding XML and JSON, shell scripting, and SQL
  • Experience working with Hadoop ecosystem - HDFS, Hive
  • Experience working with the AWS ecosystem - S3, EMR, EC2, Lambda, CloudFormation, CloudWatch, SNS/SQS
  • Experience with Azure – Azure Data Factory (ADF)
  • Experience working with SQL and NoSQL databases
  • Experience designing and developing data sourcing routines utilizing typical data quality functions involving standardization, transformation, rationalization, linking, and matching
  • Work Authorization: H1, GC, US Citizen

Additional Information:

Job Posted:
December 11, 2025

Work Type:
On-site work

Similar Jobs for Cloud Big-data Engineer

Data Engineer

Are you a Data Engineer based in Austin, Texas, who is inspired by working with ...
Location:
United States, Austin
Salary:
100000.00 - 135000.00 USD / Year
Beezwax Datatools
Expiration Date
Until further notice
Requirements:
  • 4-6 years of hands-on data modeling and data engineering experience
  • Strong expertise in dimensional modeling and data warehousing
  • Database design and development experience with relational or MPP databases such as Postgres/Oracle/Teradata/Vertica
  • Experience in design and development of custom ETL pipelines using SQL and scripting languages (Python/Shell/Golang)
  • Proficiency in advanced SQL and performance tuning
  • Hands-on experience with Big-Data platforms like Spark, Dremio, Hadoop, MapReduce, Hive, etc.
  • Experience with Java, Scala and Python
  • Experience with cloud computing platforms like AWS, Google Cloud
  • Experience working with APIs
  • Ability to learn and adapt to new tools and technologies
What we offer:
  • Competitive compensation
  • Retirement plan with employer matching
  • Excellent healthcare package with vision and dental
  • Support for productivity and continued learning in the forms of hardware, software, learning materials, training, and conferences
  • Full-time

Senior Data Engineer

Senior Data Engineer – Dublin (Hybrid) Contract Role | 3 Days Onsite. We are see...
Location:
Ireland, Dublin
Salary:
Not provided
Solas IT Recruitment
Expiration Date
Until further notice
Requirements:
  • 7+ years of experience as a Data Engineer working with distributed data systems
  • 4+ years of deep Snowflake experience, including performance tuning, SQL optimization, and data modelling
  • Strong hands-on experience with the Hadoop ecosystem: HDFS, Hive, Impala, Spark (PySpark preferred)
  • Oozie, Airflow, or similar orchestration tools
  • Proven expertise with PySpark, Spark SQL, and large-scale data processing patterns
  • Experience with Databricks and Delta Lake (or equivalent big-data platforms)
  • Strong programming background in Python, Scala, or Java
  • Experience with cloud services (AWS preferred): S3, Glue, EMR, Redshift, Lambda, Athena, etc.
Job Responsibility:
  • Build, enhance, and maintain large-scale ETL/ELT pipelines using Hadoop ecosystem tools including HDFS, Hive, Impala, and Oozie/Airflow
  • Develop distributed data processing solutions with PySpark, Spark SQL, Scala, or Python to support complex data transformations
  • Implement scalable and secure data ingestion frameworks to support both batch and streaming workloads
  • Work hands-on with Snowflake to design performant data models, optimize queries, and establish solid data governance practices
  • Collaborate on the migration and modernization of current big-data workloads to cloud-native platforms and Databricks
  • Tune Hadoop, Spark, and Snowflake systems for performance, storage efficiency, and reliability
  • Apply best practices in data modelling, partitioning strategies, and job orchestration for large datasets
  • Integrate metadata management, lineage tracking, and governance standards across the platform
  • Build automated validation frameworks to ensure accuracy, completeness, and reliability of data pipelines
  • Develop unit, integration, and end-to-end testing for ETL workflows using Python, Spark, and dbt testing where applicable

Senior Data Engineer

Microsoft Cloud Operations + Innovation (CO+I) is the engine that powers Microso...
Location:
United States, Redmond
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years’ experience in business analytics, data science, data modeling, or data engineering work
  • OR Master’s degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field and 3+ years’ experience in business analytics, data science, data modeling, or data engineering work
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • 8+ years of experience in data engineering with coding and debugging skills in C#, Python, and/or SQL
  • Deploying solutions in Azure Services & Managing Azure Subscriptions
  • Understanding and knowledge about big data and writing queries with Kusto/KQL
  • Understanding and knowledge about extracting data via REST APIs
  • Strong analytical skills with a systematic and structured approach to software design
  • 5+ years of experience in data science, analytics, or machine learning
  • 4+ years of experience in developing solutions with Microsoft Power Platform, including Power BI, Fabric, Power Automate & M365 Dataverse
Job Responsibility:
  • Apply modification techniques to transform raw data into compatible formats for downstream systems
  • Utilize software and computing tools to ensure data quality and completeness
  • Implement code to extract and validate raw data from upstream sources, ensuring accuracy and reliability
  • Write efficient, readable, extensible code from scratch that spans multiple features/solutions
  • Develop technical expertise in proper modeling, coding, and/or debugging techniques such as locating, isolating, and resolving errors and/or defects
  • Leverage technical proficiency in big-data software engineering concepts, such as the Hadoop ecosystem, Apache Spark, continuous integration and continuous delivery (CI/CD), Docker, Delta Lake, MLflow, AML, and representational state transfer (REST) application programming interface (API) consumption/development
  • Acquires data necessary for successful completion of the project plan
  • Proactively detects changes and communicates to senior leaders
  • Develops usable data sets for modeling purposes
  • Contributes to ethics and privacy policies related to collecting and preparing data by providing updates and suggestions around internal best practices
  • Full-time

Applications Development Senior Group Manager

This role will be part of the Risk Data team and is a senior management level po...
Location:
United Kingdom, London
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements:
  • Strong academic record, ideally with a Bachelor’s or Master’s degree in Computer Science or engineering or related technical discipline
  • Proven experience in enterprise application development with full stack technologies
  • Strong architecture and hands-on technical experience implementing large-volume, real-time, complex solutions on Big Data and public cloud platforms
  • Experience in Data architecture, strong Software development fundamentals, data structures, design patterns, object-oriented principles
  • Experience in design and delivery of multi-tiered applications and high performance server side components
  • Skills in system performance tuning, high performance, low latency, and multithreading, with Java server-side programming experience
  • Preferred: experience handling high volumes of data and working with in-memory databases and caching solutions
  • Experience of building and leading teams, ideally with a global resource profile and demonstrated ability to deliver large projects efficiently and on time
  • Significant experience in large Financial Services Technology services companies is expected for this position
  • Hands-on development, architecture and leadership experience in real-time data engineering platforms implementation
Job Responsibility:
  • Lead the efforts in Institutional Data Platform (ICG) that span multiple businesses, products and functions
  • Delivery of Price Risk related Data initiatives and Capital reporting (GSIB) related deliverables
  • Establish strong relationships with the global business stakeholders and ensure transparency of project deliveries
  • Actively identify and manage risks and issues, working with disparate teams to create mitigation plans and follow-through to resolution
  • Adhere to all key Project Management (PMQC) & Engineering Excellence standards
  • Ensure timely communications to Senior Technology Management and Business Partners in Front Office, Middle Office & other Operations functions
  • Drive the design and development of system architecture, work with end-users of the systems, and enhance the quality of deliverables
  • Ensure staff follow Citi documented policy and procedures, and maintain desktop procedures and supporting documentation for filings on a current basis and in a comprehensive manner
  • Ensure change is managed with appropriate controls, documentation, and approvals including implementation of new and revised regulatory reporting requirements
  • Manage and maintain all disaster recovery plans, oversee appropriate testing, and provide permit-to-operate for new applications
What we offer:
  • 27 days annual leave (plus bank holidays)
  • A discretionary annual performance-related bonus
  • Private Medical Care & Life Insurance
  • Employee Assistance Program
  • Pension Plan
  • Paid Parental Leave
  • Special discounts for employees, family, and friends
  • Access to an array of learning and development resources
  • Full-time

Principal Software Engineer

Principal Software Engineer role at Hewlett Packard Enterprise to design, develo...
Location:
United States, San Jose
Salary:
148000.00 - 340500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
  • 10+ years of experience in software engineering with a focus on Python, Go or Java
  • Strong understanding of RESTful API design and development
  • 2+ years of experience working with large-scale distributed systems based on either cloud technologies or Kubernetes
  • 2+ years of experience on event-driven technologies like Kafka and Apache Storm/Flink
  • 2+ years of experience in big-data technologies like Apache Spark/Databricks
  • Proficient in working with Redis and databases like Cassandra/DataStax
  • Must hold U.S. citizenship
Job Responsibility:
  • Design, develop, and test software related to the cloud-based network configuration and reporting system
  • Solve complex problems and design subsystems for the Mist platform
  • Develop software for highly scalable and fault-tolerant cloud-scale distributed applications
  • Develop microservices using Python, and/or Go (golang)
  • Develop event-driven systems using Python and Java
  • Develop software for AIDE's real-time data pipeline and batch processing
  • Develop ETL pipelines aiding in training and inference of various ML models using big-data frameworks like Apache Spark
  • Build metrics, monitoring and structured logging into the product
  • Write unit, integration and functional tests
  • Participate in collaborative, DevOps style, lean practices
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive benefits suite supporting physical, financial and emotional wellbeing
  • Full-time

Data Scientist 2

The Industry Solutions Delivery (ISD) Engineering & Architecture Group (EAG) is ...
Location:
India, Hyderabad
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's degree in Data Science, Computer Science, Engineering, Statistics, Operations Research, or a related field
  • 4 years of data science experience in a business context
  • Ability to work independently, solve complex data science problems, design and code maintainable and scalable solutions, and effectively apply data science to business challenges
  • Hands-on software engineering experience (e.g. Python, Scikit, PyTorch, C++) with main established data science frameworks
  • Familiarity with building and deploying large-scale AI solutions into production within a cloud environment
  • Experience dealing with internal and external stakeholders on large, complex projects
Job Responsibility:
  • Leverage data science and business domain knowledge to improve business performance, evaluate project plans, communicate business goals, and share insights with stakeholders
  • Acquire, prepare, and explore data through querying, visualisation, reporting techniques, and collaboration with other teams, ensuring data integrity
  • Apply machine learning and statistical analysis to develop models, train, optimise, and evaluate them, and communicate findings to stakeholders
  • Test, review, and improve models by analysing performance, incorporating feedback, and contributing to the review process
  • Write and debug efficient and scalable code while collaborating with engineering teams and integrate data models into customer systems
  • Understand big-data software engineering concepts, such as the Hadoop ecosystem, Apache Spark, CI/CD, Docker, Delta Lake, MLflow, AML, and REST API consumption/development
  • Demonstrate a strong commitment to Responsible AI, supporting customers, partners, and internal stakeholders in building trustworthy AI solutions
  • Full-time

Senior C++ Developer

VarSome.com is the world’s leading website for professional human genetics. VarS...
Location:
Greece, Athens
Salary:
Not provided
Saphetor
Expiration Date
Until further notice
Requirements:
  • At least 4 years of commercial C++ development experience, using templates, STL containers, smart pointers, memory management and multithreading
  • BSc and/or MSc degree in Computer Science, Engineering or Mathematics
  • Hands-on experience with relational databases, optimizing queries and table schemas for optimal performance
  • Good experience with distributed cloud-based computing and big-data systems
  • Familiarity with agile methodologies and the ability to adapt to a fast-paced development environment, actively taking part in code reviews, scrums, technical discussions, etc.
  • Very good understanding of fundamental application design principles and object-oriented design, in order to build & maintain a large high-quality code base
  • Excellent written & oral communication skills in English
Job Responsibility:
  • Develop our core C++ applications and library for importing and analyzing genetic data
  • Write reusable, testable, and efficient code, including unit & regression tests
  • Take complete ownership of projects (ranging from a few days to a month) to deliver a working end-to-end implementation, including unit tests & testing
  • Optimize & architect our platform for maximum speed, high availability and scalability
  • Maintain & improve our internal high-performance clinical annotation tools and the custom databases built, optimized for genetics
  • Contribute to the documentation of software architecture, design and implementation details
What we offer:
  • A position in a fascinating healthcare growth domain, at the cutting edge of technology and research
  • A competitive compensation package combined with additional benefits
  • Endless learning opportunities, while transferring new technologies from academics to clinical practice all over the world
  • Full-time

Informatica Cloud Data Governance Catalog Specialist

We are looking for an experienced Informatica Cloud Data Governance Catalog Spec...
Location:
United States, Marysville
Salary:
Not provided
Robert Half
Expiration Date
Until further notice
Requirements:
  • Proficiency with tools like Informatica Cloud Data Governance Catalog and Cloud Data Quality
  • Hands-on experience in data modeling, metadata management, and large-scale data analysis
  • Familiarity with Collibra, Alation, and Glue Data Catalog
  • Strong understanding of entity-relationship modeling and data security practices
  • Expertise in business intelligence technologies such as Power BI and Tableau
  • Exceptional communication and presentation skills to effectively convey technical concepts
  • Analytical mindset with proven problem-solving abilities
  • Ability to work collaboratively as part of a team and build strong relationships with stakeholders
Job Responsibility:
  • Create catalog quality reports to monitor and enhance data governance metrics across domains and sub-domains
  • Develop and showcase data governance dashboards tailored to different user roles, including Data Owners, Stewards, Engineers, and Privacy Officers
  • Collaborate with business and IT teams, including data stewards, catalog architects, and platform owners, to implement governance solutions
  • Execute profiling, sampling, and scanner setups using Informatica tools to ensure data quality
  • Apply expertise in metadata management, data modeling, and large-scale data analysis to support governance initiatives
  • Design and implement both traditional relational and modern big-data architectures based on organizational requirements
  • Utilize business intelligence tools such as Power BI and Tableau to create actionable insights and reports
  • Define compliance procedures and produce audit reports to meet regulatory requirements
  • Establish and support governance councils and operational frameworks using data catalog tools
  • Facilitate metadata ingestion and ensure adherence to data security and quality standards
What we offer:
  • Medical, vision, dental, and life and disability insurance
  • Eligibility to enroll in our company 401(k) plan