PepsiCo operates in an environment undergoing immense and rapid change. Big data and digital technologies are driving business transformations that unlock new capabilities and innovations in areas like eCommerce, mobile experiences, and IoT. The key to winning in these areas is leveraging enterprise data foundations built on PepsiCo’s global business scale to enable business insights, advanced analytics, and new product development. PepsiCo’s Data Management and Operations team is responsible for developing quality data collection processes, maintaining the integrity of our data foundations, and giving business leaders and data scientists across the company rapid access to the data they need for decision-making and innovation.
Job Responsibilities:
Own data pipeline development end-to-end, spanning data modeling, testing, scalability, operability, and ongoing metrics
Ensure that we build high-quality software by reviewing peer code check-ins
Define best practices for product development, engineering, and coding as part of a world-class engineering team
Collaborate in architecture discussions and decision-making as part of continually improving and expanding these platforms
Lead feature development in collaboration with other engineers: validate requirements/stories, assess current system capabilities, and decompose feature requirements into engineering tasks
Focus on delivering high-quality data pipelines and tools through careful analysis of system capabilities and feature requests, peer reviews, test automation, and collaboration with other engineers
Develop software in short iterations to quickly add business value
Introduce new tools and practices to improve data and code quality; this includes researching and sourcing third-party tools and libraries, as well as developing in-house tooling to improve workflow and quality for all data engineers
Support data pipelines developed by your team through good exception handling, monitoring, and, when needed, debugging production issues (a brief sketch of this pattern follows)
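As a minimal, hypothetical sketch of the exception handling and monitoring called for above (the function names, retry policy, and sample data are illustrative assumptions, not a prescribed PepsiCo pattern), a pipeline step might be wrapped like this:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

def run_with_retries(step, max_attempts=3, backoff_seconds=5):
    """Run a pipeline step, logging every failure and retrying with linear backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = step()
            logger.info("step succeeded on attempt %d", attempt)
            return result
        except Exception:
            logger.exception("step failed on attempt %d/%d", attempt, max_attempts)
            if attempt == max_attempts:
                raise  # surface the failure so the orchestrator can alert on it
            time.sleep(backoff_seconds * attempt)

def extract_orders():
    # hypothetical extraction step; a real one would query a source system
    return [{"order_id": 1, "amount": 9.99}]

rows = run_with_retries(extract_orders)
```

Structured logs like these are what monitoring and alerting hook into when a production pipeline needs debugging.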
Requirements:
4+ years of overall technology experience, including at least 3 years of hands-on work in software development, data engineering, and systems architecture
3+ years of experience in SQL optimization and performance tuning
Experience with data modeling, data warehousing, and building high-volume ETL/ELT pipelines
Experience building and operating highly available, distributed systems for extracting, ingesting, and processing large data sets
Experience with data profiling and data quality tools like Apache Griffin, Deequ, or Great Expectations (an illustrative quality-check sketch follows this list)
Current skills in the following technologies:
Programming languages: Python
Orchestration platforms: Airflow, Luigi, Databricks, or similar (an illustrative DAG sketch also follows this list)
Relational databases: Postgres, MySQL, or equivalents
MPP data systems: Snowflake, Redshift, Synapse, or similar
Cloud platforms: AWS, Azure, or similar
Version control (e.g., Git/GitHub) and familiarity with deployment and CI/CD tools
Fluency with Agile processes and tools such as Jira or Pivotal Tracker
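The data quality tools named above express checks declaratively; as a rough, tool-agnostic sketch of the kind of expectations they formalize (the table and column names here are hypothetical), a hand-rolled profile-and-check might look like:

```python
import pandas as pd

# hypothetical batch of warehouse rows; in practice this comes from a pipeline stage
orders = pd.DataFrame({"order_id": [1, 2, 3], "amount": [9.99, 5.00, 12.50]})

def profile_and_check(df: pd.DataFrame) -> dict:
    """Tiny stand-in for checks that Deequ or Great Expectations express declaratively."""
    return {
        "order_id_not_null": bool(df["order_id"].notna().all()),
        "order_id_unique": bool(df["order_id"].is_unique),
        "amount_non_negative": bool((df["amount"] >= 0).all()),
    }

results = profile_and_check(orders)
failed = [name for name, ok in results.items() if not ok]
if failed:
    raise ValueError(f"data quality checks failed: {failed}")
```

Dedicated tools add what this sketch lacks: expectation suites stored as config, profiling reports, and integration with orchestration.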
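Likewise, for the orchestration requirement, a minimal daily ELT DAG (assuming a recent Airflow 2.x; the DAG id, task bodies, and schedule are illustrative assumptions) might be sketched as:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # hypothetical: pull raw rows from a source system
    ...

def transform():
    # hypothetical: clean and model the rows
    ...

def load():
    # hypothetical: write modeled rows to the warehouse (e.g., Snowflake or Redshift)
    ...

with DAG(
    dag_id="orders_elt",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> transform_task >> load_task
```

The same extract-transform-load shape maps onto Luigi tasks or Databricks jobs with different APIs.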
Nice to have:
Experience running and scaling applications on cloud infrastructure and containerized services like Kubernetes
Understanding of metadata management, data lineage, and data glossaries
What we offer:
A business development incentive and equity may be awarded based on eligibility and performance
Paid time off, subject to eligibility, including paid parental leave, vacation, sick leave, and bereavement leave
Medical, Dental, Vision, and Disability coverage; Health and Dependent Care Reimbursement Accounts; Employee Assistance Program (EAP); Accident, Group Legal, and Life insurance; Defined Contribution Retirement Plan