The Data Engineer is responsible for building and maintaining the technical backbone that powers PublicRelay’s analytics, measurement, and narrative intelligence products. You will design and operate robust data pipelines, analytical systems, and derived metrics that enable our Insights & Analytics teams to deliver sophisticated, business-aligned analytics at scale for global communications and reputation leaders. You will work at the intersection of data engineering, applied data science, and AI/agentic tooling, with a mandate to ensure our data is clean, reliable, and production-ready while continually pushing the frontier of what our analytics stack can do. This role is approximately 70% engineering and systems development and 30% applied analytics and experimentation in partnership with Insights teams.
Job Responsibilities:
Design, build, and maintain end-to-end data pipelines for media, reputation, and stakeholder datasets, from ingestion and scraping through preprocessing, normalization, and storage
Implement and enforce data hygiene standards so analytical datasets are cleanly integrated, well-documented, and easily accessible across the Insights & Analytics team
Operationalize and productionize advanced analytics workflows in Python and SQL, including feature engineering, model scoring, and metric computation
Develop and maintain derived metrics and indices that can be reused across clients and products
Architect and monitor analytics systems, ensuring accuracy, reliability, and performance at scale
Integrate third-party analytics tools and media data sources (e.g., social media APIs, alternative datasets) into the PublicRelay analytics stack
Implement rigorous logging, monitoring, and alerting for data pipelines and analytics services to catch issues early and minimize downtime
Apply core statistics and data science methods (regression, classification, clustering, time-series analysis, sampling, A/B testing) to support new metrics, models, and analytics features
Build and maintain ML workflows (training, evaluation, deployment) for tasks such as NLP, sentiment analysis, topic modeling, classification, and entity-level analytics
Design and implement AI- and LLM-powered agents to automate repetitive analytics tasks, data enrichment, tagging, and insight surfacing across large-scale media datasets
Experiment with agentic workflows (e.g., orchestrating multi-step pipelines, tool-using agents, retrieval-augmented systems) to increase speed, reliability, and sophistication of analytics outputs
Collaborate with data scientists and insights strategists to translate experimental models and prototypes into robust, production-grade systems
Build and optimize Tableau-ready data models that power client-facing dashboards and internal analytics tools
Ensure datasets are structured, documented, and performant for self-serve analysis in Tableau and SQL by non-engineering stakeholders
Partner with Visualization and Reporting teams to maintain consistent data definitions, metric logic, and calculation standards across dashboards and reports
Contribute to internal templates and component libraries (data sources, calculated fields, parameter patterns) that speed up dashboard development and maintain consistency
Master PublicRelay’s proprietary platforms and data schemas to design systems that fit seamlessly into existing workflows
Partner closely with Insights, Reporting, Engineering, and Client Success teams to understand how they use analytics and translate those needs into scalable data solutions
Participate in design reviews and technical scoping for new analytics capabilities, providing recommendations on architecture, data models, and feasibility
Act as a go-to technical partner for Insights teams during experiments and pilots, helping them test new metrics, methodologies, and frameworks rapidly and safely
Deliver all projects within agreed timelines while maintaining high standards for code quality, testing, and documentation
Conduct regular QA on source data, transformations, and metrics to ensure accuracy, completeness, and consistency across systems
Proactively identify technical and process bottlenecks; propose and implement improvements that increase the speed, reliability, and scalability of analytics delivery
Communicate status, risks, and tradeoffs clearly to technical and non-technical stakeholders; flag issues early with proposed options
Requirements:
Has deep expertise in data scraping, ingestion, preprocessing, normalization, and data management best practices in a production environment
Demonstrates strong command of Python and SQL, with experience building and maintaining data pipelines and analytics services
Applies statistical and data science methods confidently (e.g., regression, classification, clustering, time-series analysis, sampling, hypothesis testing)
Has hands-on experience with ML and NLP in real-world settings (e.g., classification, sentiment analysis, topic modeling, entity extraction, summarization)
Is fluent with AI/LLM and agentic tools (e.g., using APIs, orchestration frameworks, or workflow engines) and is eager to experiment with new approaches
Is comfortable building Tableau-ready data models and collaborating with dashboard developers to ensure performance and usability
Is an innovative and creative thinker who enjoys connecting disparate data sources and systems into cohesive analytics solutions
Has a strong ability to explain technical concepts to a non-technical audience
Thrives in collaborative environments and partners effectively with insights professionals, analysts, and client-facing teams
Takes an ownership mindset, holds a high bar for quality, and is motivated by building systems that others can rely on
5-10 years of experience in data engineering, analytics engineering, or applied data science roles, ideally in analytics-heavy, product, or consulting environments
Advanced proficiency in Python (pandas, NumPy, SQLAlchemy or similar) and SQL (data modeling, performance optimization, complex queries)
Experience with modern data stack components (e.g., workflow/orchestration tools, cloud data warehouses, version control, CI/CD) in production settings
Practical experience with ML and statistics in Python (e.g., scikit-learn, statsmodels, NLP libraries) and deploying models into production workflows
Experience preparing data for Tableau or similar BI tools; understanding of best practices for semantic layers, extracts, and performance tuning
Strong understanding of data architecture fundamentals, including schema design, ETL/ELT patterns, and data quality frameworks
Excellent communication skills with the ability to explain complex technical concepts clearly to non-technical partners
Bachelor’s degree in Computer Science, Engineering, Statistics, Data Science, or a related field