We are seeking a Staff Data Scientist to lead advanced data science and R&D efforts for the ID Graph, Socure’s foundational platform powering identity intelligence across our product ecosystem. This Staff-level role operates at platform scale, with responsibility extending beyond a single model or pipeline. You will work at the intersection of graph modeling, machine learning, and product innovation, collaborating closely with Engineering, Product Management, and multiple product teams. The ID Graph is the core intelligence backbone for many downstream products, and your work will directly impact Socure’s ability to deliver trusted, scalable, and explainable identity solutions.
Job Responsibilities:
Lead the evaluation and continuous improvement of entity resolution and entity linking pipelines
Debug new builds, identify anomalies, and recommend modeling or system-level improvements
Define, implement, and maintain scalable performance and quality metrics, leveraging automation and LLM-based approaches where appropriate
Partner with Engineering to optimize entity linking and ranking systems using Learning-to-Rank and related techniques
Design methods to assess and classify entity confidence and quality across the graph
Design and implement a comprehensive data quality framework for graph-based identity data
Use data quality insights to guide modeling decisions, experimentation strategy, and product prioritization
Identify and operationalize generalized, high-impact predictive signals derived from graph structure, temporal dynamics, and relational patterns
Develop scalable approaches to link prediction, label propagation, and semi-supervised learning within the ID Graph
Explore and evaluate advanced graph modeling techniques, including graph-based ML, knowledge graph methods, and Graph Neural Networks (GNNs), when appropriate
Focus on durable abstractions rather than one-off features, ensuring solutions are explainable, compliant, and reusable across multiple products
Collaborate closely with Engineering, Product Management, Compliance, and downstream product teams
Act as a technical leader within the Identity organization, influencing modeling standards, experimentation rigor, and best practices
Translate complex technical findings into clear insights and recommendations for both technical and non-technical stakeholders
Support the launch of new product capabilities built on top of the ID Graph
Requirements:
Strong proficiency in Python and PySpark
Deep experience with classification models, Learning-to-Rank, anomaly detection, and statistical modeling
Experience building and maintaining production-grade ML systems at scale
Hands-on experience with Databricks
Familiarity with graph databases and query languages such as Amazon Neptune and openCypher
Experience with graph processing frameworks (e.g., GraphFrames)
Master’s or PhD in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field
5+ years of experience in applied data science, machine learning, or artificial intelligence, with a focus on graph-based modeling and large-scale data systems
Nice to have:
Experience applying LLMs for evaluation, automation, or signal discovery
Familiarity with Knowledge Graphs and Graph Neural Networks (GNNs)