The Data, Automation, and Predictive Sciences (DAPS) organization within Research and Development Technologies at GSK works to harness the power of GSK’s data-as-an-asset to drive research productivity. The Cheminformatics (CIX) group is focused on capitalizing on our chemistry and biology data resources to develop, integrate, and embed advanced computational methods and predictive in silico models that accelerate the discovery of medicines. As a member of the CIX team, you will be instrumental in marshaling internal datasets and, alongside CIX colleagues and business partners, delivering machine learning models built on those datasets for deployment within the framework of GSK’s small molecule discovery platform.
Job Responsibilities:
Collaborate with our business partners to engineer pipelines that process elements of GSK’s large proprietary datasets and land model-ready data in the hands of the CIX team
Adapt and apply machine learning, active learning, and advanced cheminformatics tools at scale to build robust predictive models for use on drug-discovery programs
Contribute to and validate code implementing state-of-the-art, production-quality methods that accelerate, automate, and improve decision-making on drug discovery programs by integrating with agentic workflows
Prepare and present results of key validation experiments, details of capability builds, and developments on active drug discovery projects to internal and external groups in a way that is both informative and accessible to the non-subject matter expert
Work with others within a multidisciplinary matrix team that spans different organizations and geographies to execute on joint objectives
Requirements:
PhD or MSc in Cheminformatics, Computational Chemistry, Informatics, Life Sciences, Mathematics or equivalent
Experience in computational sciences including knowledge of machine learning, virtual screening, and cheminformatics methods applied to drug design across different modalities
Experience programmatically collecting, combining, mining and analyzing complex biological and chemical data to build predictive models, and deploying these methods as pipelines
Experience utilizing computer programming and scripting languages such as Python, Java, C/C++, or R with knowledge of basic software development practices
Experience with chemical toolkits such as ChemAxon or RDKit and scientific pipelining tools such as Pipeline Pilot or KNIME
Nice to have:
Experience applying deep neural networks (DNNs) to drug discovery-related tasks, such as de novo molecular generation, reaction and retrosynthetic prediction, or property prediction
Experience applying modern experimental design and acquisition strategies, including methods such as Bayesian optimization, to library design and high-throughput chemistry
Experience utilizing software development tooling such as GitHub, Azure DevOps, automation and containerization
Experience applying cheminformatics and predictive modelling methods in project support scenarios
Experience working alongside or within Cloud engineering teams, and deploying agents and LLMs at scale
Evidence of strong critical thinking and problem-solving skills, and high learning agility