Early Career Health Data Scientist - BHF DSC
26 March 2026
Purpose of the post
The Early Career Health Data Scientist will join our Health Data Science team and will contribute to the development of scalable, reusable resources that support researchers with the data curation phase of their research projects to produce high-quality, analysis-ready data. These resources may include:
- Data dictionaries, dataset summaries and shared exploratory analyses and insights that inform researchers about datasets and how they can be used on research projects.
- Coding tutorials, guidance notes, and worked examples to help researchers develop the technical skills needed to curate data within Secure Data Environments.
- Reusable code, functions, and data curation pipelines that researchers can adapt for their own projects, reducing duplication and accelerating the data curation phase of their projects. An example data curation pipeline for research projects undertaken within the NHS England SDE can be found on the Centre’s GitHub.
- Curated data methods. These are methods to produce cleaned and enhanced views of datasets, designed to integrate with our data curation pipelines to prevent repeated reimplementation of equivalent logic.
The post-holder will also provide direct, hands-on support to researchers, either by providing guidance and signposting to existing data curation resources relevant to their project, or by providing targeted, bespoke development of data curation pipelines to generate analysis-ready data. The post-holder will also be required to analyse data for quality control purposes and to help better understand the utility of the data and how it can be appropriately used for research.
This post is an attractive career development opportunity. It would suit a health data scientist, data analyst, or data engineer with previous experience of wrangling and curating health data for research projects who wishes to expand and deepen their expertise in large-scale linked health data and collaborative research environments.
Main responsibilities
- Providing data engineering and data curation support in secure data environments (SDEs) and trusted research environments (TREs) to produce robust, analysis-ready datasets.
- Contributing to the development, testing, and maintenance of data curation pipelines and shared resources under the supervision of senior colleagues.
- Developing and applying expertise in the assessment of the quality, completeness, and utility of the various routinely collected health datasets across the four UK nations, including contributing to early feasibility and exploratory assessments to inform study design.
- Summarising and disseminating findings and lessons from data quality and data utility assessments to inform research design and appropriate use of routinely collected data.
- Under the supervision of senior colleagues, writing, organising, and maintaining support documentation for linked data resources (e.g. data dictionaries, variable mapping tables, data access process documentation, and Git repositories).
- Carrying out technical validation checks on linked data sources (e.g. duplicates, linkage errors, temporal inconsistencies) and developing reusable functions to check these data rigorously for errors and inconsistencies.
- Working with relevant researchers to identify and apply appropriate existing and novel phenotype definitions and algorithms from linked national health data.
- Preparing clear numerical summaries and visualisations to communicate findings (e.g. data characteristics, quality, and decision making) to researchers when curating data.
- Preparing and presenting results in oral and written reports, technical notes, and academic publications.
- Attending and actively participating in regular Centre and project meetings, reporting on progress and presenting analytical results.
- Demonstrating a strong commitment to open-source, transparent, and reproducible research, as the post will involve releasing tools, code, and documentation under an open-source licence.
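As an illustration of the technical validation checks mentioned in the responsibilities above (duplicates and temporal inconsistencies), a reusable check function might look something like the minimal Python sketch below. It is not taken from the Centre's pipelines: the function name `validate_records`, the record fields (`person_id`, `event_date`, `code`, `date_of_birth`, `date_of_death`), and the example data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import date


def validate_records(records, today):
    """Flag duplicate and temporally inconsistent records.

    `records` is a list of dicts with hypothetical keys: person_id,
    event_date, code, date_of_birth and date_of_death (None if alive).
    Returns a dict mapping each check name to the list of failing indices.
    """
    issues = {
        "duplicate": [],
        "event_before_birth": [],
        "event_after_death": [],
        "event_in_future": [],
    }
    # Count how often each (person, date, code) combination occurs.
    key_counts = Counter(
        (r["person_id"], r["event_date"], r["code"]) for r in records
    )
    for i, r in enumerate(records):
        if key_counts[(r["person_id"], r["event_date"], r["code"])] > 1:
            issues["duplicate"].append(i)
        if r["event_date"] < r["date_of_birth"]:
            issues["event_before_birth"].append(i)
        dod = r.get("date_of_death")
        if dod is not None and r["event_date"] > dod:
            issues["event_after_death"].append(i)
        if r["event_date"] > today:
            issues["event_in_future"].append(i)
    return issues


# Made-up example records (not real data).
records = [
    {"person_id": 1, "event_date": date(2020, 1, 5), "code": "I21",
     "date_of_birth": date(1950, 2, 1), "date_of_death": None},
    {"person_id": 1, "event_date": date(2020, 1, 5), "code": "I21",
     "date_of_birth": date(1950, 2, 1), "date_of_death": None},
    {"person_id": 2, "event_date": date(1949, 12, 31), "code": "I50",
     "date_of_birth": date(1960, 6, 15), "date_of_death": None},
    {"person_id": 3, "event_date": date(2022, 3, 1), "code": "I63",
     "date_of_birth": date(1940, 1, 1), "date_of_death": date(2021, 11, 30)},
    {"person_id": 4, "event_date": date(2019, 7, 20), "code": "I21",
     "date_of_birth": date(1975, 9, 9), "date_of_death": None},
]

issues = validate_records(records, today=date(2025, 1, 1))
for check, failing in issues.items():
    print(f"{check}: {failing}")
```

Returning record indices rather than dropping rows keeps the check non-destructive, so a researcher can review flagged records before deciding how to handle them. Linkage-error checks would depend on the linkage method used and are not covered by this sketch.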
To view the full job description, please download the attachment.
Please note, as we are a UK-based organisation, applicants must be living in, and eligible to work in, the UK. We are unable to sponsor or take over sponsorship of an employment visa at this time.
We reserve the right to close this vacancy early if we receive sufficient applications for the role. Therefore, if you are interested, please submit your application as early as possible.
We politely request no contact from recruitment agencies or media sales. We do not accept speculative CVs from recruitment agencies, or the fees associated with them.