Why we’re getting our data teams to RAP

Reproducible analytical pipelines (RAP) help ensure all published statistics meet the highest standards of transparency and reproducibility. Sam Hollings and Alistair Bullward share their insights on adopting RAP and give advice to those starting out.

Reproducible analytical pipelines (RAP) are automated statistical and analytical processes for data analysis. They are a key part of national strategy and are widely used across the civil service.

The authors Alistair Bullward and Sam Hollings.

Over the past year, we’ve been going through a change programme and adopting RAP in our Data Services directorate. We’re still in the early stages of our journey, but already we’ve accomplished a lot and had some hard-learnt lessons.

This is about analytics and data, but knowledge of RAP isn’t just for those cutting code day-to-day. It’s crucial that senior colleagues understand the levels and benefits of RAP and get involved in promoting this new way of working and planning how we implement it.

What is RAP and why is it important?

In a nutshell, it is all about making our existing data processes more automated, reproducible, transparent, maintainable, efficient and robust. This improves the lives of our data analysts, as they ultimately have to do less manual work and can focus more on doing analysis. It also improves the quality of our work, as the processes become easier to quality assure.

At its most basic level, RAP involves: 


Our approach

We’re fortunate to have a vibrant analytical community of 200 analysts who produce over 130 publications.

In our organisation, we have been pursuing a 2-pronged approach to introducing RAP. Firstly, we’re upskilling our staff to produce RAP independently, supported by a small team of full-time experts from our Data Science Skilled Team.

At the same time, we’re building an end-to-end ‘Halo’ project for GP appointments, where we not only build a RAP pipeline but also automate our interfaces with the website and use our strategic systems as much as possible – more on this below.

As you would expect, our teams are sharing their work openly as they go, documenting not only the code but also the process and thinking behind it, to help others get started on adopting RAP in their own work.

RAP isn’t all or nothing – it can be done in stages. We’ve defined the different levels of RAP in our maturity matrix. This makes adopting RAP more achievable and helps analysts to prioritise which improvements to their code are most important. It’s also worth remembering that there is a lot of benefit in just reaching ‘baseline’ RAP.


Sharing our work

A key focus of ours is to build a community of professionals who share and learn from each other.

We've launched our RAP Community of Practice, with over 30 pages of guidance, and published it externally, to maximise the number of people that can get help on RAP. It includes guidance on coding best practice, functionalising code using Python, following a repository layout, using DevOps principles and more.

Starting with just a few colleagues over the last year, we now have a large group of RAP enthusiasts within our organisation, covering nearly all of our statistical publications. Importantly, we now have 6 publications with their code in the open (please check them out – all feedback is welcome): 

In addition, we have around 16 more that are almost ready to publish. 

We’ve made huge strides. However, we know that change and learning new ways of working aren’t easy, so we have 'RAP Champions' across the organisation to help guide people down the path. We’re aiming to raise their profile, so our colleagues know help is nearby.

We’re also in contact with the larger community of RAP Champions across the public sector, who hold meetups, share guidance and best practice and want to work with us to improve the quality and reproducibility of our work.

End-to-end with ‘Halo’

It’s important to see the adoption of RAP not only as a change programme, but also as something that sits within a wider technical landscape. RAP can only be created if your data is in a suitable place (such as an environment where you can run Python and have access to version control such as Git). However, ‘downstream’ needs must also be considered.

In our case, outputs are often consumed by other systems, like content management systems for websites and dashboarding platforms. Consequently, our team established project Halo, which focused not only on building RAP on our current strategic data platform, but also on uplifting the content management system so that data can easily populate publications on the website.

The purpose of this was to show a new way of working end-to-end. It has driven out lots of challenges and systems issues, as well as enabling the team to upskill. Valuable lessons are being learnt and we hope to go live with RAP with an automated interface into our website this year.


Sharing our lessons

We are still relatively early on in our RAP journey, but we already have some hard-learnt lessons.

Firstly, RAP is not one-size-fits-all. Anything you can do to move your work forward, even a couple of steps, down the path of RAP is great. Success isn’t measured by what level of RAP you reach, but instead how much benefit you derive from it.

For smaller projects, simply making sure the code is half-decent and published might be enough. For other work, such as a family of processes which are run weekly, there might be benefit in going into more detail with RAP, making reusable functions and thoroughly testing those. It’s ok to start simple, and then work your way up the levels of RAP piece by piece.
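
To give a flavour of what ‘reusable functions’ with tests can look like, here is a minimal Python sketch. The function names and figures are hypothetical and purely illustrative; they are not taken from any of our pipelines.

```python
def percentage_change(previous: float, current: float) -> float:
    """Return the percentage change from `previous` to `current`."""
    if previous == 0:
        # A change from zero has no defined percentage, so fail loudly
        # rather than letting a division error surface later in the run.
        raise ValueError("previous must be non-zero")
    return (current - previous) / previous * 100


def weekly_changes(counts: list[float]) -> list[float]:
    """Apply percentage_change to each consecutive pair of weekly counts."""
    return [percentage_change(a, b) for a, b in zip(counts, counts[1:])]


def test_percentage_change():
    # Hypothetical figures chosen so the expected values are exact.
    assert percentage_change(200, 250) == 25.0
    assert percentage_change(100, 50) == -50.0


def test_weekly_changes():
    assert weekly_changes([100, 150, 75]) == [50.0, -50.0]
```

Because the calculation lives in one small function, every weekly process in the family can import it rather than copy it, and a test runner such as pytest can check it each time the code changes.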

Secondly, be pragmatic when it comes to tech. Even in our current setup, it’s not always possible to do every bit of RAP as it ideally should be done – so just do what you can. Even better, also consider contributing to the ongoing process of improving our systems: we need a strong voice for analysts when it comes to developing data platforms.

Finally, if in doubt, prioritise burden reduction. In the short term, a key goal is to free up time for our analysts – so use the tools of RAP to hit the most manual or time-consuming parts of your process first. 

As we make progress, we are looking at refining the content of our RAP Community of Practice GitHub so that it’s easier to use. We’re adding more guidance, particularly aimed at managers, to help them decide which level of RAP is right for their work and how they can determine if something is getting over-engineered. We are also using these experiences to inform the development of the next iteration of our strategic platform.

If you would like to get involved or find out more, check out the Government Analysis Function RAP Strategy and our RAP Community of Practice. You’re also welcome to join one of our virtual drop-in sessions on Teams, held every Thursday from 10 to 11am, to ask questions or see what people are talking about in the world of RAP at NHS Digital.



Last edited: 5 January 2023 8:58 am