World’s biggest DNA project is aided by the cloud

16 December 2016

Run by Genomics England, the 100,000 Genomes Project is the largest national DNA sequencing project of its kind in the world. Its aims are to diagnose rare diseases and cancers earlier, to identify those who may be susceptible, and to aid the development of treatments.

Based in Clerkenwell, London, Genomics England is a £300m project owned by the Department of Health. It says it has already delivered diagnoses to some families, transforming the way NHS patients are cared for.

Patients in NHS hospitals are invited, along with their families, to take part anonymously. They attend one of 13 centres to submit five millilitres of blood (or sometimes a tumour sample) from which DNA is extracted for sequencing. 

Dave Brown, head of infrastructure at Genomics England, had three months from the date of his appointment to prepare for the arrival of the first data. There was no IT infrastructure in place and installing equipment would take too long. “With such a short time to deployment, we needed infrastructure that was quick to buy and set up,” says Brown.

After researching cloud providers on the government’s Digital Marketplace, he selected UKCloud.

“The UKCloud platform is based in England, which does away with data residency and protection issues that are often associated with clouds delivered from outside the UK,” says Brown. “It has all the required industry accreditations and certifications, so you know it’s secure. On top of that, it was Pan Government Accredited, which adds yet another layer of assurance.”

Once samples have been analysed at a sequencing centre, the resulting DNA data is sent to the Genomics England platform at UKCloud over secure, dedicated 10Gbps connections using UKCloud’s HybridConnect service. 

In addition, UKCloud hosts and manages specialised storage equipment that is owned by Genomics England and integrated into the cloud environment.

Brown says: “We provided our own storage because of the sheer volume of data that will be generated by the project. The data from a human genome needs around 240GB of storage; 24PB will ultimately be needed to store 100,000 genomes.”
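
Brown's figures can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from Genomics England or UKCloud. It assumes the sizes are in bytes with decimal prefixes (1GB = 10^9 bytes, 1PB = 10^15 bytes) and, for the transfer estimate, that a single 10Gbps link runs at its full rate with no protocol overhead, so the real time would be longer. The function names are hypothetical.

```python
# Back-of-envelope check of the storage figures quoted in the article.
# Assumptions (not from the source): decimal units, sizes in bytes,
# and one 10Gbps link running at full line rate with no overhead.

GB = 10**9    # bytes in a gigabyte (decimal)
PB = 10**15   # bytes in a petabyte (decimal)

GENOME_BYTES = 240 * GB            # "around 240GB" per genome
GENOME_COUNT = 100_000             # target number of genomes
LINK_BITS_PER_SECOND = 10 * 10**9  # one 10Gbps connection


def total_storage_pb(genome_count, bytes_per_genome):
    """Total storage needed, in petabytes."""
    return genome_count * bytes_per_genome / PB


def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to send size_bytes over a link of the given speed."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    print(total_storage_pb(GENOME_COUNT, GENOME_BYTES))          # 24.0
    print(transfer_seconds(GENOME_BYTES, LINK_BITS_PER_SECOND))  # 192.0
```

On these assumptions, 100,000 genomes at 240GB each come to 24PB, matching Brown's total, and one genome's data would take a little over three minutes to cross a single 10Gbps connection at best.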