Case Study | Google Cloud Storage

Company Helps Researchers Unravel DNA Mysteries Faster with Google Cloud Storage

At a Glance

What they wanted to do
• Build a secure mirrored version of the NCBI SRA repository
• Develop a web-based, user-friendly interface
• Allow for scalability and future data growth

What they did
• Used Google Cloud Storage to host 350 terabytes of DNA sequencing data
• Focused resources on developing the user interface, since Google handles security, scalability and other issues

What they accomplished
• Mirrored a comprehensive cloud-based DNA archive using minimal internal resources
• Created a platform that has received praise from researchers for its ease of use
• Helped expedite research by allowing scientists to sort through and pinpoint specific DNA sequencing data more easily

Organization

DNAnexus is a Mountain View, Calif.-based company focused on unlocking the potential of DNA-based medicine and biotechnology with a collaborative and scalable data technology platform built on the cloud. One aspect of their work is focused on making large genomic datasets broadly accessible to the research community and coupling them with sophisticated analysis tools. The company recently worked with Google Cloud Storage to develop a mirror of the National Center for Biotechnology Information’s Sequence Read Archive (SRA), a public repository of DNA sequencing data from some of the world’s leading research institutions. This project provides a complementary way for researchers to freely access these important data and exemplifies how cloud-based technologies are enabling completely new approaches to large-scale data access and analysis.

Challenge

“The SRA is an important resource for the research community and we felt that we could leverage our expertise in developing easy-to-use data analysis tools to help preserve and enhance its usability,” says Brigitte Ganter, Ph.D., Director of Product Marketing at DNAnexus. Along with replicating the SRA data and maintaining free access, DNAnexus was focused on improving the overall experience of using and accessing these data. To accomplish this, they needed a large-scale data storage solution capable of hosting more than 350 terabytes of data and robust enough to handle growth and large downloads. They also wanted to completely re-engineer the way that users interacted with the data, mined results and downloaded datasets of interest.

“Rather than building your own infrastructure and taking time and resources away from your company, you can use Google’s infrastructure and know that it’s scalable and secure.” —Brigitte Ganter, Director of Product Marketing, DNAnexus

Solution

DNAnexus began this project in June 2011. To host the massive SRA data set, the company looked to Google Cloud Storage, a service that lets companies store their data in Google’s cloud. DNAnexus knew the service provided the reliability, security and vast scalability needed to support the mirrored SRA site.

About Google Cloud Storage

Google Cloud Storage allows companies to store and access their data in Google’s highly scalable storage and networking infrastructure. Developers can store objects of any size and manage access to their data on an individual or group basis. For more information visit www.code.google.com/apis/storage

With Google’s help, the team downloaded approximately 100,000 files from the NCBI SRA website and converted them into the more popular FASTQ format, which would be easier for researchers to work with than the standard SRA format. They then uploaded both versions of the files to Google Cloud Storage. “Because Google Cloud Storage is highly scalable, we’re able to provide the data in both formats, which provides tremendous value to researchers who are more familiar with using FASTQ files,” Ganter says.

With Google hosting the data, the new SRA site powered by DNAnexus, which launched in October 2011, has required minimal maintenance on DNAnexus’ part, Ganter adds. This has freed up the company to focus on enhancing the user interface so researchers can more easily search the database, filter results and pinpoint the specific data they are interested in investigating further. DNAnexus provides an instant online genomics data and analysis center where researchers can upload SRA datasets and access a suite of tools for performing additional analyses.

Results

Google Cloud Storage provided DNAnexus with the robust data capacity it needed to build a mirrored version of the NCBI SRA repository and removed the administrative burden of managing these large datasets. “There’s so much we take for granted with Google Cloud Storage,” Ganter says. “We’re able to have our site up and running 24 hours a day, seven days a week. We don’t have to manage and maintain servers.” Ganter says that the DNAnexus team members spend practically no time managing the system, with the exception of overseeing incremental updates.

Researchers around the world are now able to search for and access SRA datasets through an intuitive, web-based, user-friendly interface. The mirrored website earned immediate praise from the research community for its ease of use; as word about the site spreads, Ganter is confident that Google Cloud Storage can accommodate any spikes in usage.
“If you have big data, you want to work with someone who truly understands the unique challenges of working on the terabyte scale,” she says. “Rather than building your own infrastructure and taking time and resources away from your company, you can use Google’s infrastructure and know that it’s scalable and secure.”

© 2012 Google Inc. All rights reserved. Google and the Google logo are trademarks of Google Inc. All other company and product names may be trademarks of the respective companies with which they are associated. SS1054-1202
