Storage Architecture and Challenges

Faculty Summit, July 29, 2010 Andrew Fikes, Principal Engineer [email protected]

Introductory Thoughts
Google operates planet-scale storage systems
What keeps us programming:
  - Enabling application developers
  - Improving data locality and availability
  - Improving performance of shared storage
A note from the trenches:
  "You know you have a large storage system when you get paged at 1 AM because you only have a few petabytes of storage left."

The Plan for Today
  - Storage Landscape
  - Storage Software and Challenges
  - Questions (15 minutes)

Storage Landscape: Hardware
A typical warehouse-scale computer:
  - 10,000+ machines, 1GB/s networking
  - 6 x 1TB disk drives per machine
What has changed:
  - Cost of GB of storage is lower
  - Impact of machine failures is higher
  - Machine throughput is higher
What has not changed:
  - Latency of an RPC
  - Disk drive throughput and seek latency

Storage Landscape: Development
Product success depends on:
  - Development speed
  - End-user latency
Application programmers:
  - Never ask simple questions of the data
  - Change their data access patterns frequently
  - Build and use APIs that hide storage requests
  - Expect uniformity of performance
  - Need strong availability and consistent operations
  - Need visibility into distributed storage requests

Storage Landscape: Applications
Early Google:
  - US-centric traffic
  - Batch, latency-insensitive indexing processes
  - Document "snippets" serving (single seek)
Current day:
  - World-wide traffic
  - Continuous crawl and indexing processes (Caffeine)
  - Seek-heavy, latency-sensitive apps (Gmail)
  - Person-to-person, person-to-group sharing (Docs)

Storage Landscape: Flash (SSDs)
Important future direction:
  - Our workloads are increasingly seek heavy
  - 50-150x less expensive than disk per random read
  - Best usages are still being explored
Concerns:
  - Availability of devices
  - 17-32x more expensive per GB than disk
  - Endurance not yet proven in the field

Storage Landscape: Shared Data
Scenario:
  - Roger shares a blog with his 100,000 followers
  - Rafa follows Roger and all other ATP players
  - Rafa searches all the blogs he can read
To make search fast, do we copy data to each user?
  - YES: Huge fan-out on update of a document
  - NO: Huge fan-in when searching documents
To make things more complicated:
  - Freshness requirements
  - Heavily-versioned documents (e.g. Google Wave)
  - Privacy restrictions on data placement
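The YES/NO trade-off above can be made concrete with a toy model. The sketch below is only an illustration, not a description of any Google system: the class names `PushIndex` and `PullIndex`, the in-memory dictionaries, and the substring "search" are all assumptions made for the example. It counts the writes caused by copying a post to every follower and the lookups caused by searching every followed author.

```python
from collections import defaultdict


class PushIndex:
    """Hypothetical: copy each post into every follower's index (fan-out on write)."""

    def __init__(self, followers):
        self.followers = followers      # author -> set of readers
        self.inbox = defaultdict(list)  # reader -> list of (author, text)
        self.writes = 0                 # copies written so far

    def publish(self, author, text):
        for reader in self.followers.get(author, ()):
            self.inbox[reader].append((author, text))
            self.writes += 1

    def search(self, reader, term):
        # One local scan: the reader's index already holds everything readable.
        return [(a, t) for a, t in self.inbox[reader] if term in t]


class PullIndex:
    """Hypothetical: store each post once, visit every followed author on search (fan-in on read)."""

    def __init__(self, following):
        self.following = following      # reader -> set of authors
        self.posts = defaultdict(list)  # author -> list of texts
        self.lookups = 0                # per-author lookups so far

    def publish(self, author, text):
        self.posts[author].append(text)

    def search(self, reader, term):
        hits = []
        for author in sorted(self.following.get(reader, ())):
            self.lookups += 1
            hits.extend((author, t) for t in self.posts[author] if term in t)
        return hits
```

With 100,000 followers, one `PushIndex.publish` call performs 100,000 writes; with `PullIndex`, the cost moves to `search`, which grows with the number of authors the reader follows.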

Storage Landscape: Legal
Laws and interpretations are constantly changing:
  - Governments have data privacy requirements
  - Companies have email and document retention policies
  - Sarbanes-Oxley (SOX) adds audit requirements
Things to think about:
  - Major impact on storage design and performance
  - Are these storage- or application-level features?
  - Versioning of collaborative documents

Storage Software: Google's Stack
Tiered software stack
Node:
  - Exports and verifies disks
Cluster:
  - Ensures availability within a cluster
  - File system (GFS/Colossus), structured storage (Bigtable)
  - 2-10%: disk drive annualized failure rate
Planet:
  - Ensures availability across clusters
  - Blob storage, structured storage (Spanner)
  - ~1 cluster event / quarter (planned/unplanned)
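To get a feel for what a 2-10% annualized failure rate means at the cluster tier, the following back-of-the-envelope sketch multiplies it out. It assumes a cluster shaped like the "typical warehouse-scale computer" from the hardware slide (10,000 machines with 6 disks each) and independent failures; both are assumptions for illustration, and the function name is hypothetical.

```python
def expected_disk_failures_per_year(machines, disks_per_machine, afr):
    """Expected number of failed disks per year for a given annualized failure rate."""
    return machines * disks_per_machine * afr


# Assumed cluster shape: 10,000 machines x 6 disks = 60,000 disks.
low = expected_disk_failures_per_year(10_000, 6, 0.02)
high = expected_disk_failures_per_year(10_000, 6, 0.10)

print(f"per year: {low:.0f} to {high:.0f} disks")
print(f"per day:  {low / 365:.1f} to {high / 365:.1f} disks")
```

Under these assumptions a single cluster replaces somewhere between roughly 3 and 16 disks every day, which is why disk failure is handled by software at the cluster tier rather than treated as an exceptional event.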

Storage Software: Node Storage
Purpose: Export disks on the network
  - Building-block for higher-level storage
  - Single spot for tuning disk access performance
  - Management of node addition, repair and removal
  - Provides user resource accounting (e.g. I/O ops)
  - Enforces resource sharing across users
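One common way to combine per-user accounting with enforcement is a token bucket per user. The slide does not say which mechanism the node storage layer uses, so the sketch below is an assumption: `IoQuota`, its parameters, and the explicit `now` timestamps (used instead of a real clock to keep the example deterministic) are all hypothetical.

```python
class IoQuota:
    """Hypothetical per-user token bucket that admits and counts I/O operations."""

    def __init__(self, ops_per_second, burst):
        self.rate = ops_per_second  # tokens refilled per second
        self.burst = burst          # maximum tokens a user can accumulate
        self.tokens = {}            # user -> (tokens, time of last update)
        self.used = {}              # user -> number of admitted ops (accounting)

    def admit(self, user, now):
        """Return True and charge one token if the user may issue an op at time `now`."""
        tokens, last = self.tokens.get(user, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.tokens[user] = (tokens - 1, now)
            self.used[user] = self.used.get(user, 0) + 1
            return True
        self.tokens[user] = (tokens, now)
        return False
```

Each user draws from a separate bucket, so one user exhausting a quota does not block another, and the `used` counters double as the accounting record.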

Storage Software: GFS
The basics:
  - Our first cluster-level file system (2001)
  - Designed for batch applications with large files
  - Single master for metadata and chunk management
  - Chunks are typically replicated 3x for reliability
GFS lessons:
  - Scaled to approximately 50M files, 10 PB
  - Large files increased upstream application complexity
  - Not appropriate for latency-sensitive applications
  - Scaling limits added management overhead

Storage Software: Colossus
Next-generation cluster-level file system:
  - Automatically sharded metadata layer
  - Data typically written using Reed-Solomon (1.5x)
  - Client-driven encoding and replication
  - Metadata space has enabled availability analyses
Why Reed-Solomon?
  - Cost. Especially w/ cross cluster replication.
  - Field data and simulations show improved MTTF
  - More flexible cost vs. availability choices
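The cost argument is simple arithmetic on storage overhead. The sketch below compares 3x replication with a Reed-Solomon layout that yields the 1.5x figure on the slide. The slide does not state the stripe parameters, so the 6 data + 3 code block layout used here is an assumption: it is one layout that gives 1.5x, not necessarily the one Colossus uses.

```python
def storage_overhead(data_blocks, redundant_blocks):
    """Raw bytes stored per byte of user data."""
    return (data_blocks + redundant_blocks) / data_blocks


# 3x replication: 1 data block plus 2 extra copies; survives the loss of any 2 blocks.
replication = storage_overhead(1, 2)

# Assumed Reed-Solomon layout: 6 data blocks plus 3 code blocks; survives any 3 lost blocks.
reed_solomon = storage_overhead(6, 3)

# Keeping a full copy in each of 2 clusters doubles both figures.
clusters = 2
print(f"replication:  {replication:.1f}x per cluster, {clusters * replication:.1f}x total")
print(f"reed-solomon: {reed_solomon:.1f}x per cluster, {clusters * reed_solomon:.1f}x total")
```

With two clusters the totals are 6.0x versus 3.0x, which is the sense in which the saving matters "especially" with cross-cluster replication.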

Storage Software: Availability
Tidbits from our Storage Analytics team:
  - Most events are transient and short (90% < 10min)
  - Pays to wait before initiating recovery operations
Fault bursts are important:
  - 10% of faults are part of a correlated burst
  - Most small bursts have no rack correlation
  - Most large bursts are highly rack-correlated
Correlated failures impact benefit of replication:
  - Uncorrelated R=2 to R=3 => MTTF grows by 3500x
  - Correlated R=2 to R=3 => MTTF grows by 11x
Source: Google Storage Analytics team (D. Ford, F. Popovici, M. Stokely, V-A. Truong, F. Labelle, L. Barroso, S. Quinlan, C. Grimes)
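The "pays to wait" point follows directly from the duration distribution: if most unavailability events end within minutes, a short grace period avoids most recovery work. The sketch below illustrates this with made-up outage durations chosen so that 9 of 10 are under 10 minutes, matching the 90% figure on the slide. The data and the function name are hypothetical.

```python
def recoveries_triggered(outage_minutes, grace_minutes):
    """Count outages that outlast the grace period and therefore trigger recovery."""
    return sum(1 for duration in outage_minutes if duration > grace_minutes)


# Hypothetical sample: 9 of 10 events last under 10 minutes.
outages = [1, 2, 2, 3, 4, 5, 6, 8, 9, 240]

eager = recoveries_triggered(outages, grace_minutes=0)     # recover immediately
patient = recoveries_triggered(outages, grace_minutes=10)  # wait 10 minutes first

print(f"immediate recovery: {eager} re-replications")
print(f"10-minute grace:    {patient} re-replication(s)")
```

The trade-off is that data stays under-replicated during the grace period, so the wait has to be weighed against the risk of a second failure in that window.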

Storage Software: Bigtable
The basics:
  - Cluster-level structured storage (2003)
  - Exports a distributed, sparse, sorted-map
  - Splits and rebalances data based on size and load
  - Asynchronous, eventually-consistent replication
  - Uses GFS or Colossus for file storage
The lessons:
  - Hard to share distributed storage resources
  - Distributed transactions are badly needed
  - Application programmers want synchronous replication
  - Users want structured query language (e.g. SQL)
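A single-process toy version of a "sparse, sorted map" helps show why the sorted order matters: it makes row-range scans cheap. The sketch below is not the Bigtable API; `SortedSparseMap` and its methods are hypothetical, and it omits timestamps, distribution, and persistence entirely.

```python
import bisect


class SortedSparseMap:
    """Hypothetical in-memory map from (row, column) to value, kept in sorted key order."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column) tuples
        self._values = {}  # (row, column) -> value; absent cells cost nothing (sparse)

    def put(self, row, column, value):
        key = (row, column)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        return self._values.get((row, column))

    def scan(self, start_row, end_row):
        """Yield (row, column, value) for start_row <= row < end_row, in key order."""
        # (start_row,) sorts before every (start_row, column) key.
        start = bisect.bisect_left(self._keys, (start_row,))
        for row, column in self._keys[start:]:
            if row >= end_row:
                break
            yield row, column, self._values[(row, column)]
```

Because keys are sorted, a contiguous row range is also a natural unit for splitting data by size or load: any row key can serve as a split point.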

Storage Challenge: Sharing
Simple Goal: Share storage to reduce costs
Typical scenario:
  - Pete runs video encoding using CPU & local disk
  - Roger runs a MapReduce that does heavy GFS reads
  - Rafa runs seek-heavy Gmail on Bigtable w/ GFS
  - Andre runs seek-heavy Docs on Bigtable w/ GFS
Things that go wrong:
  - Distribution of disks being accessed is not uniform
  - Non-storage system usage impacts CPU and disk
  - MapReduce impacts disks and buffer cache
  - Gmail and Buzz both need hundreds of seeks NOW

Storage Challenge: Sharing (cont.)
How do we:
  - Measure and enforce usage? Locally or globally?
  - Reconcile isolation needs across users and systems?
  - Define, implement and measure SLAs?
  - Tune workload-dependent parameters (e.g. initial chunk creation)?

Storage Software: BlobStore
The basics:
  - Planet-scale large, immutable blob storage
  - Examples: Photos, videos, and email attachments
  - Built on top of Bigtable storage system
  - Manual, access- and auction-based data placement
Reduces costs by:
  - De-duplicating data chunks
  - Adjusting replication for cold data
  - Migrating data to cheaper storage
Fun statistics:
  - Duplication percentages: 55% - Gmail, 2% - Video
  - 90% of Gmail attachment reads hit data < 21 days old
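Chunk de-duplication is usually done by addressing chunks by a hash of their content, so identical chunks are stored once no matter how many blobs contain them. The slide does not describe how BlobStore chunks or hashes data, so the fixed-size chunking, SHA-256 digests, and the `DedupStore` class below are assumptions for illustration only.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # tiny on purpose, to keep the example readable
        self.chunks = {}              # SHA-256 hex digest -> chunk bytes
        self.blobs = {}               # blob name -> ordered list of chunk digests

    def put(self, name, data):
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if unseen
            digests.append(digest)
        self.blobs[name] = digests

    def get(self, name):
        return b"".join(self.chunks[d] for d in self.blobs[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())
```

An email attachment forwarded to many recipients is the kind of case where this pays off: every copy after the first adds only a list of digests, which is consistent with the high duplication percentage reported for Gmail.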

Storage Software: Spanner
The basics:
  - Planet-scale structured storage
  - Next generation of Bigtable stack
  - Provides a single, location-agnostic namespace
  - Manual and access-based data placement
Improved primitives:
  - Distributed cross-group transactions
  - Synchronous replication groups (Paxos)
  - Automatic failover of client requests

Storage Software: Data Placement
  - End-user latency really matters
  - Application complexity is less if close to its data
  - Countries have legal restrictions on locating data
Things to think about:
  - How do we migrate code with data?
  - How do we forecast, plan and optimize data moves?
  - Your computer is always closer than the cloud.

Storage Software: Offline Access
People want offline copies of their data:
  - Improves speed, availability and redundancy
Scenario:
  - Roger is keeping a spreadsheet with Rafa
  - Roger syncs a copy to his laptop and edits it
  - Roger wants to see data on laptop from phone
Things to think about:
  - Conflict resolution increases application complexity
  - Offline code is often very application specific
  - Do users really need peer-to-peer synchronization?

Questions
Round tables at 4 PM:
  - Using Google's Computational Infrastructure: Brian Bershad & David Konerding
  - Planet-Scale Storage: Andrew Fikes & Yonatan Zunger
  - Storage, Large-Scale Data Processing, Systems: Jeff Dean

Additional Slides

Storage Challenge: Complexity
Scenario: Read 10k from Spanner
  1. Lookup names of 3 replicas
  2. Lookup location of 1 replica
  3. Read data from replicas
     1. Lookup data locations from GFS
     2. Read data from storage node
        1. Read from Linux file system
Layers:
  - Generate API impedance mismatches
  - Have numerous failure and queuing points
  - Make capacity and performance prediction super-hard
  - Make optimization and tuning very difficult

Storage Software: File Transfer
Common instigators of data transfer:
  - Publishing production data (e.g. base index)
  - Insufficient cluster capacity (disk or CPU)
  - System and software upgrades
Moving data is:
  - Hard: Many moving parts, and different priorities
  - Expensive & time-consuming: Networks involved
Our system:
  - Optimized for large, latency-insensitive networks
  - Uses large windows and constant-bit rate UDP
  - Produces smoother flow than TCP
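The "smoother flow" claim comes from pacing: a constant-bit-rate sender spaces packets evenly instead of sending window-sized bursts. The sketch below only computes the send times such a sender would use. It is not the protocol described on the slide; the function name and parameters are hypothetical, and it does no actual network I/O.

```python
def pacing_schedule(total_bytes, packet_bytes, bits_per_second):
    """Return the send time in seconds of each packet for a constant-bit-rate transfer."""
    interval = packet_bytes * 8 / bits_per_second  # seconds between packets
    packets = -(-total_bytes // packet_bytes)      # ceiling division
    return [i * interval for i in range(packets)]


# Example: 3,000 bytes in 1,000-byte packets at 8,000 bits per second.
schedule = pacing_schedule(3000, 1000, 8000)
print(schedule)
```

A real sender would sleep until each scheduled time before calling `sendto` on a UDP socket, and would need its own loss detection and retransmission, since UDP provides neither.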
