Storage on AWS

© 2015 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.

Agenda
• Storage Primer
• Block Storage
• Shared File Systems
• Object Store
• On-Premises Storage Integration
• Structured Data Store

0 Storage Primer

Block vs File vs Object

Block Storage
• Raw storage
• Data organized as an array of unrelated blocks
• Host file system places data on disk
• e.g.: Microsoft NTFS, Unix ZFS

File Storage
• Unrelated data blocks managed by a file (serving) system
• Native file system places data on disk

Object Storage
• Stores virtual containers that encapsulate the data, data attributes, metadata and object IDs
• API access to data
• Metadata driven, policy-based, etc.

Structured storage - Databases

Relational Databases
• Static schema
• Highly structured table organization
• Rigid data format

Document Store
• Dynamic schema
• Key/value database
• Collection of complex documents
• Arbitrary, nested data format

Storage - Characteristics
Some of the ways we look at storage:
• Durability: measure of expected data loss
• Availability: measure of expected downtime
• Security: security measures in place
• Cost: amount per storage unit, e.g. $ / GB
• Scalability: upward flexibility
• Performance: performance metrics
• Integration: ability to interact with
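As a rough illustration of availability as a "measure of expected downtime", the sketch below converts an availability percentage into hours of downtime per year. The function name and the sample percentages are illustrative assumptions, not AWS figures.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year for simplicity


def expected_downtime_hours(availability_percent: float) -> float:
    """Return the expected downtime per year for an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Sample availability levels chosen only for illustration.
    for pct in (99.0, 99.9, 99.99):
        hours = expected_downtime_hours(pct)
        print(f"{pct}% available -> about {hours:.2f} hours down per year")
```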

AWS has a variety of storage options
• Amazon EBS (Elastic Block Store)
• Amazon Elastic File System (EFS)
• Amazon EC2 Instance Store (Ephemeral Volumes)
• Amazon S3 (Simple Storage Service)
• Amazon Glacier
• AWS Storage Gateway
• AWS Import/Export Snowball

AWS also has a variety of database options
• Amazon EC2 (Self Managed)
• Amazon RDS (Relational Database Service)
• Amazon DynamoDB
• Amazon ElastiCache
• Amazon Redshift

1 Block Storage

Amazon EBS
• Persistent block level storage for EC2
• Pay only for what you provision
• Native redundancy and write cache
• Consistent and low-latency performance
• Optimized for random I/O
• Native support for encryption at rest (data volumes)

Amazon EBS
• Network attached block device
  – Independent data lifecycle
  – Virtual disks
  – Multiple volumes per EC2 instance
  – Only one EC2 instance at a time per volume
  – Can be detached from an instance and attached to a different one
• Raw block devices
  – Unformatted block devices
  – Ideal for databases, filesystems
• Available in multiple types

EBS Volume Types Comparison

• Magnetic
  – Performance: Lowest cost
  – Use cases: Infrequent data access
  – Media: Magnetic (HDD)
  – Max IOPS: 100 on average, with the ability to burst to hundreds of IOPS
  – Price: $.05/GB/Month, plus $.05/million I/O
• General Purpose (SSD)
  – Performance: Burstable
  – Use cases: Boot volumes, small to medium DBs, dev & test
  – Media: SSD
  – Max IOPS: Baseline 3 IOPS/GB, burstable to 3,000 IOPS
  – Price: $.10/GB/Month, I/O operations free
• Provisioned IOPS (SSD)
  – Performance: Predictable
  – Use cases: I/O intensive, relational & NoSQL
  – Media: SSD
  – Max IOPS: Consistently performs at provisioned level, up to 20,000 IOPS
  – Price: $.125/GB/Month, plus $.065/provisioned IOPS
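To make the price row concrete, here is a minimal sketch that estimates a monthly bill for each volume type using the prices quoted in the comparison. The function names and the sample workload (500 GB, 100 million I/O requests, 4,000 provisioned IOPS) are assumptions made for illustration, and the prices are the 2015 figures from this deck rather than current rates.

```python
# Prices as quoted in the comparison above (2015 figures from the deck).
MAGNETIC_PER_GB = 0.05
MAGNETIC_PER_MILLION_IO = 0.05
GENERAL_PURPOSE_PER_GB = 0.10
PROVISIONED_PER_GB = 0.125
PROVISIONED_PER_IOPS = 0.065


def magnetic_cost(size_gb: float, io_requests: int) -> float:
    """Monthly cost of a Magnetic volume: storage plus per-request I/O."""
    return MAGNETIC_PER_GB * size_gb + MAGNETIC_PER_MILLION_IO * io_requests / 1_000_000


def general_purpose_cost(size_gb: float) -> float:
    """Monthly cost of a General Purpose (SSD) volume: storage only."""
    return GENERAL_PURPOSE_PER_GB * size_gb


def provisioned_iops_cost(size_gb: float, provisioned_iops: int) -> float:
    """Monthly cost of a Provisioned IOPS (SSD) volume: storage plus IOPS."""
    return PROVISIONED_PER_GB * size_gb + PROVISIONED_PER_IOPS * provisioned_iops


if __name__ == "__main__":
    # Hypothetical workload used only to compare the three formulas.
    print(f"Magnetic:         ${magnetic_cost(500, 100_000_000):.2f}")
    print(f"General Purpose:  ${general_purpose_cost(500):.2f}")
    print(f"Provisioned IOPS: ${provisioned_iops_cost(500, 4_000):.2f}")
```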

IOPS Token Bucket Model
• Each token represents an "I/O credit" that pays for one read or one write.
• A bucket is associated with each General Purpose (SSD) volume, and can hold up to 5.4 million tokens.
• Tokens accumulate at a rate of 3 per configured GB per second, up to the capacity of the bucket.
• Tokens can be spent at up to 3,000 per second per volume.
• The baseline performance of the volume is equal to the rate at which tokens are accumulated: 3 IOPS per GB.
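The token bucket rules above can be sketched as a small simulation. This is a simplified model that uses only the four numbers stated on the slide (bucket size, refill rate, spend rate, one token per I/O). The function names and the one-second time step are assumptions, and it does not model any other limits a real volume may have.

```python
BUCKET_CAPACITY = 5_400_000  # maximum tokens (I/O credits) per volume
BURST_RATE = 3_000           # maximum tokens spent per second per volume
REFILL_PER_GB = 3            # tokens earned per configured GB per second


def baseline_iops(size_gb: int) -> int:
    """Baseline performance equals the token accumulation rate."""
    return REFILL_PER_GB * size_gb


def burst_seconds(size_gb: int) -> float:
    """Seconds a full bucket can sustain BURST_RATE before it runs dry."""
    net_drain = BURST_RATE - baseline_iops(size_gb)
    if net_drain <= 0:
        return float("inf")  # refill keeps up with the maximum spend rate
    return BUCKET_CAPACITY / net_drain


def simulate(size_gb: int, demand_iops: int, seconds: int,
             tokens: int = BUCKET_CAPACITY) -> tuple[int, int]:
    """Step the bucket one second at a time.

    Returns (total I/O served, tokens left in the bucket).
    """
    served_total = 0
    for _ in range(seconds):
        # Refill first, never exceeding the bucket capacity.
        tokens = min(BUCKET_CAPACITY, tokens + baseline_iops(size_gb))
        # Spend one token per I/O, limited by demand, burst rate and balance.
        served = min(demand_iops, BURST_RATE, tokens)
        tokens -= served
        served_total += served
    return served_total, tokens


if __name__ == "__main__":
    print(f"100 GB volume: baseline {baseline_iops(100)} IOPS, "
          f"full-bucket burst lasts {burst_seconds(100):.0f} s")
```

With an empty bucket the simulation serves only the baseline rate, which is the behaviour the last bullet describes.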


EBS Provisioned IOPS
• EBS-optimized instances
  – Dedicated storage throughput
• Predictable performance
  – 100-20,000 IOPS per volume
  – Single-digit millisecond latency
• Performance design
  – Deliver within 10% of provisioned IOPS, 99.9% of the time

Enhanced Throughput for PIOPS & GP2 Volumes
• Maximum attainable throughput to each volume was doubled to 128 MB/s of read or write traffic
• An I/O request of up to 256 KB is now counted as a single I/O operation (IOP)
• In many cases you can configure the block size used by your application
• Capable of dramatically reducing your storage costs
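One way to read the cost point: because a request of up to 256 KB counts as a single I/O, a larger application block size reaches the same throughput with fewer IOPS. The sketch below shows the arithmetic. The function name is hypothetical, 1 MB is assumed to be 1,024 KB, and requests larger than 256 KB are rejected rather than modelled.

```python
import math

MAX_THROUGHPUT_MB_S = 128  # per-volume throughput ceiling from the slide
MAX_IO_SIZE_KB = 256       # largest request counted as one I/O operation


def iops_needed(target_mb_s: float, io_size_kb: int) -> int:
    """IOPS required to reach a target throughput at a given block size."""
    if not 0 < io_size_kb <= MAX_IO_SIZE_KB:
        raise ValueError("this sketch only covers block sizes up to 256 KB")
    # Throughput cannot exceed the per-volume ceiling.
    target = min(target_mb_s, MAX_THROUGHPUT_MB_S)
    return math.ceil(target * 1024 / io_size_kb)  # assumes 1 MB = 1024 KB


if __name__ == "__main__":
    for block_kb in (16, 64, 256):
        print(f"{block_kb:>3} KB blocks: {iops_needed(128, block_kb)} IOPS for 128 MB/s")
```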

Amazon EBS at 20,000 IOPS
• Provisioned IOPS (SSD)
  – Max volume: 16 TB
  – Max I/O rate: 20,000 IOPS
  – Max throughput: 320 MB/s
• General Purpose (SSD)
  – Max volume: 16 TB
  – Max I/O rate: 10,000 IOPS
  – Max throughput: 160 MB/s

EBS Snapshots
[Diagram: within the AWS Cloud, EBS volumes are attached to EC2 instances in an Availability Zone. "Create Snapshot" stores EBS snapshots in Amazon S3; "Clone From Snapshot" creates new EBS volumes for EC2 instances.]

How Do Snapshots Work?
[Diagram: over time, Snapshot 1, Snapshot 2 and Snapshot 3 are stored in S3; blocks 1-4 of the EBS volume map to chunks 1-4.]

EC2 Instance Store (Ephemeral Volumes)
• Free with your EC2 instance
  – SAS and SSD options
  – Size/type based on instance type
• Local, direct attached resource
• Consistent sequential reads and writes
• Use only for non-persistent data

2 Shared File Systems

Elastic File System (EFS)
• Fully managed file system for EC2 instances
• Provides standard file system semantics
• Works with standard operating system APIs
• Sharable across thousands of instances
• Elastically grows to petabyte scale
• Delivers performance for a wide variety of workloads
• Highly available and durable
• NFS v4-based

EFS – Mounting
[Diagram: multiple EC2 instances mounting a single EFS file system]

EFS DNS name:
availability-zone.file-system-id.efs.aws-region.amazonaws.com

Mount on machine:
sudo mount -t nfs4 mount-target-DNS:/ ~/efs-mount-point
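As a small illustration of how the DNS name pattern and the mount command fit together, the sketch below fills in the template from the slide. The helper names and the sample Availability Zone, file system ID and region are hypothetical placeholders, not real resources.

```python
def mount_target_dns(availability_zone: str, file_system_id: str, region: str) -> str:
    """Build the mount target DNS name using the pattern shown on the slide."""
    return f"{availability_zone}.{file_system_id}.efs.{region}.amazonaws.com"


def mount_command(dns_name: str, mount_point: str = "~/efs-mount-point") -> str:
    """Build the NFSv4 mount command shown on the slide for a DNS name."""
    return f"sudo mount -t nfs4 {dns_name}:/ {mount_point}"


if __name__ == "__main__":
    # Hypothetical identifiers, used only to show the resulting strings.
    dns = mount_target_dns("us-west-2a", "fs-12345678", "us-west-2")
    print(dns)
    print(mount_command(dns))
```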

3 Object Stores

Amazon S3 (Simple Storage Service)
• Web accessible object store
• Pay for exactly what you use
• Highly durable (99.999999999% design)
• Limitlessly scalable
• Natively online
• Two flavors:
  – Standard Storage: $0.0300* per GB / mo
  – Standard - Infrequent Access Storage (min size 128 KB): $0.0125* per GB / mo + data retrieval cost

* (US East (N. Virginia) pricing)
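The two storage classes can be compared with a short calculation. This sketch uses the per-GB prices quoted above and reads "min size 128 KB" as a minimum billable object size, which is an interpretation. The retrieval price is not given in the deck, so it is a required parameter that the caller must supply, and the function names are hypothetical.

```python
STANDARD_PER_GB = 0.0300   # per GB-month, as quoted above
IA_PER_GB = 0.0125         # per GB-month, as quoted above
IA_MIN_OBJECT_KB = 128     # assumed to be the minimum billable object size
KB_PER_GB = 1024 * 1024


def standard_monthly_cost(object_count: int, avg_object_kb: float) -> float:
    """Monthly Standard storage cost for objects of a given average size."""
    return STANDARD_PER_GB * object_count * avg_object_kb / KB_PER_GB


def ia_monthly_cost(object_count: int, avg_object_kb: float,
                    retrieved_gb: float, retrieval_per_gb: float) -> float:
    """Monthly Infrequent Access cost: storage plus data retrieval."""
    # Objects below the minimum are billed as if they were 128 KB.
    billable_kb = max(avg_object_kb, IA_MIN_OBJECT_KB)
    stored_gb = object_count * billable_kb / KB_PER_GB
    return IA_PER_GB * stored_gb + retrieval_per_gb * retrieved_gb


if __name__ == "__main__":
    # Hypothetical data set: about one million 1 MB objects, 50 GB read per month.
    # The 0.01 retrieval price is a placeholder, not a figure from the deck.
    print(f"Standard: ${standard_monthly_cost(1_048_576, 1024):.2f}")
    print(f"IA:       ${ia_monthly_cost(1_048_576, 1024, 50, 0.01):.2f}")
```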

Amazon S3 (Simple Storage Service)
• Parallel I/O for max speed (Multipart Upload, Ranged GETs)
• Resource-level IAM permissions
• Bucket Policies & ACLs
• Direct access through APIs
• Server Side Encryption
• Static Website Hosting
• Data Lifecycle Rules
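The parallel I/O bullet relies on splitting an object into byte ranges that can be fetched independently. As a minimal sketch, the helper below produces standard HTTP Range header values for such a split. It makes no S3 calls, and the function name and part size are illustrative assumptions.

```python
def byte_ranges(object_size: int, part_size: int) -> list[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        # Range ends are inclusive, so the last byte of a part is start+size-1.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Each value could be sent as the Range header of a separate GET request.
    for value in byte_ranges(object_size=10, part_size=4):
        print(value)
```

Each range can then be requested on its own connection and the parts reassembled in order.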

Amazon Glacier
• Low-cost archival storage
• Secure
  – SSL & AES-256
• Durable
  – Designed for 99.999999999% durability
• Optimized for data archiving and backup
  – Suitable for RTO measured in hours
  – Includes storage costs and retrieval costs
• $0.007 per GB/Month (US East pricing)
• Integrated with S3

Amazon CloudFront
• Easy-to-use Content Delivery Network (CDN)
• Pay-as-you-go pricing
• Multiple origins: S3, EC2, on-premises
• Worldwide network of 53+ edge locations
• Video streaming
• Geo restriction
• Custom SSL certificates
• Dynamic content
• POST/PUT

4 On-Premises Storage Integration

AWS Storage Gateway
• VM appliance that runs on-premises
• Creates iSCSI volume mount points
• Directly interfaces with S3 or Glacier

• Gateway-Stored Volumes
• Gateway-Cached Volumes
• Virtual Tape Library

AWS Import/Export Snowball
• Petabyte-scale data transport
• Uses secure appliances
• Economical and fast
• Faster than the Internet for significant data sets
• Import into S3
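To see why shipping an appliance can beat the network for large data sets, it helps to estimate how long an upload would take over a given link. The sketch below does that arithmetic. The function name, the decimal unit conventions and the 80% link utilization are assumptions, and it says nothing about Snowball's own turnaround time.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link_mbps network link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Mbps = 10**6 bits/s) and an
    assumed fraction of the link that is actually available for the transfer.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid transfer parameters")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical example: 100 TB over a 100 Mbps link at 80% utilization.
    print(f"{transfer_days(100, 100):.1f} days")
```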

5 Structured Data Stores

Amazon RDS
A fully managed SQL database service
• Choice of database engines
• Simple to deploy and scale
• Reliable and cost effective
• Without any operational burden

Amazon Aurora

If you host your databases on-premises
• You: app optimization, scaling, high availability, database backups, DB s/w patches, DB s/w installs, OS patches, OS installation, server maintenance, rack & stack, power, HVAC, net

If you host your databases in EC2
• You: app optimization, scaling, high availability, database backups, DB s/w patches, DB s/w installs, OS patches
• AWS: OS installation, server maintenance, rack & stack, power, HVAC, net

If you choose a managed DB service like RDS
• You: app optimization
• AWS: scaling, high availability, database backups, DB s/w patches, DB s/w installs, OS patches, OS installation, server maintenance, rack & stack, power, HVAC, net

Traditional Database Architecture
[Diagram: Client Tier, App/Web Tier, RDBMS]
One database for all workloads:
• key-value access
• complex queries
• transactions
• analytics

Cloud Data Tier Architecture
[Diagram: Client Tier, App/Web Tier, Data Tier]
Best database for each workload. The Data Tier contains:
• Cache
• Data Warehouse
• NoSQL
• RDBMS

Workload Driven Data Store Selection
• Hot reads: Cache (Amazon ElastiCache)
• Analytics: Data Warehouse (Amazon Redshift)
• Key/value, simple query: NoSQL (Amazon DynamoDB)
• Complex queries & transactions: RDBMS (Amazon RDS)
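The workload-to-service pairing on this slide amounts to a lookup table. The sketch below encodes it as one; the workload labels come from the slide, while the dictionary and function names are hypothetical.

```python
# Workload labels and services as paired on the slide.
DATA_STORE_FOR_WORKLOAD = {
    "hot reads": "Amazon ElastiCache",
    "analytics": "Amazon Redshift",
    "key/value simple query": "Amazon DynamoDB",
    "complex queries & transactions": "Amazon RDS",
}


def choose_data_store(workload: str) -> str:
    """Return the data store the slide pairs with a workload label."""
    key = workload.strip().lower()
    if key not in DATA_STORE_FOR_WORKLOAD:
        known = ", ".join(sorted(DATA_STORE_FOR_WORKLOAD))
        raise ValueError(f"unknown workload {workload!r}; expected one of: {known}")
    return DATA_STORE_FOR_WORKLOAD[key]


if __name__ == "__main__":
    for label in DATA_STORE_FOR_WORKLOAD:
        print(f"{label} -> {choose_data_store(label)}")
```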

Amazon DynamoDB
• Fully managed NoSQL database service
• Massively scalable, distributed key/value store
• Reserved capacity model
• Fast and predictable
• Built-in fault tolerance
• Strong consistency model
• Unlimited potential storage and throughput

Amazon ElastiCache
• In-memory cache in the cloud
• Improve latency and throughput for read-heavy workloads
• Supports open-source caching engines
  – Memcached
  – Redis
• Examples
  – Caching of MySQL database query results
  – Caching of complex query post-processing results
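Caching query results usually follows a cache-aside pattern: look in the cache first, and run the query only on a miss. The sketch below shows that pattern with an in-process dictionary standing in for Memcached or Redis. The class name, the TTL default and the injected `run_query` callable are assumptions, and no ElastiCache client API is used.

```python
import time
from typing import Any, Callable


class QueryCache:
    """Cache-aside wrapper around a query function, with per-entry expiry."""

    def __init__(self, run_query: Callable[[str], Any],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query   # stand-in for a real database call
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}  # sql -> (expiry, result)

    def get(self, sql: str) -> Any:
        """Return a cached result if still fresh, otherwise query and store it."""
        now = self._clock()
        entry = self._entries.get(sql)
        if entry is not None and entry[0] > now:
            return entry[1]                     # cache hit
        result = self._run_query(sql)           # cache miss: go to the database
        self._entries[sql] = (now + self._ttl, result)
        return result


if __name__ == "__main__":
    def slow_query(sql: str) -> list[tuple[str, int]]:
        print(f"running against the database: {sql}")
        return [("alice", 3)]

    cache = QueryCache(slow_query, ttl_seconds=30.0)
    cache.get("SELECT name, orders FROM customers")  # miss, hits the database
    cache.get("SELECT name, orders FROM customers")  # hit, served from memory
```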

Amazon Redshift
• Fast and powerful, petabyte-scale data warehouse
  – Fully managed
  – Highly parallel
  – Columnar data store
• Data warehouse-type queries
  – Aggregations, historical analysis
  – BI tool integration
• Grow with your data
  – 160 GB to 1.6 PB
• SSD and SAS options
  – SSD provides 10-15x perf @ 5.5x the cost/TB/year
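The SSD versus SAS trade-off can be reduced to performance per unit cost. The short sketch below divides the quoted performance multiple by the quoted cost multiple; the function name is hypothetical and the result is only as good as the two figures on the slide.

```python
SSD_COST_MULTIPLE = 5.5  # SSD cost per TB per year relative to SAS, from the slide


def perf_per_cost_ratio(perf_multiple: float,
                        cost_multiple: float = SSD_COST_MULTIPLE) -> float:
    """Performance gained per unit of extra cost, relative to SAS."""
    if cost_multiple <= 0:
        raise ValueError("cost_multiple must be positive")
    return perf_multiple / cost_multiple


if __name__ == "__main__":
    # The slide quotes a 10-15x performance range for SSD.
    low, high = perf_per_cost_ratio(10), perf_per_cost_ratio(15)
    print(f"SSD delivers roughly {low:.1f}x to {high:.1f}x the performance per dollar")
```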

Using Multiple Storage Options Together
• EBS + S3: snapshots
• S3 + EC2 Instance Store: caching
• S3 + CloudFront: edge caching
• S3 + Glacier: data lifecycle archiving
• RDS + ElastiCache: cached queries

It's all about choice
• Performance-oriented
• Cost-oriented

Any Questions?
