Early Experience and Evaluation of File Systems on SSD with Database Applications
Yongkun WANG, Kazuo GODA, Miyuki NAKANO, Masaru KITSUREGAWA
The University of Tokyo


Outline
• Motivation
• Flash SSD
• Basic Performance Study
• Performance Evaluation by TPC-C Benchmark
• Conclusion and Future Work

Motivation
• Flash SSDs are likely to be used in enterprise storage platforms for achieving high performance in data-intensive applications
• IO path management techniques should be evaluated carefully
  – Existing systems are designed for traditional hard disks
  – The IO performance characteristics of flash SSDs differ from those of hard disks
• For better utilization of SSDs in DBMSs
  – Evaluate the basic performance of SSDs
  – Evaluate the performance of the IO path in conventional DBMSs, with different file systems and IO schedulers

Flash SSD
• Flash SSD (Solid State Drive)
  – A package of multiple flash memory chips
  – The FTL (Flash Translation Layer) provides block device emulation
• Performance properties of flash memory (Samsung K9XXG08UXM)
  – READ (4KB) takes 25us
  – PROGRAM (4KB) takes 200us
  – ERASE (256KB) takes 1500us
• The erase-before-program design can lead to poor performance in a normal in-place-write system (a rough cost sketch follows below)
[Diagram: SSD internals: SATA port, controller chip running the FTL, SDRAM buffer, and multiple NAND flash memory chips attached via the flash memory bus]
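To see why erase-before-program hurts a naive in-place-write design, here is a rough cost sketch using the timings above. It assumes 4KB pages, a 256KB erase block, and a read-erase-rewrite of the whole block for a single in-place page update; a real FTL avoids exactly this by remapping writes out of place.

  /* Rough cost model for updating one 4KB page in place, using the flash
   * timings quoted on this slide (25us read, 200us program, 1500us erase). */
  #include <stdio.h>

  int main(void)
  {
      const double read_us = 25.0, program_us = 200.0, erase_us = 1500.0;
      const int pages_per_block = 256 / 4;   /* 256KB erase block / 4KB page = 64 */

      /* Naive in-place update: read the other valid pages, erase the block,
       * then program every page back. */
      double in_place = (pages_per_block - 1) * read_us   /* ~1,575 us */
                      + erase_us                          /* 1,500 us  */
                      + pages_per_block * program_us;     /* 12,800 us */

      /* Out-of-place update (what an FTL does): program the new page elsewhere. */
      double out_of_place = program_us;                   /* 200 us */

      printf("in-place: %.0f us, out-of-place: %.0f us (%.0fx slower)\n",
             in_place, out_of_place, in_place / out_of_place);
      return 0;
  }

Under these assumptions a single in-place page update costs roughly 15,875us versus 200us for an out-of-place program, about 80x, which is why FTLs remap writes rather than updating flash pages in place.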

Outline
• Motivation
• Flash SSD
• Basic Performance Study
• Performance Evaluation by TPC-C Benchmark
• Conclusion and Future Work

Purpose of Basic Performance Study
• Clarify the performance difference between SSD and HDD
• Clarify the performance differences among SSDs
• Clarify the erase problem on SSDs

Experimental System
• Dell Precision 390 Workstation: dual-core Intel Core 2 Duo 1.86GHz, 2GB memory, SATA 3.0Gbps controller, CentOS 5.2 64-bit, kernel 2.6.18
• Hard Disk (HDD): Hitachi HDS72107, 3.5", 7200RPM, 32MB cache, 750GB
• Flash SSD: Mtron PRO 7500, SLC, 3.5", 32GB
• Flash SSD: Intel X25-E, SLC, 2.5", 64GB
• Flash SSD: OCZ VERTEX EX, SLC, 2.5", 120GB
• Inside each device, read-ahead pre-fetching and write-back caching are enabled

Micro Benchmark
• One million requests for each case
• Request size: 512B to 256KB
• Access patterns
  – Sequential Read/Write
  – Random Read/Write
  – Mixed Random (50% read plus 50% write)
• Number of outstanding IOs (a minimal sketch follows below)
  – One outstanding IO: submit one IO request at a time
  – 30 outstanding IOs: submit 30 IO requests at a time
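The slides do not include the benchmark source, but a minimal sketch of the single-outstanding-IO random-read case might look as follows. The device path, capacity, and fixed 4KB request size are illustrative assumptions; the 30-outstanding-IO case would need asynchronous submission (e.g. Linux libaio) instead of this blocking pread loop.

  /* Minimal random-read microbenchmark sketch (one outstanding IO).
   * O_DIRECT bypasses the OS page cache so the device itself is measured. */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
      const char  *dev      = "/dev/sdb";                 /* hypothetical test device */
      const size_t io_size  = 4096;                       /* one point in the 512B..256KB sweep */
      const long   requests = 1000000;                    /* one million requests per case */
      const off_t  dev_size = 32LL * 1024 * 1024 * 1024;  /* e.g. the 32GB Mtron SSD */

      int fd = open(dev, O_RDONLY | O_DIRECT);
      if (fd < 0) { perror("open"); return 1; }

      void *buf;
      if (posix_memalign(&buf, 512, io_size) != 0) return 1;  /* O_DIRECT needs alignment */

      srand((unsigned)time(NULL));
      for (long i = 0; i < requests; i++) {
          /* pick a random, io_size-aligned offset and issue one blocking read */
          off_t off = ((off_t)(rand() % (long)(dev_size / io_size))) * (off_t)io_size;
          if (pread(fd, buf, io_size, off) != (ssize_t)io_size) { perror("pread"); break; }
      }
      free(buf);
      close(fd);
      return 0;
  }

Timing the loop and dividing the bytes transferred by the elapsed time gives the MB/s and IOPS figures reported on the following slides.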

Basic Performance of Flash SSDs ~ Sequential Access ~
[Figure: IO throughput [MB/s] versus IO size (512B to 256KB) for the HDD and the Mtron, Intel, and OCZ SSDs, with 100% read and 100% write curves]
• The read throughput of Intel's SSD and OCZ's SSD is much higher
• The write throughput of Intel's SSD is higher
• The write throughput of Intel's SSD drops quickly once the request size exceeds 32KB
• The performance gap between read and write throughput of OCZ's SSD is large

Basic Performance of Flash SSDs ~ Random Access (Single outstanding IO) ~
[Figure: IO throughput [K IOPS] versus IO size for the HDD and the Mtron, Intel, and OCZ SSDs, with 100% read, 100% write, and 50% read / 50% write mixed curves]
• The read IOPS of the SSDs is much higher than that of the HDD
• The random write performance drops drastically on Mtron's SSD and OCZ's SSD
• The mixed-access performance also drops drastically on Mtron's SSD and OCZ's SSD [bathtub effect, Freitas, FAST 2010 tutorial]

Basic Performance of Flash SSDs ~ Random Access (30 outstanding IOs) ~
[Figure: IO throughput [K IOPS] versus IO size for the HDD and the Mtron, Intel, and OCZ SSDs, comparing 100% read with 30 outstanding IOs against 100% read with one outstanding IO]
• The read throughput is improved on Intel's SSD and OCZ's SSD

Basic Performance of Flash SSDs ~ Response Time Distribution of 4KB Random Access ~
[Figure: cumulative frequency [%] versus response time [us] (1 to 1,000,000, log scale) for the HDD and the Mtron, Intel, and OCZ SSDs, with 100% read, 100% write, and 50% read / 50% write mixed curves]
• Random read (blue line): most of the random reads complete within a very small range of response times on the SSDs
• Random write (red line): the random write behavior differs among the three SSDs

Outline
• Motivation
• Flash SSD
• Basic Performance Study
• Performance Evaluation by TPC-C Benchmark
• Conclusion and Future Work

Purpose of Evaluation by TPC-C
• Evaluate the IO behavior of SSDs running an actual database application
  – Two file systems, two DBMSs, and four IO schedulers
• Investigate the detailed behavior of the IO path

Experimental System
• Dell Precision 390 Workstation: dual-core Intel Core 2 Duo 1.86GHz, 2GB memory, SATA 3.0Gbps controller, CentOS 5.2 64-bit, kernel 2.6.18
• Hard Disk (HDD): Hitachi HDS72107, 3.5", 7200RPM, 32MB cache, 750GB
• Flash SSD: Mtron PRO 7500, SLC, 3.5", 32GB
• Flash SSD: Intel X25-E, SLC, 2.5", 64GB
• Flash SSD: OCZ VERTEX EX, SLC, 2.5", 120GB
• Inside each device, read-ahead pre-fetching and write-back caching are enabled

System Configuration
• TPC-C benchmark 5.10
• Database settings
  – MySQL: InnoDB
  – Commercial DBMS
• File system options
  – Ext2fs (ext2)
  – Nilfs2
• IO scheduler
  – No operation (Noop)
  – Anticipatory
  – Deadline
  – Completely Fair Queuing (CFQ)
[Diagram: Database Application (TPC-C Benchmark) → DBMS (MySQL, Commercial DBMS) → OS kernel (File System: ext2fs, nilfs2; Kernel Tracer; IO Schedulers) → Device Driver (SATA) → Disk for OS, HDD for Database, Flash SSDs for Database]

Configuration of TPC-C Benchmark
• 30 warehouses, with 30 virtual users
• "Key and Think" time was 0
• DBMS configuration for TPC-C benchmark (a hedged MySQL configuration sketch follows below):

  Setting               Commercial DBMS                         MySQL (InnoDB)
  Data buffer size      8MB                                     4MB
  Log buffer size       5MB                                     2MB
  Data block size       4KB                                     16KB
  Data file             fixed, 5.5GB (database size is 2.7GB)   fixed, 5.5GB (database size is 2.7GB)
  Synchronous IO        Yes                                     Yes
  Log flushing method   flush log at transaction commit         flush log at transaction commit
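For the MySQL (InnoDB) column, a configuration along these lines would approximate the settings in the table. This is a sketch, not the authors' published configuration file; the option names are standard InnoDB parameters, and mapping "Synchronous IO: Yes" to a particular flush method is an assumption.

  # my.cnf fragment (sketch) matching the InnoDB column above
  [mysqld]
  innodb_buffer_pool_size        = 4M        # data buffer size
  innodb_log_buffer_size         = 2M        # log buffer size
  # InnoDB's data block (page) size is 16KB, matching the "data block size" row
  innodb_data_file_path          = ibdata1:5500M   # fixed-size data file, ~5.5GB
  innodb_flush_log_at_trx_commit = 1         # flush the log at every transaction commit
  innodb_flush_method            = O_DIRECT  # assumed flush method for "Synchronous IO: Yes"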

File Systems
• Ext2fs (ext2)
  – In-place update
  – Read: seek, then read
  – Update: seek, then update the page in place (the old page is overwritten)
• Nilfs2
  – An example of a log-structured file system (LFS)
  – Read: seek, then read
  – Update: random writes become sequential writes appended to the log, and the old data pages become obsolete
[Diagram: data pages a, b, c, d pass through the buffer; ext2 overwrites each page at its original location on disk, while nilfs2 appends new versions a', b', c', d' and leaves the old pages obsolete]
(a minimal sketch of the two write patterns follows below)
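To make the contrast concrete, here is a minimal sketch of the two write patterns, assuming 4KB pages and a file descriptor standing in for the disk; it only illustrates the idea, not the actual ext2 or nilfs2 code paths.

  /* Sketch: in-place update (ext2-style) vs. log-structured append (nilfs2-style). */
  #include <sys/types.h>
  #include <unistd.h>

  #define PAGE_SIZE 4096

  /* ext2-style: overwrite the page at its fixed location (seek, then update). */
  ssize_t update_in_place(int disk_fd, off_t page_no, const void *page)
  {
      return pwrite(disk_fd, page, PAGE_SIZE, page_no * PAGE_SIZE);
  }

  /* LFS-style: always append the new page version at the log tail, so scattered
   * random writes reach the device as one sequential stream; the old copy of the
   * page becomes obsolete and is reclaimed later by the cleaner. */
  ssize_t update_log_structured(int disk_fd, off_t *log_tail, const void *page)
  {
      ssize_t n = pwrite(disk_fd, page, PAGE_SIZE, *log_tail);
      if (n == PAGE_SIZE)
          *log_tail += PAGE_SIZE;   /* the next write goes right after this one */
      return n;
  }

The append-only pattern is what turns the DBMS's random page updates into the large sequential writes observed at the physical IO layer later in this evaluation.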

Experimental Study
• Transaction Throughput
• IO Throughput
• Buffer Size
• Workload Property
• IO Scheduler

Transaction Throughput
[Figure: transaction throughput [tpm] of ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]
• Intel's SSD is better than the HDD
• Mtron's SSD is better than the HDD with LFS (nilfs2)
• OCZ's SSD is better than the HDD with ext2fs
• The performance difference is caused by the combination of SSD and file system

IO Path Investigation
• Logical IO is captured at the system call level, where the DBMS calls the service routines of the OS kernel
• Physical IO is captured at the device driver level, where the IO requests have been sorted and merged and are ready to be served by the device
[Diagram: Database Application (TPC-C Benchmark) → DBMS (MySQL, Commercial DBMS) → OS kernel (File System, Kernel Tracer, IO Schedulers) → Device Driver (SATA); logical IO is observed at the system call level, physical IO at the device driver level]

Logical IO Throughput
• The transaction throughput follows the logical IO throughput
[Figure: transaction throughput [tpm] and logical IO throughput (read/write rate issued by the DBMS [MB/s]) for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Physical IO Throughput
[Figure: logical IO throughput (read/write rate issued by the DBMS [MB/s]) compared with physical IO throughput (read/write rate to the device [MB/s]) for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Physical IO Throughput (Read)
• A large amount of reads is absorbed by the file system buffer cache
[Figure: logical versus physical read/write rate [MB/s] for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Physical IO Throughput (Write, ext2fs)
• A large amount of reads is absorbed by the file system buffer cache
• For ext2fs, the write throughput is almost the same at the logical and the physical level (synchronous IO)
[Figure: logical versus physical read/write rate [MB/s] for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Physical IO Throughput (Write, nilfs2)
• A large amount of reads is absorbed by the file system buffer cache
• For ext2fs, the write throughput is almost the same at the logical and the physical level (synchronous IO)
• LFS (nilfs2) produces additional writes at the physical IO layer, which has a serious impact on the overall transaction throughput
[Figure: logical versus physical read/write rate [MB/s] for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Physical IO Size
• The average request size of physical IO
[Figure: average physical read/write request size [bytes] for ext2fs and nilfs2 on HDD, Mtron, Intel, and OCZ, under the commercial DBMS and MySQL; the nilfs2 write sizes (about 107,423 to 186,352 bytes) are far larger than all other categories]

Physical IO Size (HDD, Mtron)
• The average request size of physical IO
• The average write size of LFS (nilfs2) is much larger than that of ext2fs, which is beneficial for the hard disk and for some SSDs such as Mtron's SSD
[Figure: the average read/write request size chart from the previous slide alongside the sequential-access throughput curves (IO throughput [MB/s] versus IO size) of the HDD and Mtron's SSD]

Physical IO Size (Intel, OCZ)
• The average request size of physical IO
• The average write size of LFS (nilfs2) is much larger than that of ext2fs, which is beneficial for the hard disk and for some SSDs such as Mtron's SSD
• A large write size is not beneficial on Intel's and OCZ's SSDs, as shown in the basic performance study; this helps to explain the inferior transaction throughput on nilfs2
[Figure: the average read/write request size chart alongside the sequential-access throughput curves (IO throughput [MB/s] versus IO size) of Intel's and OCZ's SSDs]

Database Buffer Size (Mtron)
• The transaction throughput improves as the database buffer size increases
[Figure: transaction throughput [tpm] versus buffer size for ext2fs and nilfs2 on Mtron's SSD; commercial DBMS (buffer 8M to 1G) and MySQL (buffer 4M to 1G)]

Workload Property
• Measured with three types of workloads
• The speedup of nilfs2 over ext2fs increases as the percentage of read-write transactions increases

  Transaction Type   IO Property   % of mix (read intensive / normal / write intensive)
  New Order          Read-Write     4.35 / 43.48 / 96.00
  Payment            Read-Write     4.35 / 43.48 /  1.00
  Delivery           Read-Write     4.35 /  4.35 /  1.00
  Stock Level        Read-Only     43.48 /  4.35 /  1.00
  Order Status       Read-Only     43.48 /  4.35 /  1.00

[Figure: transaction throughput [tpm] of ext2fs and nilfs2 and the speedup of nilfs2 over ext2fs for the read-intensive, normal, and write-intensive workloads, under the commercial DBMS and MySQL]

IO Schedulers
• Noop
  – No operation
• Anticipatory
  – Merges IO requests and re-orders them in an elevator manner
• Deadline
  – Imposes a deadline on each request
• Completely Fair Queuing (CFQ)
  – Balances the IO service time among processes
(a sketch of selecting the scheduler per device follows below)
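On the Linux kernel used here (2.6.18), the IO scheduler can be selected per block device through sysfs; a minimal sketch, where the device name sdb is an assumption for illustration:

  /* Sketch: select the Noop IO scheduler for one block device via sysfs.
   * Writing "anticipatory", "deadline", or "cfq" selects the other schedulers. */
  #include <stdio.h>

  int main(void)
  {
      FILE *f = fopen("/sys/block/sdb/queue/scheduler", "w");
      if (!f) { perror("fopen"); return 1; }
      fputs("noop\n", f);   /* the kernel switches the scheduler on this write */
      fclose(f);
      return 0;
  }

Reading the same sysfs file back lists the available schedulers with the active one shown in brackets.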

Transaction Throughput with IO Schedulers
• IO scheduling does not greatly affect the transaction throughput
[Figure: transaction throughput [tpm] with the Noop, Anticipatory, Deadline, and CFQ schedulers for ext2fs and nilfs2 on Mtron, Intel, and OCZ, under the commercial DBMS and MySQL]

Conclusion and Future Work
• We study the basic performance characteristics of flash SSDs
• We measure and analyze the application performance and the IO behavior on three flash SSDs and two file systems with the TPC-C benchmark
  – Transaction throughput
  – Logical IO throughput
  – Physical IO throughput
• We plan to study IO path management techniques for database applications running on flash SSDs

Q&A
Thank you very much!
