IT@Intel Brief Intel IT 64-Bit Computing October 2013

Increasing EDA Throughput with New Intel® Xeon® Processor E5-2600 v2 Product Family

• Up to 31.77x increased throughput compared with single-core Intel® Xeon® processor
• Up to 10.65x faster compared with dual-core Intel® Xeon® processor 5160

Intel’s silicon design engineers need significant increases in computing capacity to deliver each new generation of silicon chips. To meet those requirements, Intel IT conducts ongoing performance tests, using the latest Intel silicon design data, to analyze the benefits of introducing compute servers based on new, more powerful processors into our electronic design automation (EDA) computing environment.

We recently tested a dual-socket server based on the latest Intel® Xeon® processor E5-2680 v2, running single-threaded, multi-threaded, and distributed EDA applications operating on more than 500 Intel silicon design workloads. By utilizing all available cores, the server completed workloads up to 31.77x faster than a server based on a 64-bit Intel® Xeon® processor (3.6 GHz) with a single core, as shown in Figure 1. The server was up to 10.65x faster than a server based on Intel® Xeon® processor 5160 (3.0 GHz) with two cores.

Based on our performance assessment, we plan to deploy servers based on the new Intel® Xeon® processor E5-2600 v2 product family this year, continuing our replacement of older servers based on quad-core Intel® Xeon® processor 5400 series and beginning replacement of quad-core Intel® Xeon® processor 5500 series. By doing so, we expect to significantly increase EDA throughput while realizing savings, because we can avoid data center construction and reduce power consumption.

[Bar chart: Electronic Design Automation (EDA) Application Performance, All Cores Loaded (2004-2013). Vertical axis: throughput relative to the 64-bit Intel® Xeon® processor with 1 MB L2 cache (3.6 GHz); higher is better. Workloads: Simulation (113 jobs), Physical Verification Design Rule Check (240 jobs), Physical Verification Node Antenna Check (240 jobs), Timing Analysis (240 jobs), and OPC (561 templates). Series: 64-bit Intel® Xeon® processor with 1 MB L2 cache (3.6 GHz), Intel® Xeon® processor 5160 (3.0 GHz), X5365 (3.0 GHz), X5460 (3.16 GHz), X5570 (2.93 GHz), X5675 (3.06 GHz), E5-2680 (2.7 GHz), and E5-2680 v2 (2.8 GHz). Labeled values for the E5-2680 v2: 32.01, 25.86, 24.59, 31.61, and 31.77.]

Figure 1. This graph summarizes EDA test results, comparing relative throughput.

Background

Silicon chip design engineers at Intel face ongoing challenges: integrating more features into ever-shrinking silicon chips, bringing products to market faster, and keeping design engineering and manufacturing costs low.

As design complexity increases, the requirements for compute capacity also increase, so refreshing servers and workstations with higher performing systems is cost-effective and offers a competitive advantage by enabling faster chip design.

Intel IT conducts ongoing performance tests, based on the latest Intel silicon design data, to analyze the potential performance and data center benefits of introducing servers based on new processors into our EDA computing environment.

Refreshing older servers also enables us to realize data center cost savings. By taking advantage of the performance and power-efficiency improvements in new server generations, we can increase computing capacity within the same data center footprint, avoiding expensive data center construction and achieving operational cost savings due to reduced power consumption.

While our assessments focus on EDA applications, throughput improvements may also be achieved with other applications used in high-performance computing environments where simulation and verification are large parts of the workflow, including:

• Computational fluid dynamics and simulation in the aeronautical and automobile industries
• Simulation in the oil and gas industries
• Synthesis and simulation applications in the life sciences

Test Methodology

We ran several tests using industry-leading EDA single-threaded, multi-threaded, and distributed applications comprising more than 500 Intel® processor and chipset design workloads.

We ran tests on dual-socket servers based on Intel Xeon processor E5-2680 v2. This processor includes new features designed to increase throughput compared with previous processor generations, including 22nm process technology, up to 10 cores, and up to 25 MB L3 cache. Figure 2 illustrates some of the enhancements that boost EDA application performance.

| | 2004–2005 | 2006–2008 | 2009–2011 | 2012 | 2013 |
|---|---|---|---|---|---|
| Process Technology | 90nm | 65nm and 45nm | 45nm and 32nm | 32nm | 22nm |
| Cores per Socket | 1 | 2 or 4 | 4 or 6 | 8 | 10 |
| Cache | 1 MB or 2 MB (1) | 4 MB or 6 MB shared between 2 cores | 8 MB or 12 MB shared | 20 MB shared | 25 MB shared |
| DIMMs | Up to 8 | Up to 16 | Up to 18 | Up to 24 | Up to 24 |
| RAM Type | DDR2-400 | FB-DIMM/DDR2-667 or FB-DIMM/DDR2-800 | DDR3-800/1066/1333 MHz | DDR3-1333/1600 MHz | DDR3-1333/1600/1866 MHz |
| Maximum Memory Capacity | 16 GB | 64 GB or 128 GB (2) | 144 GB or 288 GB (3) | Up to 768 GB (4) | Up to 1536 GB (5) |

DDR - double data rate; DIMM - dual in-line memory module; FB-DIMM - fully buffered dual in-line memory module; Intel® QPI - Intel® QuickPath Interconnect
(1) Data provided only for 1 MB cache
(2) 128 GB support with Intel® 5400 Chipset introduced in 2007
(3) 144 GB assumes 18 memory slots populated with 8-GB DIMMs; 288 GB assumes 18 memory slots populated with 16-GB DIMMs, and validated only with Intel® Xeon® processor 5600 series
(4) 768 GB assumes 24 memory slots populated with 32-GB DIMMs
(5) 1536 GB assumes 24 memory slots populated with 64-GB DIMMs

[Block diagrams for each period, not reproduced here, label the chipsets (Intel® E7520, 5400, 5520, and C600), the memory types (DDR2, FB-DIMM, DDR3), and bandwidths of 6.4 GB/s, 21-25 GB/s, up to 32 GB/s, up to 51.2 GB/s, and up to 59.7 GB/s, with 25.6 GB/s or 32 GB/s per Intel® QPI link.]

Figure 2. A comparison of dual-socket servers based on Intel® Xeon® processors.

Our goal was to assess throughput improvement by measuring the time taken to complete a specific number of design workloads. To maximize throughput, we configured each application to utilize all available cores, resulting in one job or process per core as shown in Table 1.

We then compared our results with previous tests conducted using the same approach on servers based on the following processors:

• Single-core 64-bit Intel Xeon processor with 1-MB L2 cache (3.6 GHz), introduced in 2004
• Intel Xeon processor 5160, introduced in 2006
• Intel® Xeon® processor X5365, introduced in 2007
• Intel® Xeon® processor X5460, introduced in 2007
• Intel® Xeon® processor X5570, introduced in 2009
• Intel® Xeon® processor X5675, introduced in 2011
• Intel® Xeon® processor E5-2680, introduced in 2012

Test system configurations are shown in Table 2.

Results

Results are shown in Figure 1 and in Tables 1 and 3. The Intel Xeon processor E5-2680 v2-based server completed the tests up to 31.77x faster than a server based on the single-core 64-bit Intel Xeon processor, and up to 10.65x faster than a server based on Intel Xeon processor 5160.

Maximizing Throughput with Intel® Hyper-Threading Technology

Intel® Xeon® processor E5-2680 v2 with Intel® Hyper-Threading Technology (Intel® HT Technology) can support up to 40 concurrent software threads in a single two-socket platform. Intel HT Technology can help deliver higher performance throughput, as shown in the comparison below. Intel HT Technology delivered up to a 1.33x benefit when completing the same number of jobs using 2x the application licenses.

Comparison of Intel® Xeon® Processor E5-2680 v2 with Intel® HT Technology (higher is better)

| Intel® HT Technology | Time to Complete 113 Simulation Jobs | Relative Throughput |
|---|---|---|
| Disabled | 02:29:23 | 1.00 |
| Enabled | 01:52:16 | 1.33 |
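The Hyper-Threading figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes two hardware threads per core with Intel HT Technology enabled, and the helper name `to_seconds` is hypothetical. The run times are the two values reported for the 113 simulation jobs.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Platform described in the brief: two sockets, 10 cores per socket.
SOCKETS = 2
CORES_PER_SOCKET = 10
THREADS_PER_CORE = 2  # assumption: two threads per core with HT enabled

hardware_threads = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE

# Time to complete 113 simulation jobs, as reported in the comparison.
ht_disabled = to_seconds("02:29:23")
ht_enabled = to_seconds("01:52:16")

# Same job count, so relative throughput is the ratio of the run times.
ht_speedup = ht_disabled / ht_enabled

print(hardware_threads)      # 40
print(round(ht_speedup, 2))  # 1.33
```

Because both runs complete the same 113 jobs, the ratio of run times is also the ratio of throughputs.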

Table 1. Electronic Design Automation Summary Test Results Showing Relative Throughput of 64-Bit Intel® Processors
Note: Same application binary used across all the platforms. All columns are Intel® Xeon® processors.

SUMMARY TEST RESULTS: RELATIVE THROUGHPUT USING 64-BIT INTEL XEON PROCESSOR WITH 1 MB L2 CACHE AS BASELINE

| Workload | 64-bit Xeon, 1 MB L2 (3.6 GHz) | Xeon 5160 (3.0 GHz) | Xeon X5365 (3.0 GHz) | Xeon X5460 (3.16 GHz) | Xeon X5570 (2.93 GHz) | Xeon X5675 (3.06 GHz) | Xeon E5-2680 (2.7 GHz) | Xeon E5-2680 v2 (2.8 GHz) |
|---|---|---|---|---|---|---|---|---|
| Simulation (113 jobs) | 1.00 | 3.58 | 5.65 | 5.91 | 12.98 | 18.63 | 25.87 | 32.01 |
| Physical Verification DRC (240 jobs) | 1.00 | 4.24 | 8.22 | 9.32 | 9.89 | 14.98 | 20.70 | 25.86 |
| Physical Verification NAC (240 jobs) | 1.00 | 3.64 | 6.50 | 7.50 | 8.84 | 12.59 | 19.66 | 24.59 |
| Timing Analysis (240 jobs) | 1.00 | 4.62 | 9.90 | 10.71 | 11.70 | 18.15 | 23.72 | 31.61 |
| OPC (561 templates) | 1.00 | 2.98 | 5.00 | 6.60 | 11.39 | 16.73 | 25.99 | 31.77 |

SUMMARY TEST RESULTS: RELATIVE THROUGHPUT USING 64-BIT INTEL XEON PROCESSOR 5160 AS BASELINE

| Workload | 64-bit Xeon, 1 MB L2 (3.6 GHz) | Xeon 5160 (3.0 GHz) | Xeon X5365 (3.0 GHz) | Xeon X5460 (3.16 GHz) | Xeon X5570 (2.93 GHz) | Xeon X5675 (3.06 GHz) | Xeon E5-2680 (2.7 GHz) | Xeon E5-2680 v2 (2.8 GHz) |
|---|---|---|---|---|---|---|---|---|
| Simulation (113 jobs) | NA | 1.00 | 1.58 | 1.65 | 3.63 | 5.20 | 7.22 | 8.94 |
| Physical Verification DRC (240 jobs) | NA | 1.00 | 1.94 | 2.20 | 2.33 | 3.53 | 4.88 | 6.09 |
| Physical Verification NAC (240 jobs) | NA | 1.00 | 1.79 | 2.06 | 2.43 | 3.46 | 5.40 | 6.75 |
| Timing Analysis (240 jobs) | NA | 1.00 | 2.14 | 2.32 | 2.53 | 3.93 | 5.13 | 6.84 |
| OPC (561 templates) | NA | 1.00 | 1.68 | 2.21 | 3.82 | 5.61 | 8.71 | 10.65 |

DRC - design rule check; NAC - node antenna check; OPC - optical proximity correction; NA - not applicable
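The two halves of Table 1 describe the same measurements against different baselines, so the second half can be approximated from the first by dividing each row by its Intel Xeon processor 5160 entry. The sketch below illustrates that re-baselining; the names `rebaseline`, `VS_SINGLE_CORE`, and `PUBLISHED_VS_5160` are hypothetical. Small differences in the last digit are expected because the inputs are already rounded to two decimals.

```python
# Relative throughput versus the single-core baseline, copied from Table 1.
# Column order: 64-bit Xeon, 5160, X5365, X5460, X5570, X5675, E5-2680, E5-2680 v2.
VS_SINGLE_CORE = {
    "Simulation": [1.00, 3.58, 5.65, 5.91, 12.98, 18.63, 25.87, 32.01],
    "DRC": [1.00, 4.24, 8.22, 9.32, 9.89, 14.98, 20.70, 25.86],
    "NAC": [1.00, 3.64, 6.50, 7.50, 8.84, 12.59, 19.66, 24.59],
    "Timing Analysis": [1.00, 4.62, 9.90, 10.71, 11.70, 18.15, 23.72, 31.61],
    "OPC": [1.00, 2.98, 5.00, 6.60, 11.39, 16.73, 25.99, 31.77],
}

# Published values versus the Xeon 5160 baseline (the NA column is omitted).
PUBLISHED_VS_5160 = {
    "Simulation": [1.00, 1.58, 1.65, 3.63, 5.20, 7.22, 8.94],
    "DRC": [1.00, 1.94, 2.20, 2.33, 3.53, 4.88, 6.09],
    "NAC": [1.00, 1.79, 2.06, 2.43, 3.46, 5.40, 6.75],
    "Timing Analysis": [1.00, 2.14, 2.32, 2.53, 3.93, 5.13, 6.84],
    "OPC": [1.00, 1.68, 2.21, 3.82, 5.61, 8.71, 10.65],
}

BASE_COLUMN = 1  # index of the Xeon 5160 column

def rebaseline(row, base_column):
    """Express a row relative to one of its own columns, dropping earlier columns."""
    base = row[base_column]
    return [value / base for value in row[base_column:]]

rebased = {name: rebaseline(row, BASE_COLUMN) for name, row in VS_SINGLE_CORE.items()}

# Largest gap between the recomputed and published figures (rounding effects only).
max_deviation = max(
    abs(computed - published)
    for name in rebased
    for computed, published in zip(rebased[name], PUBLISHED_VS_5160[name])
)

for name, row in rebased.items():
    print(name, [f"{value:.2f}" for value in row])
print(f"largest difference from the published table: {max_deviation:.3f}")
```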

Table 2. Test System Configurations for Dual-Socket Servers

| Processor | Cores | Frequency | Cache | Interconnect | RAM | Memory Type |
|---|---|---|---|---|---|---|
| 64-bit Intel® Xeon® Processor | 1 | 3.60 GHz | 1 MB | 800 MHz Shared FSB | 16 GB | DDR2-400 |
| Intel® Xeon® Processor 5160 | 2 | 3.00 GHz | 4 MB | 1333 MHz Dual Independent FSB | 16 GB | FB-DIMM/DDR2-667 |
| Intel® Xeon® Processor X5365 | 4 | 3.00 GHz | 8 MB | 1333 MHz Dual Independent FSB | 32 GB | FB-DIMM/DDR2-667 |
| Intel® Xeon® Processor X5460 | 4 | 3.16 GHz | 12 MB | 1333 MHz Dual Independent FSB | 32 GB | FB-DIMM/DDR2-667 |
| Intel® Xeon® Processor X5570 | 4 | 2.93 GHz | 8 MB | 25.6 GB/s per Intel® QPI Link | 48 GB | DDR3-1333∞ |
| Intel® Xeon® Processor X5675 | 6 | 3.06 GHz | 12 MB | 25.6 GB/s per Intel® QPI Link | 96 GB | DDR3-1333 |
| Intel® Xeon® Processor E5-2680 | 8 | 2.70 GHz | 20 MB | 32.0 GB/s per Intel® QPI Link | 128 GB | DDR3-1333 |
| Intel® Xeon® Processor E5-2680 v2 | 10 | 2.80 GHz | 25 MB | 32.0 GB/s per Intel® QPI Link | 256 GB | DDR3-1600 |

DDR - double data rate; FB-DIMM - fully buffered dual in-line memory module; FSB - front side bus; Intel® QPI - Intel® QuickPath Interconnect
∞ On Intel Xeon processor X5570, DDR3-1333 RAM running at 1066 MHz.

Table 3. Electronic Design Automation Test Results Showing Runtimes and Workload Configurations
All columns are Intel® Xeon® processors.

| | 64-bit Xeon, 1 MB L2 (3.6 GHz) | Xeon 5160 (3.0 GHz) | Xeon X5365 (3.0 GHz) | Xeon X5460 (3.16 GHz) | Xeon X5570 (2.93 GHz) | Xeon X5675 (3.06 GHz) | Xeon E5-2680 (2.7 GHz) | Xeon E5-2680 v2 (2.8 GHz) |
|---|---|---|---|---|---|---|---|---|
| SIMULATION (113 CPU MODEL TESTS) | | | | | | | | |
| Number of Simultaneous Jobs | 2 | 4 | 8 | 8 | 8 | 12 | 16 | 20 |
| Total Runtime (hh:mm:ss) | 79:41:46 | 22:15:24 | 14:06:54 | 13:28:57 | 6:08:23 | 4:16:36 | 3:04:52 | 2:29:23 |
| Relative Throughput | 1.00 | 3.58 | 5.65 | 5.91 | 12.98 | 18.63 | 25.87 | 32.01 |
| PHYSICAL VERIFICATION (DESIGN RULE CHECK) | | | | | | | | |
| Simultaneous 2-Threaded Jobs | 1 | 2 | 4 | 4 | 4 | 6 | 8 | 10 |
| Total Number of Iterations | 240 | 120 | 60 | 60 | 60 | 40 | 30 | 24 |
| Total Number of Jobs | 240 | 240 | 240 | 240 | 240 | 240 | 240 | 240 |
| Total Runtime (hh:mm:ss) | 1559:48:00 | 367:34:00 | 189:50:00 | 167:22:00 | 157:40:00 | 104:06:40 | 75:21:00 | 60:18:24 |
| Relative Throughput | 1.00 | 4.24 | 8.22 | 9.32 | 9.89 | 14.98 | 20.70 | 25.86 |
| PHYSICAL VERIFICATION (NODE ANTENNA CHECK) | | | | | | | | |
| Simultaneous 2-Threaded Jobs | 1 | 2 | 4 | 4 | 4 | 6 | 8 | 10 |
| Total Number of Iterations | 240 | 120 | 60 | 60 | 60 | 40 | 30 | 24 |
| Total Number of Jobs | 240 | 240 | 240 | 240 | 240 | 240 | 240 | 240 |
| Total Runtime (hh:mm:ss) | 425:44:00 | 116:54:00 | 65:28:00 | 56:47:00 | 48:09:00 | 33:49:20 | 21:39:00 | 17:18:48 |
| Relative Throughput | 1.00 | 3.64 | 6.50 | 7.50 | 8.84 | 12.59 | 19.66 | 24.59 |
| TIMING ANALYSIS | | | | | | | | |
| Number of Simultaneous Jobs | 2 | 4 | 8 | 8 | 8 | 12 | 16 | 20 |
| Total Number of Iterations | 120 | 60 | 30 | 30 | 30 | 20 | 15 | 12 |
| Total Number of Jobs | 240 | 240 | 240 | 240 | 240 | 240 | 240 | 240 |
| Total Runtime (hh:mm:ss) | 225:12:00 | 48:44:00 | 22:45:30 | 21:01:30 | 19:15:00 | 12:24:20 | 9:29:45 | 7:07:24 |
| Relative Throughput | 1.00 | 4.62 | 9.90 | 10.71 | 11.70 | 18.15 | 23.72 | 31.61 |
| OPTICAL PROXIMITY CORRECTION (561 TEMPLATES PROCESSING) | | | | | | | | |
| Number of Simultaneous Jobs | 2 | 4 | 8 | 8 | 8 | 12 | 16 | 20 |
| Total Runtime (hh:mm:ss) | 10:40:12 | 3:34:39 | 2:08:04 | 1:37:03 | 0:56:11 | 0:38:16 | 0:24:38 | 0:20:09 |
| Relative Throughput | 1.00 | 2.98 | 5.00 | 6.60 | 11.39 | 16.73 | 25.99 | 31.77 |
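The relative-throughput rows in Table 3 follow directly from the run times: each system completes the same set of jobs, so relative throughput is the baseline run time divided by the run time on the newer server. The sketch below recomputes the Intel Xeon processor E5-2680 v2 column from the published run times and shows how the iteration counts relate to the number of simultaneous jobs. The helper names `to_seconds` and `iterations` are hypothetical.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def iterations(total_jobs: int, simultaneous_jobs: int) -> int:
    """Number of batches needed when jobs run in equal-sized batches."""
    return total_jobs // simultaneous_jobs

# (baseline run time, E5-2680 v2 run time) for each workload, from Table 3.
RUNTIMES = {
    "Simulation": ("79:41:46", "2:29:23"),
    "DRC": ("1559:48:00", "60:18:24"),
    "NAC": ("425:44:00", "17:18:48"),
    "Timing Analysis": ("225:12:00", "7:07:24"),
    "OPC": ("10:40:12", "0:20:09"),
}

# Same workload on both systems, so throughput ratio = inverse run-time ratio.
relative = {
    name: round(to_seconds(baseline) / to_seconds(newest), 2)
    for name, (baseline, newest) in RUNTIMES.items()
}

for name, value in relative.items():
    print(f"{name}: {value:.2f}x")

# Timing analysis on the E5-2680 v2: 240 jobs, 20 at a time.
print(iterations(240, 20))  # 12
# Design rule check on the E5-2680 v2: 240 two-threaded jobs, 10 at a time.
print(iterations(240, 10))  # 24
```

The recomputed values match the published relative throughput for all five workloads.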

Conclusion

The new Intel Xeon processor E5-2600 v2 product family delivers significant throughput improvements for Intel design workloads across a range of EDA applications. Using a weighted performance measure of end-to-end EDA applications based on Intel silicon design tests, we found that the effective refresh ratio for replacing servers based on Intel Xeon processor X5400 series with servers based on the Intel Xeon processor E5-2680 v2 is around 5:1.
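The brief does not publish the weights behind its refresh ratio, so the calculation cannot be reproduced exactly. The sketch below only illustrates the mechanics of one plausible reading, a weighted average of per-workload speedups, using the Intel Xeon processor X5460 column of Table 1 as the X5400-series reference. The function name `weighted_refresh_ratio` and the equal weights are assumptions, not Intel's method.

```python
# Relative throughput versus the single-core baseline, copied from Table 1.
E5_2680_V2 = {"Simulation": 32.01, "DRC": 25.86, "NAC": 24.59,
              "Timing Analysis": 31.61, "OPC": 31.77}
X5460 = {"Simulation": 5.91, "DRC": 9.32, "NAC": 7.50,
         "Timing Analysis": 10.71, "OPC": 6.60}

def weighted_refresh_ratio(new, old, weights):
    """Weighted average of per-workload speedups (new server over old server)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * new[name] / old[name] for name in weights)
    return weighted_sum / total_weight

# Assumption: equal weights, because the real workload mix is not published.
equal_weights = {name: 1.0 for name in E5_2680_V2}
equal_weight_ratio = weighted_refresh_ratio(E5_2680_V2, X5460, equal_weights)

print(f"equal-weight ratio: {equal_weight_ratio:.2f}")
```

With equal weights the result is about 3.8. The per-workload speedups range from roughly 2.8x (design rule check) to 5.4x (simulation), so the outcome depends strongly on the workload mix; the brief's figure of around 5:1 comes from Intel's own weighting.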

Based on our performance assessment and our refresh cycle, we plan to deploy servers based on the new Intel Xeon processor E5-2600 v2 product family this year, completing our replacement of older servers based on quad-core Intel Xeon processor 5400 series and beginning replacement of quad-core Intel Xeon processor 5500 series. By doing so, we expect to achieve greater throughput while realizing operational benefits such as cost avoidance of data center construction and reduced power consumption.

AUTHORS

Shesha Krishnapura, Senior Principal Engineer, Intel IT
Vipul Lal, Senior Principal Engineer, Intel IT
Ty Tang, Senior Principal Engineer, Intel IT
Shaji Achuthan, Senior Staff Engineer, Intel IT
Murty Ayyalasomayajula, Staff Engineer, Intel IT

For more straight talk on current topics from Intel’s IT leaders, visit www.intel.com/it.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference www.intel.com/performance/resources/benchmark_limitations.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families: Go to: Learn About Intel® Processor Numbers THE INFORMATION PROVIDED IN THIS PAPER IS INTENDED TO BE GENERAL IN NATURE AND IS NOT SPECIFIC GUIDANCE. RECOMMENDATIONS (INCLUDING POTENTIAL COST SAVINGS) ARE BASED UPON INTEL’S EXPERIENCE AND ARE ESTIMATES ONLY. INTEL DOES NOT GUARANTEE OR WARRANT OTHERS WILL OBTAIN SIMILAR RESULTS. INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and other countries. *Other names and brands may be claimed as the property of others. Please Recycle 1013/WWES/KC/PDF 329538-001US Copyright © 2013 Intel Corporation. All rights reserved. Printed in USA
