FLORIDA INTERNATIONAL UNIVERSITY

Hardware-aided Monitoring of L1 and L2 D-Cache Misses in SMT

Lichen Weng and Chen Liu
Electrical and Computer Engineering Department
{lichen.weng, chen.liu}@fiu.edu

Figure: The OLS regression is performed for every thread before instruction fetching is conducted (a single thread is shown for illustration).

Where are we?

Simultaneous Multithreading (SMT) architectures fully share execution resources among several threads running concurrently in the same core [1].

The long-latency load is one of the major obstacles to better performance in SMT architectures, where it is an expression of the Memory Wall [2]:

• A load misses in the Level 2 Data Cache
• It has to fetch data from lower levels of the memory hierarchy
• It still holds shared resources, e.g., the ReOrder Buffer, for hundreds of cycles during that fetch
• Resource efficiency is harmed because the shared resources are held without producing throughput
• Task-Level Parallelism (TLP) is reduced because other threads cannot use those shared resources
• Therefore, system performance decreases

[Figure: stages I to VI of the proposed scheme, grouped as Regression, Prioritization, and Fetching]

• Two-level cache misses are sampled once per Sampling Period, i.e., a certain number of CPU cycles
• A certain number of samples (the Window Size) is used for the OLS regression
• For every thread, the model estimates the future L2 cache miss rate from the immediate L1 cache miss rate
• Priority descends as the estimated L2 miss rate grows
• Instructions are fetched from the thread with the highest priority, then the second highest, and so forth
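The sampling, regression, prioritization, and fetching steps listed above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the authors' hardware design: the names `ThreadMonitor`, `record`, `fit`, `predict_l2`, and `fetch_order` are hypothetical, the window is modeled as a fixed-length queue, and the fallback used when all L1 samples are identical is our own choice.

```python
from collections import deque


class ThreadMonitor:
    """Keeps the last `window_size` (L1, L2) miss-rate samples of one thread."""

    def __init__(self, window_size):
        # Oldest samples drop out automatically once the window is full.
        self.samples = deque(maxlen=window_size)

    def record(self, l1_miss_rate, l2_miss_rate):
        # Assumed to be called once per sampling period.
        self.samples.append((l1_miss_rate, l2_miss_rate))

    def fit(self):
        """Return (alpha, beta) of the least-squares line l2 = alpha + beta * l1."""
        n = len(self.samples)
        if n == 0:
            return 0.0, 0.0
        mean_x = sum(x for x, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.samples)
        if sxx == 0.0:
            # Assumption: with no spread in L1, fall back to the mean L2 rate.
            return mean_y, 0.0
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.samples)
        beta = sxy / sxx
        return mean_y - beta * mean_x, beta

    def predict_l2(self, current_l1_miss_rate):
        alpha, beta = self.fit()
        return alpha + beta * current_l1_miss_rate


def fetch_order(monitors, current_l1):
    """Thread ids from highest to lowest fetch priority.

    A lower estimated L2 miss rate means a higher priority.
    """
    predicted = {tid: m.predict_l2(current_l1[tid]) for tid, m in monitors.items()}
    return sorted(predicted, key=predicted.get)


# Made-up samples: thread 0's L2 miss rate grows faster with L1 than thread 1's.
monitors = {0: ThreadMonitor(4), 1: ThreadMonitor(4)}
for l1 in (0.1, 0.2, 0.3, 0.4):
    monitors[0].record(l1, 0.5 * l1)
    monitors[1].record(l1, 0.1 * l1)
print(fetch_order(monitors, {0: 0.2, 1: 0.2}))  # thread 1 is ranked first
```

Because the window is a bounded queue, each new sample displaces the oldest one, so the fitted line follows the thread's recent behavior rather than its whole history.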

The fetch policy, which assigns thread priorities in the fetch stage, is used to manage the shared resources and handle the long-latency load issue:

Fetch policy   Action timing      Action
STALL [2]      L2 D-Cache miss    Suspend the thread
DG [3]         L1 D-Cache miss    Suspend the thread
DWarn [4]      L1 D-Cache miss    Reduce the thread priority

What did we achieve?

Linearity confirmation
• F values are used to test the linearity between the L1 and L2 cache miss rates for various benchmarks, and they confirm that the relationship is significant

Performance improvement
• The proposed policy adaptively minimizes the influence of long-latency loads, because it uses an updated statistical model
• It achieves higher resource efficiency, because it reduces priority rather than gating threads

Sensitivity analysis
• A larger sampling period leads to better performance
• A larger L2 cache size means more throughput
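The F value used in the linearity confirmation above can be computed as in the following minimal sketch. It assumes the standard F statistic for a simple linear regression with one predictor, F = (SSR / 1) / (SSE / (n - 2)). The function name `f_statistic` and the sample data are hypothetical and are not measurements from the poster.

```python
def f_statistic(x, y):
    """F value for the simple linear regression of y on x.

    SSR is the variation explained by the fitted line and SSE is the
    residual variation; F = (SSR / 1) / (SSE / (n - 2)).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x has no variation, so no line can be fitted")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ssr = sum((alpha + beta * xi - mean_y) ** 2 for xi in x)
    sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    if sse == 0.0:
        return float("inf")  # perfect fit, no residual variation
    return ssr / (sse / (n - 2))


# Made-up samples, used only to exercise the function.
print(f_statistic([1, 2, 3, 4], [1, 2, 3, 5]))
```

A large F value relative to the critical value of the F distribution with 1 and n - 2 degrees of freedom indicates that the linear term explains a significant share of the variation.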

The relationship between L1 and L2 cache misses is more complicated than is commonly assumed.

Correlation matrix for the benchmark gzip:

                      L1 cache miss rate   L2 cache miss rate
L1 cache miss rate          1.0000              -0.1792
L2 cache miss rate         -0.1792               1.0000
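A matrix of this shape can be produced from paired miss-rate samples as sketched below. The sketch assumes the reported values are Pearson correlation coefficients, which the poster does not state explicitly. The function name `pearson` and the sample data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Made-up miss rates, not measurements from gzip.
l1 = [0.10, 0.20, 0.15, 0.30]
l2 = [0.05, 0.03, 0.06, 0.04]
matrix = [[pearson(a, b) for b in (l1, l2)] for a in (l1, l2)]
print(matrix)
```

The diagonal entries correlate each series with itself, and the two off-diagonal entries are equal because the coefficient is symmetric in its arguments.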

What do we propose?

Within an interval, Ordinary Least Squares (OLS) regression can be employed to describe the relationship, since knowing about L2 misses in advance benefits the system. For the benchmark apsi, the OLS regression yields β = 0.365134 and α = 0.0003677. The linearity between L1 and L2 cache misses is thus statistically modeled.
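As a worked example of how such coefficients would be used, the sketch below evaluates the fitted line for apsi. It assumes the conventional reading in which α is the intercept and β is the slope, so that the estimated L2 miss rate is α + β × (L1 miss rate). The poster does not spell out this form, and the function name `predict_l2_miss_rate` and the input value are hypothetical.

```python
# Coefficients reported for the benchmark apsi.
ALPHA_APSI = 0.0003677  # assumed to be the intercept
BETA_APSI = 0.365134    # assumed to be the slope


def predict_l2_miss_rate(l1_miss_rate, alpha=ALPHA_APSI, beta=BETA_APSI):
    """Estimate the L2 miss rate from the L1 miss rate with a linear model."""
    return alpha + beta * l1_miss_rate


# Hypothetical input: an L1 miss rate of 10%.
print(round(predict_l2_miss_rate(0.10), 7))
```

Under these assumptions, an L1 miss rate of 0.10 gives an estimated L2 miss rate of about 0.0369 (0.0003677 + 0.365134 × 0.10).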

Who did we reference?

[1] D.M. Tullsen, S.J. Eggers, J.S. Emer, H.M. Levy, J.L. Lo and R.L. Stamm, “Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor”. ISCA, 1996.
[2] D.M. Tullsen and J.A. Brown, “Handling long-latency loads in a simultaneous multithreading processor”. ISCA, 2001.
[3] A. El-Moursy and D.H. Albonesi, “Front-end policies for improved issue efficiency in SMT processors”. HPCA, 2003.
[4] F.J. Cazorla, A. Ramirez, M. Valero and E. Fernandez, “DCache warn: an I-fetch policy to increase SMT efficiency”. IPDPS, 2004.
[5] T.T. Soong, “Fundamentals of probability and statistics for engineers”. John Wiley and Sons, Ltd, 2004.
