Cluster-Wide Context Switch of Virtualized Jobs
Fabien Hermenier, Adrien Lèbre, Jean-Marc Menaud

VTDC’10, 22 June 2010

ASCOLA Team

Hermenier et al.

(ASCOLA)

Cluster-Wide Context Switch of Virtualized Jobs

1 / 23

Agenda
- Motivation
- Global Design
  - Architecture
  - Implementation
- Proof of concept
  - A sample scheduler
  - Experiment on a cluster
- Conclusion


Motivation


Clusters
- large infrastructures to execute various jobs

Resource Management System (RMS)
- manage the execution of jobs
- resources are allocated to jobs according to their description
- scheduling: which jobs to execute, and where?


Job schedulers

Usually
A coarse-grained exploitation of resources:
- static allocation of resources
- execution to completion

Dynamic schedulers exist
Based on mechanisms that manipulate the jobs dynamically (migration, preemption, dynamic allocation of resources, ...). BUT
- mechanisms are complex to implement
- mechanisms are complex to use efficiently


Virtual Machines (VMs) as a backend for dynamic schedulers
- each component is embedded into its VM
- VMMs provide migration, preemption
- still complex to use efficiently

A cutting-edge building block
dynamic consolidation, best-effort jobs, ...
- various policies, but common concepts to perform the changes
- each provides an ad-hoc solution to handle several common issues:
  - dependencies between actions
  - correctness
  - reactivity


Proposition

Performing the changes should not be a primary concern for developers
- a generic cluster-wide context switch based on VMs
- developers only focus on the algorithm to select the jobs to run
- the cluster-wide context switch takes care of the rest
  - detects the changes to perform
  - ensures the correctness of the transition
  - computes the fastest possible transition

The implementation leverages the consolidation manager Entropy


Global Design


Architecture

From jobs to virtualized jobs (vjobs)

Figure: The life cycle of a vjob

- a vjob encapsulates one or several VMs
- to change the state of a vjob, actions (except migrate) are executed on each of its VMs
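The rule that a state change is applied to every VM of a vjob can be sketched as follows. This is only an illustration: the life-cycle figure is not reproduced in the text, so the state names (`WAITING`, `RUNNING`, `SLEEPING`, `TERMINATED`), the transition table and the `VM`/`VJob` classes are assumptions, not Entropy's actual model.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Assumed state names; the original life-cycle figure is not available here.
    WAITING = "waiting"
    RUNNING = "running"
    SLEEPING = "sleeping"
    TERMINATED = "terminated"


# Hypothetical transition table: action -> (required state, resulting state).
# "migrate" is left out on purpose: it targets a single VM and keeps its state.
TRANSITIONS = {
    "run": (State.WAITING, State.RUNNING),
    "suspend": (State.RUNNING, State.SLEEPING),
    "resume": (State.SLEEPING, State.RUNNING),
    "stop": (State.RUNNING, State.TERMINATED),
}


@dataclass
class VM:
    name: str
    state: State = State.WAITING


@dataclass
class VJob:
    name: str
    vms: list

    def apply(self, action):
        """Apply one life-cycle action to every VM of the vjob."""
        required, result = TRANSITIONS[action]
        # Refuse the action unless all VMs are in the expected state,
        # so the vjob never ends up with VMs in mixed states.
        if any(vm.state is not required for vm in self.vms):
            raise ValueError(f"cannot {action} vjob {self.name}")
        for vm in self.vms:
            vm.state = result
```

Checking every VM before changing any of them keeps the vjob in a single, well-defined state even when an action is rejected.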


Configuration
- describes the assignment of the running VMs to working nodes
- nodes provide CPU and memory resources
- running VMs require CPU and memory resources to run at peak level

Figure: (a) Non-viable configuration; (b) Viable configuration
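A possible reading of viability, given the bullets above, is that every node offers enough CPU and memory for the running VMs it hosts. The check below is a minimal sketch under that assumption; the function name `is_viable` and the dictionary layout are hypothetical.

```python
def is_viable(assignment, demands, capacities):
    """Return True if no node is overloaded.

    assignment: running VM name -> node name
    demands:    VM name -> (cpu, memory) needed to run at peak level
    capacities: node name -> (cpu, memory) provided by the node
    """
    used = {node: [0, 0] for node in capacities}
    for vm, node in assignment.items():
        cpu, mem = demands[vm]
        used[node][0] += cpu
        used[node][1] += mem
    # Assumed definition: a configuration is viable when the summed
    # demands on each node stay within that node's capacity.
    return all(
        used[node][0] <= cap_cpu and used[node][1] <= cap_mem
        for node, (cap_cpu, cap_mem) in capacities.items()
    )
```

For example, with two nodes of 2 CPUs and 4096 MB each, placing a 2048 MB VM and a 3072 MB VM on the same node is reported as non-viable, while spreading them over both nodes is viable.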


The control loop of Entropy

Monitor
- extracts the current configuration: VM position, CPU/memory consumption
- adaptable to a specific monitoring system (currently Ganglia)


Scheduling policy
- an algorithm to select the vjobs to run with respect to the current configuration
- provided by a developer


The cluster-wide context switch module
- selects a position for each VM to run
- infers the actions that perform the transition from the current configuration
- computes the fastest plan that ensures the correctness of the process


Execution
- associates each action of the plan with a driver that performs the action
- adaptable to specific environments; currently supports the Xen VMM (XML-RPC) or shell commands


Implementation

Role of the CW context switch
- detects the actions to perform
- selects a position for each VM to run
- plans the actions to guarantee the correctness of the process
- computes the fastest possible plan
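Detecting the actions to perform can be pictured as a diff between the current and the target configuration. The sketch below is an assumption-laden illustration: the state names, the `(state, node)` encoding and the action tuples are hypothetical, and only the action names (run, migrate, suspend, resume, stop) come from the slides.

```python
def infer_actions(current, target):
    """Compare two configurations and list the actions that link them.

    current, target: VM name -> (state, node), where state is one of
    "waiting", "running", "sleeping", "terminated" (assumed names).
    Returns tuples (action, vm, source node, destination node).
    """
    actions = []
    for vm in sorted(target):
        cur_state, cur_node = current[vm]
        new_state, new_node = target[vm]
        if cur_state == "running" and new_state == "running":
            # Same state: an action is needed only if the VM changes node.
            if cur_node != new_node:
                actions.append(("migrate", vm, cur_node, new_node))
        elif cur_state == "waiting" and new_state == "running":
            actions.append(("run", vm, None, new_node))
        elif cur_state == "running" and new_state == "sleeping":
            actions.append(("suspend", vm, cur_node, new_node))
        elif cur_state == "sleeping" and new_state == "running":
            # A resume on a different node is a remote resume.
            actions.append(("resume", vm, cur_node, new_node))
        elif cur_state == "running" and new_state == "terminated":
            actions.append(("stop", vm, cur_node, None))
        elif cur_state != new_state:
            raise ValueError(f"unsupported transition for {vm}")
    return actions
```

The result is an unordered set of actions; ordering them so that every intermediate configuration stays correct is the job of the planning step.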


Plan the actions


The reconfiguration plan
- a protocol to execute actions
- actions feasible in parallel are grouped into the same step
- steps are executed sequentially
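The execution protocol described above (parallel actions inside a step, steps one after another) can be sketched with a thread pool. This is not Entropy's executor: `execute_plan` and the `perform` callback, which stands in for a driver, are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_plan(plan, perform):
    """Run a reconfiguration plan.

    plan:    list of steps, each step being a list of actions
    perform: callable that executes one action (a stand-in for a driver)
    Returns the per-step results, in plan order.
    """
    results = []
    with ThreadPoolExecutor() as pool:
        for step in plan:
            # Actions of one step are submitted together and may overlap.
            # list() waits for all of them, so the next step starts only
            # once the current step is complete.
            results.append(list(pool.map(perform, step)))
    return results
```

If an action raises, the exception surfaces when its step is collected and later steps are not started, which matches the idea that a step is a synchronization point.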


Suspending/Resuming a vjob
- inter-connected VMs should be continuously in the same state
- coordination to ensure that distributed applications will not fail
- actions are grouped into the same step
- synchronization between the pause/unpause actions


Reducing the duration of a cluster-wide context switch

Figure: two plots of completion time (in sec) against VM size (in MB, from 128 to 2048). Legend entries: start/run, migrate, stop/shutdown; local, nfs, local+scp, local+rsync.

- the duration of an action depends on its context
- a function estimates the cost of a whole CW context switch
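One way such a cost function could be shaped is sketched below. The slides do not give the formula, so this is an assumed model: the cost of a plan is the sum, over all actions, of the time at which each action completes, with steps run sequentially and a step lasting as long as its slowest action. The `duration` callback is hypothetical and would be fitted on measurements such as those in the figure.

```python
def plan_cost(plan, duration):
    """Estimate the cost of a plan under the assumed model.

    plan:     list of steps, each step being a list of actions
    duration: callable returning the estimated duration of one action
              in seconds (hypothetical; context-dependent in practice)
    """
    cost = 0.0
    elapsed = 0.0  # time at which the current step starts
    for step in plan:
        if not step:
            continue
        durations = [duration(action) for action in step]
        # Each action is charged its completion time, so actions that are
        # delayed to a late step weigh more than early ones.
        cost += sum(elapsed + d for d in durations)
        # The next step starts when the slowest action of this one ends.
        elapsed += max(durations)
    return cost
```

With a toy duration of 5 seconds for run and stop and `memory / 128` seconds for a migration, the plan `[[migrate 1024 MB, run], [stop]]` costs 8 + 5 + (8 + 5) = 26. These durations are placeholders, not measured values.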


Reducing the duration of a CW context switch

An approach based on constraint programming
Entropy computes a new configuration that
- is viable
- respects the scheduling policy
- implies the minimal cost

In practice
- actions are performed as soon as possible
- prefer moving VMs with small memory requirements
- avoid migrations and remote resumes
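The search for a viable, minimal-cost configuration can be illustrated on a toy instance. The sketch below replaces the constraint programming solver with a plain exhaustive enumeration, considers memory only, and assumes a cost equal to the memory of the VMs that change node; all of these are simplifications, and `best_configuration` is a hypothetical name.

```python
from itertools import product


def best_configuration(vms, nodes, current):
    """Enumerate placements and keep the cheapest viable one.

    vms:     VM name -> memory demand (MB) of every VM that must run
    nodes:   node name -> memory capacity (MB)
    current: VM name -> node, for the VMs that are already running
    Returns (placement, cost), or (None, None) if nothing is viable.
    """
    names = sorted(vms)
    node_names = sorted(nodes)
    best, best_cost = None, None
    for placement in product(node_names, repeat=len(names)):
        candidate = dict(zip(names, placement))
        load = {node: 0 for node in node_names}
        for vm, node in candidate.items():
            load[node] += vms[vm]
        if any(load[node] > nodes[node] for node in node_names):
            continue  # not viable: at least one node is overloaded
        # Assumed cost: memory of the running VMs that must be moved,
        # which favours moving VMs with small memory requirements.
        cost = sum(
            vms[vm] for vm in names
            if vm in current and current[vm] != candidate[vm]
        )
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

Exhaustive enumeration grows exponentially with the number of VMs, which is why a solver with pruning is needed on a real cluster.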


Proof of concept


A sample scheduler


Principle
- a FIFO queue
- VMs are assigned to nodes using a First Fit Decreasing heuristic
- priority between jobs to prevent starvation

Benefits of using the CW context switch
- dynamic allocation of resources
- preemption
- migration of VMs
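The First Fit Decreasing heuristic named in the principle can be sketched as below. For brevity the sketch packs on memory only, whereas the configuration described earlier has both CPU and memory; the function name and data layout are hypothetical.

```python
def first_fit_decreasing(vms, nodes):
    """Place VMs on nodes with First Fit Decreasing.

    vms:   VM name -> memory demand (MB)
    nodes: node name -> free memory (MB), scanned in insertion order
    Returns VM name -> node name, or None if some VM does not fit.
    """
    free = dict(nodes)  # work on a copy, the caller's dict is untouched
    placement = {}
    # Largest VMs first; ties are broken by name to stay deterministic.
    for vm in sorted(vms, key=lambda name: (-vms[name], name)):
        for node in free:
            if vms[vm] <= free[node]:
                placement[vm] = node  # first node with enough room
                free[node] -= vms[vm]
                break
        else:
            return None  # no node can host this VM
    return placement
```

In a FIFO scheduler, a vjob whose VMs cannot all be placed this way would stay in the queue until resources are freed.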


Experiment on a cluster

Environment

Hardware
- 11 working nodes
- 3 storage nodes share VM images
- 1 service node is running Entropy

Protocol
- a queue of 8 vjobs (NASGrid benchmarks)
- each vjob uses 9 VMs
- comparison with regard to FCFS
  - resource usage
  - completion time


Benefits
- improve resource usage
- suspend/resume transparent for the developer
- reduce the completion time

Cumulated execution time
- FCFS: 250 minutes
- Entropy: 150 minutes


Conclusion


RMSs start to manage VMs instead of processes
- VMMs provide mechanisms to implement dynamic schedulers
- manipulating VMs is tedious and may not be cost-effective
- various scheduling policies but common concepts to perform the context switch

A generic cluster-wide context switch
- makes the implementation of dynamic schedulers easier
- the context switch is outside the scheduling algorithm
- an implementation in Entropy with a sample algorithm: http://entropy.gforge.inria.fr, version 1.2 (LGPL)


I’m looking for a postdoc position
- fond of: virtualization, distributed systems, autonomic computing, ...
- dislike: tomatoes
