Optimal Multiprocessor Real-Time Scheduling based on RUN for Practical Imprecise Computation with Harmonic Periodic Task Sets Hiroyuki Chishiro Graduate School of Industrial Technology, Advanced Institute of Industrial Technology, Japan Abstract—Optimal multiprocessor real-time scheduling can achieve full system utilization with implicit-deadline periodic task sets. However, worst case execution time (WCET) analysis is difficult on state-of-the-art hardware/software platforms because of the complex hierarchy of shared caches and multiprogramming. The actual case execution time (ACET) of each task is usually shorter than its WCET, and imprecise computation is one effective method to make better use of the remaining processor time. Unfortunately, there is no optimal multiprocessor real-time scheduling algorithm that supports imprecise computation. This paper proposes an algorithm that integrates the Reduction to Uniprocessor (RUN) algorithm, an optimal multiprocessor real-time scheduling algorithm, with the Rate Monotonic with Wind-up Part (RMWP) algorithm, an imprecise-computation-based real-time scheduling algorithm. The resulting RUN-RMWP algorithm dominates the Partitioned RMWP algorithm. Simulation studies show that RUN-RMWP has a comparable number of preemptions/migrations to RUN and outperforms other imprecise-based non-optimal multiprocessor real-time scheduling algorithms.

I. INTRODUCTION

Multiprocessors have been increasingly used in state-of-the-art real-time applications such as humanoid robots [1], [2]. These robots usually have periodic real-time tasks with harmonic relationships (harmonic periodic task sets), where task periods are integer multiples of each other. Harmonic periodic task sets improve the utilization bound of real-time scheduling [3] and the precision of schedulability tests [4] compared with general task sets, which have no relationship among task periods. There are two main multiprocessor real-time scheduling categories: partitioned scheduling and global scheduling. Partitioned scheduling assigns tasks to processors offline, but guarantees only 50% processor utilization in the worst case [5]. In contrast, global scheduling can achieve 100% processor utilization by migrating tasks among processors online, but increases run-time overhead. This paper is interested in optimal multiprocessor real-time scheduling algorithms that can achieve 100% processor utilization with implicit-deadline periodic task sets. Several optimal multiprocessor real-time scheduling algorithms have been proposed [6], [7], [8], [9], [10]. Reduction to Uniprocessor (RUN) outperforms the other algorithms in the number of preemptions/migrations and is the focus of this research. RUN transforms the multiprocessor real-time scheduling problem into an equivalent set of uniprocessor problems using the DUAL and PACK operations (details of these operations are given in Section IV). Earliest Deadline First (EDF) [11]

is optimal for implicit-deadline task sets on uniprocessors, and hence RUN uses it to transform uniprocessor scheduling back into multiprocessor scheduling online. Using these operations, RUN achieves its optimality with a small number of preemptions/migrations. Real-time scheduling analyzes schedulability using the worst case execution time (WCET) of each task. However, WCET analysis is difficult on state-of-the-art hardware/software platforms because of the complex hierarchy of shared caches and multiprogramming. In addition, because humanoid robots run in dynamic environments, the actual case execution time (ACET) of each task fluctuates and is usually shorter than its WCET in these real-time applications. Imprecise computation [12] is one effective method to make better use of the remaining processor time (i.e., WCET − ACET). The imprecise computation model has a mandatory real-time part and an optional non-real-time part. By terminating the optional part, each task avoids missing its deadline. The imprecise computation model does not work well in actual systems because it does not consider the processing required to terminate or complete the optional part of each task. Therefore, this paper uses an extended imprecise computation model [13] that has a wind-up (second mandatory) part immediately after the optional part. Only two multiprocessor real-time scheduling algorithms support the extended imprecise computation model: Global Rate Monotonic with Wind-up Part (G-RMWP) [14] and Partitioned Rate Monotonic with Wind-up Part (P-RMWP) [15]. These RMWP-based algorithms are semi-fixed-priority scheduling algorithms [16] in the extended imprecise computation model and have an original parameter called the optional deadline. An optional deadline is the time at which an optional part must be terminated to avoid the deadline miss of a wind-up part.
Thanks to the optional deadline, G-RMWP and P-RMWP dominate Global Rate Monotonic (G-RM) [17] and Partitioned Rate Monotonic (P-RM) [11], respectively. Unfortunately, G-RMWP and P-RMWP are not optimal on multiprocessors because they cannot achieve 100% processor utilization. This paper proposes Reduction to Uniprocessor for Rate Monotonic with Wind-up Part (RUN-RMWP), an optimal multiprocessor real-time scheduling algorithm based on RUN for practical imprecise computation with harmonic periodic task sets. RUN-RMWP supports the extended imprecise computation model, which improves Quality of Service (QoS), and achieves full system utilization. In addition, RUN-RMWP manages the optional deadline timer that terminates the optional part of each task in RUN-based scheduling. Simulation results show that RUN-RMWP achieves its optimality

Fig. 1. Extended imprecise computation model

Fig. 2. Optional deadline

and has a comparable number of preemptions/migrations to RUN.
Contribution: This paper presents RUN-RMWP, an optimal multiprocessor real-time scheduling algorithm for practical imprecise computation with harmonic periodic task sets. RUN-RMWP reduces to P-RMWP if assigning tasks to processors is successful, just as RUN reduces to Partitioned EDF (P-EDF) in that case. Therefore, RUN-RMWP, like RUN, achieves a small number of preemptions/migrations. In addition, RUN-RMWP illustrates how RUN can be applied to imprecise computation.
The remainder of this paper is organized as follows. Section II introduces the extended imprecise computation model and RUN's specific model. Section III explains semi-fixed-priority scheduling in the extended imprecise computation model and the RMWP algorithm. Section IV introduces RUN and gives a scheduling example. Section V presents RUN-RMWP as an optimal multiprocessor real-time scheduling algorithm that supports the extended imprecise computation model. Section VI evaluates the effectiveness of RUN-RMWP through simulation. Section VII compares this work with related work, and Section VIII concludes this paper.

II. SYSTEM MODEL

This section introduces the system model that supports the extended imprecise computation model [13] as well as RUN’s specific model [10]. Figure 1 shows the extended imprecise computation model [13], which adds a wind-up part to the imprecise computation model [12]. The imprecise computation model assumes that the processing to terminate or complete the optional part is not required. However, image processing tasks in robots entail a mandatory processing stage prior to task completion in order to output the overall task results. To guarantee the schedulability of these tasks, the imprecise computation model is extended with a wind-up part. This paper assumes that a task set Γ has n periodic independent tasks τ1 ,...,τn on M identical processors P1 ,...,PM . The task set is synchronous (i.e., all tasks are initially released at time t = 0). Each task τi has its WCET Ci , period Ti , and relative deadline Di . The relative deadline Di of task τi is equal to its period Ti . The priority of each task is

Fig. 3. General scheduling and semi-fixed-priority scheduling

represented by pi. The utilization of each task is Ui = Ci/Ti and the system utilization is U = (1/M) Σ_{i=1}^{n} Ui. Each instance of a task is called a job. The extended imprecise computation model adds the wind-up part as a second mandatory part. Therefore, the WCET of each task is Ci = mi + wi, where mi is the WCET of the mandatory part and wi is the WCET of the wind-up (second mandatory) part. The Required Execution Time (RET) of the optional part of each task is represented as oi and its utilization is Ui^o = oi/Ti. This paper assumes that the RET of the optional part of each task fluctuates and its WCET is unknown. The reason Ui does not include the RET of the optional part oi is that the optional part of each task is a non-real-time part, and hence completing it is not relevant to scheduling the task set successfully. The relative optional deadline ODi of task τi is defined as the time when an optional part is terminated and a wind-up part is released [16]. Each wind-up part is ready for execution after each optional deadline and can be completed if each mandatory part is completed by the optional deadline. If the mandatory part of a task is not completed by its optional deadline, the corresponding wind-up part may miss its deadline. Note that the corresponding wind-up part may still complete its execution by its deadline in such a case. Figure 2 shows the optional deadline of each task. Each solid up-arrow, solid down-arrow, and dotted down-arrow represents the release time, deadline, and optional deadline, respectively. Task τ1 completes its mandatory part before optional deadline OD1 and then executes its optional part until OD1. After OD1, task τ1 executes its wind-up part. In contrast, task τ2 does not complete its mandatory part by optional deadline OD2. As a result, when τ2 completes its mandatory part, it executes its wind-up part and does not execute its optional part.
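The parameters of the extended imprecise computation model can be sketched in Python. This is a minimal illustration, not code from the paper; the task values follow the paper's running example (five tasks of utilization 0.4 on M = 3 processors):

```python
from dataclasses import dataclass

@dataclass
class ImpreciseTask:
    """Extended imprecise computation model: C_i = m_i + w_i."""
    m: int   # WCET of the mandatory part
    o: int   # RET of the optional part (non-real-time, excluded from U_i)
    w: int   # WCET of the wind-up (second mandatory) part
    T: int   # period (= relative deadline D_i)

    @property
    def C(self) -> int:
        # The optional part is excluded: completing it is not needed
        # to schedule the task set successfully.
        return self.m + self.w

    @property
    def U(self) -> float:
        return self.C / self.T

def system_utilization(tasks, M):
    # U = (1/M) * sum of U_i over all n tasks
    return sum(t.U for t in tasks) / M

# Harmonic periodic task set (periods are integer multiples of each other)
tasks = [ImpreciseTask(1, 2, 1, 5), ImpreciseTask(2, 4, 2, 10),
         ImpreciseTask(3, 6, 3, 15), ImpreciseTask(2, 4, 2, 10),
         ImpreciseTask(1, 2, 1, 5)]
print(system_utilization(tasks, M=3))  # each U_i = 0.4, so U = 2.0/3
```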

III. SEMI-FIXED-PRIORITY SCHEDULING

Semi-fixed-priority scheduling [16] is defined as part-level fixed-priority scheduling in the extended imprecise computation model [13]. That is to say, semi-fixed-priority scheduling fixes the priority of each part in the extended imprecise task and changes the priority of each extended imprecise task in just two cases: (i) when the extended imprecise task completes its mandatory part and executes its optional part and (ii) when the

extended imprecise task terminates or completes its optional part and executes its wind-up part.
Figure 3 shows the difference between general scheduling in Liu and Layland's model [11] and semi-fixed-priority scheduling in the extended imprecise computation model. In Figure 3, one task is not interfered with by higher priority tasks. In general scheduling, when task τi is released at time 0, the remaining execution time Ri(t) is set to mi + wi and monotonically decreases until Ri(t) becomes 0 at time mi + wi. In semi-fixed-priority scheduling, when task τi is released at time 0, Ri(t) is set to mi and monotonically decreases until Ri(t) becomes 0 at time mi. When Ri(t) is 0 at time mi, τi sleeps until time ODi. When τi is released at time ODi, Ri(t) is set to wi and monotonically decreases until Ri(t) becomes 0 at time ODi + wi. If τi does not complete its mandatory part by time ODi, then Ri(t) is set to wi at the time when τi completes its mandatory part. In general scheduling as well as semi-fixed-priority scheduling, τi completes its wind-up part by time Di.
RMWP [16] is a semi-fixed-priority scheduling algorithm that uses the extended imprecise computation model on uniprocessors. As shown in Figure 4, RMWP manages three task queues: Real-Time Queue (RTQ), Non-Real-Time Queue (NRTQ), and Sleep Queue (SQ). RTQ holds tasks that are ready to execute their mandatory or wind-up parts in Rate Monotonic (RM) order [11]. A task is not allowed to execute its mandatory and wind-up parts simultaneously. NRTQ holds tasks that are ready to execute their optional parts in RM order. Every task in RTQ has higher priority than those in NRTQ. SQ holds tasks that have completed their optional parts by their optional deadlines or their wind-up parts by their deadlines. The calculation of each optional deadline in RMWP is shown in [16].

Fig. 4. Task queue

This paper now explains how to calculate the relative optimal optional deadline of each task in RMWP using Response Time Analysis for Optimal Optional Deadline with Harmonic periodic task sets (RTA-OODH) [16]. The relative optimal optional deadline is defined as the time when the remaining execution time of each task in time interval [ODi, Di) is equal to the sum of the WCET of its wind-up part wi and the worst case interference time from higher priority tasks. That is to say, there is no remaining time to execute the optional part of each task after the optimal optional deadline if the ACET of each task is always equal to its WCET. First, the following theorems are introduced for the proposed algorithm in this paper.

Theorem 1 (From Theorem 1 in [16]). The worst case interference time Ik^i (∀i : pi > pk), which is the upper bound of the time τi interferes with τk in RMWP on uniprocessors, is

Ik^i = ⌈Tk/Ti⌉ (mi + wi).

Note that Theorem 1 can be adapted to harmonic as well as general periodic task sets. In the case of harmonic periodic task sets, ⌈Tk/Ti⌉ = Tk/Ti.

Theorem 2 (From Theorem 5 in [16]). The assignable time of task τk except wk in RMWP on uniprocessors with harmonic periodic task sets is

Ak = Dk − wk − Σ_{∀i: pi > pk} Ik^i.

This paper next introduces the worst case interference time of each task.

Theorem 3 (From Theorem 6 in [16]). The worst case interference time Ik of task τk in RMWP on uniprocessors with harmonic periodic task sets is

Ik = Σ_{∀i: pi > pk} ( ⌈ODk/Ti⌉ mi + ⌈(ODk − ODi)/Ti⌉ wi ).

Using these theorems, this paper explains RTA-OODH for calculating the relative optimal optional deadline of each task in RMWP on uniprocessors.

Theorem 4 (From Theorem 7 in [16]). The relative optimal optional deadline ODk of task τk in RMWP by RTA-OODH on uniprocessors with harmonic periodic task sets is

ODk = Ak + Ik,

where Ak and Ik are given by Theorems 2 and 3, respectively.

RTA-OODH is similar to Response Time Analysis (RTA) [18]. RTA calculates the worst case response time of each task, whereas RTA-OODH calculates the relative optimal optional deadline of each task.
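The RTA-OODH iteration implied by Theorems 1-4 can be sketched in Python. This is an illustrative sketch, not the paper's code; it assumes implicit deadlines (Di = Ti), tasks given in RM order, and harmonic periods, and the two-task values in the example are made up:

```python
import math

def rta_oodh(tasks):
    """tasks: list of (m, w, T) in decreasing-priority (RM) order with
    harmonic periods. Returns the relative optimal optional deadline OD_k
    of each task (Theorems 1-4)."""
    OD = []
    for k, (mk, wk, Tk) in enumerate(tasks):
        hp = tasks[:k]      # higher priority tasks
        Dk = Tk             # implicit deadline
        # Theorems 1 and 2: A_k = D_k - w_k - sum of I_k^i
        Ak = Dk - wk - sum(math.ceil(Tk / Ti) * (mi + wi) for mi, wi, Ti in hp)
        Ik = 0
        while True:         # Theorems 3 and 4: fixed-point iteration
            ODk = Ak + Ik
            Ik = sum(math.ceil(ODk / Ti) * mi
                     + math.ceil((ODk - OD[i]) / Ti) * wi
                     for i, (mi, wi, Ti) in enumerate(hp))
            if Ak + Ik <= ODk:
                break
        OD.append(ODk)
    return OD

# Two harmonic tasks: tau_1 = (m=1, w=1, T=4), tau_2 = (m=2, w=2, T=8)
print(rta_oodh([(1, 1, 4), (2, 2, 8)]))  # → [3, 3]
```

For τ1 there is no interference, so OD1 = D1 − w1 = 3; for τ2 the iteration converges after accounting for τ1's mandatory and wind-up jobs.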

IV. RUN ALGORITHM

This paper reviews RUN [10], an optimal multiprocessor real-time scheduling algorithm with a small number of preemptions/migrations, and explains the details of RUN in both its offline and online phases. This paper now introduces the model specific to RUN [10], as RUN has many original parameters and assumptions that are needed to explain it. The goal of RUN is full system utilization, and hence idle tasks are inserted in order to achieve it. The total utilization of idle tasks is Uidle = M − Σ_i Ui. Note that each idle task depends on the utilization parameter alone and does not depend on other parameters such as period and WCET. RUN transforms multiprocessor scheduling into uniprocessor scheduling by aggregating tasks into servers S. This paper defines servers as tasks with sequences of jobs, but they are not actual tasks in the system; each server is a proxy for a collection of client tasks. When a server is running, the processor time is used by one of its clients. Server clients

are scheduled via an internal scheduling mechanism. The utilization of each server does not exceed one, and the utilization of server Sl is Ul^srv = Σ_{τi∈Sl} Ui, where τi ∈ Sl indicates that task τi is assigned to server Sl.

A. Offline Phase

In the offline phase, RUN reduces multiprocessor scheduling to uniprocessor scheduling by the DUAL and PACK operations. RUN is based on EDF because EDF is optimal with implicit-deadline task sets on uniprocessors. The DUAL operation transforms a task τi into the dual task τi*, whose execution time represents the idle time of τi (i.e., Ci* = Ti − Ci). The relative deadline of dual task τi* is equal to that of task τi. The DUAL operation reduces the number of processors whenever n − M < M. The PACK operation packs dual servers into packed servers whose utilizations do not exceed one. When n − M ≥ M, the number of servers can be reduced by aggregating them into fewer servers using the PACK operation. The scheme for packing servers into fewer servers is a heuristic; that is, the PACK operation is similar to the partitioning schemes (e.g., first-, next-, best-, and worst-fit). Note that if assigning tasks to processors is successful, RUN generates the same schedule as P-EDF and does not perform the DUAL and PACK operations. Otherwise, the DUAL and PACK operations generate the reduction tree offline, which is then used to make server scheduling decisions online. Details of how to make scheduling decisions in the reduction tree are given in the next subsection. In order to explain the reduction tree, this paper defines the following terms with respect to servers:
• unit server: a server whose utilization is one
• null server: a server whose utilization is zero
• root server: the last packed server, whose utilization is one (a unit server)
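The two offline operations can be sketched as follows. This is an illustrative sketch, not the paper's code; first-fit packing is assumed, since the paper only states that the packing scheme is a heuristic such as first-, next-, best-, or worst-fit:

```python
def dual_task(C, T):
    """DUAL: the dual task executes exactly when the original is idle,
    so C* = T - C and the dual utilization is 1 - C/T."""
    return T - C, T

def pack_first_fit(utils):
    """PACK: aggregate (dual) servers into packed servers whose
    utilizations do not exceed one, using a first-fit heuristic."""
    bins = []
    for u in utils:
        for i in range(len(bins)):
            if bins[i] + u <= 1.0 + 1e-9:   # fits in an existing packed server
                bins[i] += u
                break
        else:
            bins.append(u)                  # open a new packed server
    return bins

print(dual_task(2, 5))            # → (3, 5): dual utilization 0.6
print(pack_first_fit([0.4] * 5))  # → [0.8, 0.8, 0.4]
```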

Packing the dual servers of packed servers can reduce the number of servers by nearly half. This paper performs the DUAL and PACK operations repeatedly until all packed servers become unit servers, and defines the REDUCE operation to be their composition.

Definition 5 (From Definition IV.6 in [10]). Given a set of servers Γ and a packing π of Γ, a REDUCE operation on a server S in Γ, denoted by ψ(S), is the composition of the DUAL operation ϕ with the PACK operation σ for π (i.e., ψ(S) = ϕ(σ(S))).

In addition, this paper defines the reduction level/sequence to explain the reduction tree as follows.

Definition 6 (From Definition IV.7 in [10]). Let i ≥ 1 be an integer, Γ be a set of servers, and S be a server in Γ. The operator ψ^i is recursively defined by ψ^0(S) = S and ψ^i(S) = ψ ◦ ψ^{i−1}(S). Then {ψ^i}_i is a reduction sequence, and the server system ψ^i(Γ) is said to be at reduction level i.

Note that the number of servers at reduction level 0 is the same as that at reduction level 1, if these servers exist.

Fig. 5. Reduction tree on three processors

Figure 5 shows the reduction tree on three processors. This paper uses the tuple (Ci, Ti) to denote the WCET and period of task τi as τi^(Ci,Ti), and hence the tasks are τ1^(2,5), τ2^(4,10), τ3^(6,15), τ4^(4,10), and τ5^(2,5). Tasks τ1, τ2, τ3, τ4, and τ5 are assigned to servers S1, S2, S3, S4, and S5 at reduction level 0, respectively. The total utilization of idle tasks is Uidle = M − Σ_{i=1}^n Ui = 3 − 5 × 0.4 = 1. In this example, idle tasks are assigned to the servers at reduction level 0 uniformly, and hence Uidle/n = 1/5 = 0.2 is added to the utilization of each server.

This paper represents a server as Sl^(Ul^srv),{Dl}, where Ul^srv is the utilization of server Sl and Dl is the deadline set of server Sl. The deadline set includes all absolute task deadlines in the server. Each server sets the earliest deadline in its deadline set when the server is released. This paper assigns deadline sets 5N, 10N, 15N, 10N, and 5N to the servers at reduction level 0, respectively, where N indicates a natural number. For example, 5N represents all deadlines of the task whose relative deadline is 5. Servers S6, S7, S8, S9, and S10 are generated by the DUAL operation at reduction level 1 and their utilizations are all 0.4, because these servers are the dual servers of servers S1, S2, S3, S4, and S5 at reduction level 0, respectively. In this example, servers S6 and S7 are packed into server S11, servers S8 and S9 are packed into server S12, and server S10 is packed into server S13 by the PACK operation. Servers S11, S12, and S13 are generated by the DUAL operation at reduction level 2. Finally, server S14 is generated by the PACK operation at reduction level 2 and its utilization is one. The REDUCE operation is then finished and the reduction tree is completely generated.

Note that the number of root servers may be greater than one, because the REDUCE operation finishes when all servers at the highest reduction level are unit servers. If a server is a unit server, then its dual server is a null server, which is packed into another server when the next PACK operation is performed.
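The server utilizations of the Figure 5 reduction tree can be reproduced with a sketch of the REDUCE loop. This is illustrative, not the paper's code; it tracks utilizations only (server IDs omitted) and assumes first-fit packing:

```python
def reduce_tree(utils, eps=1e-9):
    """REDUCE = DUAL after PACK, repeated until every packed server
    is a unit server (utilization one)."""
    def pack(us):                       # PACK: first-fit bins with U <= 1
        bins = []
        for u in us:
            for i in range(len(bins)):
                if bins[i] + u <= 1.0 + eps:
                    bins[i] = round(bins[i] + u, 9)
                    break
            else:
                bins.append(u)
        return bins

    levels = [list(utils)]
    while True:
        packed = pack(levels[-1])
        if all(abs(u - 1.0) < eps for u in packed):
            levels.append(packed)       # root (unit) server(s): done
            return levels
        levels.append([round(1.0 - u, 9) for u in packed])  # DUAL

# Figure 5's example: five 0.6 servers (0.4 task + 0.2 idle utilization each)
for lv, us in enumerate(reduce_tree([0.6] * 5)):
    print(lv, us)
# 0 [0.6, 0.6, 0.6, 0.6, 0.6]
# 1 [0.4, 0.4, 0.4, 0.4, 0.4]   duals of the singleton packs: S6..S10
# 2 [0.2, 0.2, 0.6]             duals of {S6,S7}, {S8,S9}, {S10}: S11..S13
# 3 [1.0]                       root unit server S14
```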

B. Online Phase

In the online phase, RUN schedules servers according to the following rules from [10], using Figure 5 for reference.

Rule 7 (From Rule IV.2 in [10]). If a packed server is running (circled), execute the child node with the earliest deadline among those children with work remaining; if a packed server is not running (not circled), execute none of its children.

Rule 8 (From Rule IV.3 in [10]). Execute (circle) the child (packed server) of a dual server if and only if the dual server is not running (not circled).

In the reduction tree, a thick arrow represents a scheduled server and a thin arrow represents a non-scheduled server according to each parent server. If a thick arrow from a server points to a task, the server schedules the task. In Figure 5, root server S14 is always running regardless of these rules, because a root server is always a unit server. Next, S14 makes scheduling decisions in EDF order. Server S12 is running at this time. Since server S12 is running, S8 and S9 are not running by Rule 7. Since servers S11 and S13 are not running, servers S7 and S10 are running by Rule 8. Servers S6, S8, and S9 are not running, and hence servers S1, S3, and S4 are running by Rule 8.

Figure 6 shows an example of RUN scheduling on three processors. Each server is executed on virtual processor VPL,v, where L represents the reduction level and v represents the virtual processor ID at each reduction level. The task set is shown in Figure 5; this example shows the scheduling decisions at time 5. This system has three processors P1, P2, and P3; reduction level 0 has three virtual processors VP0,1, VP0,2, and VP0,3; reduction level 1 has two virtual processors VP1,1 and VP1,2; and reduction level 2 has one virtual processor VP2,1.

Fig. 6. Example of RUN scheduling on three processors

RUN uses the following task-to-processor assignment scheme: (i) leave executing tasks on their current processors; (ii) assign idle tasks to their last-used processor when available to avoid unnecessary migrations; (iii) assign remaining tasks to free processors arbitrarily. According to this scheme, each server assigns tasks to processors P1, P2, or P3 in Figure 6. When a task completes its execution on one processor, the processor becomes idle until the server of the task exhausts its budget. For example, server S5, running on VP0,1, completes task τ5 on processor P1 at time 3, and P1 becomes idle (executes an idle task) during the time interval [3,4).

Fig. 7. RUN-RMWP algorithm:
1) When server Sl becomes ready: if the running server with the lowest priority has lower priority than Sl, preempt the running server.
2) When server Sl begins running on virtual processor VPL,v:
   a) If Sl is at reduction level 0, schedule ready tasks in Sl in RMWP order.
   b) Otherwise, Sl schedules the child servers of the reduction tree by Rules 7 and 8.
3) When server Sl goes to sleep (i.e., exhausts its budget): set the release time to the next earliest deadline of all tasks in Sl and put Sl to sleep until the release time of Sl.

V. RUN-RMWP ALGORITHM

This paper proposes RUN-RMWP, an optimal multiprocessor real-time scheduling algorithm that supports the extended imprecise computation model with harmonic periodic task sets. RUN-RMWP successfully schedules every synchronous task set with harmonic periods if and only if (1/M) Σ_{i=1}^n (mi + wi)/Ti ≤ 1, according to the extended imprecise computation model. As in RUN, RUN-RMWP makes server scheduling decisions in EDF order, and hence RUN-RMWP can use Rules 7 and 8. In addition, RUN-RMWP makes task scheduling decisions in RMWP order [16]. Using this combination of server and task scheduling, RUN-RMWP achieves optimal multiprocessor real-time scheduling for practical imprecise computation.

Figure 7 shows the RUN-RMWP algorithm. RUN-RMWP makes server scheduling decisions under the following conditions: (1) server Sl becomes ready; (2) server Sl begins running on virtual processor VPL,v; (3) server Sl goes to sleep (exhausts its budget). Conditions (1) and (3) in RUN-RMWP are the same as those in RUN, and hence RUN-RMWP and RUN generate the same server scheduling. In order to make task scheduling decisions under condition (2), this paper extends the technique to calculate the relative optimal optional deadline of each task in RMWP for RUN-RMWP. This is done because RMWP schedules tasks running on a processor while RUN-RMWP schedules servers running on a virtual processor, where the utilization of each server may be less than one (an important difference between processors and servers).

A. Optional Deadline

This paper now calculates the relative optional deadline of each task in RUN-RMWP using RTA-OODH [16]. The relative optional deadline of each task in RUN-RMWP depends on the execution time of the server, whose utilization may be less than one. That is to say, the optional deadline of each task is not fixed against the processor time because a server (except for the root server) may not always be running. First of all, this paper analyzes the assignable time Ak of task τk except wk in server Sl.

Theorem 9 (Assignable Time in RUN-RMWP). The assignable time Ak of task τk except wk in server Sl in RUN-RMWP on multiprocessors is

Ak = Dk · Ul^srv − wk − Σ_{∀i: τi,τk ∈ Sl ∧ pi > pk} Ik^i,  (1)

where Ik^i is from Theorem 1.

Proof: The differences between this theorem and Theorem 2 are that (1) the first parameter Dk is transformed into Dk · Ul^srv and (2) task τk is interfered with only by higher priority tasks in server Sl. The least common multiple of the periods of task τk and the tasks with higher priority than τk is equal to Tk (Dk) with harmonic periodic task sets. Next, Dk is changed into Dk · Ul^srv because this theorem considers that the utilization of server Sl is less than or equal to one. The worst case interference time of task τk considers only higher priority tasks in server Sl. Note that the worst case interference time of each job is constant with harmonic periodic task sets. Hence, the assignable time Ak of task τk except wk is equal to Equation 1, and this theorem holds.

Theorem 10 (Worst Case Interference Time in [0, ODk) in RUN-RMWP). The worst case interference time Ik of each task τk in server Sl in RUN-RMWP on multiprocessors is

Ik = Σ_{∀i: τi,τk ∈ Sl ∧ pi > pk} ( ⌈ODk/Ti⌉ mi + ⌈(ODk − ODi)/Ti⌉ wi ).

Proof: The difference between this theorem and Theorem 3 is that this theorem considers only tasks assigned to each server. This paper considers that one extended imprecise task τi is split into two general tasks τi^m and τi^w. Task τi^m releases its first job at time 0 and its period is Ti. Task τi^w releases its first job at time ODi and its period is Ti. Hence, in [0, ODk), task τk is interfered with by τi^m ⌈ODk/Ti⌉ times and by τi^w ⌈(ODk − ODi)/Ti⌉ times.

Fig. 8. Pseudo code of RTA-OODH in RUN-RMWP:

RTA-OODH(Γ) {
  while (τk ∈ Γ) {
    Ak = Dk · Ul^srv − wk − Σ_{∀i: τi,τk ∈ Sl ∧ pi > pk} Ik^i;
    Ik = 0;
    do {
      ODk = Ak + Ik;
      Ik = Σ_{∀i: τi,τk ∈ Sl ∧ pi > pk} (⌈ODk/Ti⌉ mi + ⌈(ODk − ODi)/Ti⌉ wi);
    } while (Ak + Ik > ODk);
  }
}

Theorem 11 (Optional Deadline in RUN-RMWP). The relative optimal optional deadline ODk of task τk in server Sl in RUN-RMWP on multiprocessors according to RTA-OODH is

ODk = Ak + Ik,  (2)

where Ak and Ik are from Theorems 9 and 10, respectively.

Proof: In Equation 2, the relative optional deadline ODk and assignable time Ak of task τk play the roles of the response time and the WCET in RTA [18], respectively. The optional part of task τk is not terminated or discarded before the relative optional deadline defined by Equation 2 occurs, as there is time to execute its optional part even if the ACET of task τk is always equal to its WCET. The assignable time of each job is constant with harmonic periodic task sets. The assignable time of task τk except wk in [ODk, Dk) is equal to wk, and hence the relative optional deadline given by Equation 2 is optimal.

Figure 8 shows the pseudo code of RTA-OODH in RUN-RMWP. This pseudo code calculates the relative optimal optional deadline by iteration, similarly to RTA-OODH in RMWP [16]. Using this offline calculation, RUN-RMWP avoids deadline misses caused by the overrun of an optional part online.

B. Optional Deadline Timer

The optional deadline in RUN-RMWP given by Theorem 11 depends on the utilization of the server, which may be less than one. In order to terminate the optional part and release the wind-up part at the optional deadline, RUN-RMWP must manage the optional deadline timer, which generates a timer interrupt at the optional deadline. This paper defines the current execution time and current execution cost of server Sl at reduction level 0 as ET(Sl) and EC(Sl), respectively, in order to keep track of server budgets. Note that ET(Sl) = 0 when server Sl is released and ET(Sl) = EC(Sl) when server Sl is completed. This paper explains how to manage the optional deadline timer in RUN-RMWP as follows:
• When server Sl is released, set ET(Sl) = 0.
• When server Sl begins running at time t, start all optional deadline timers of tasks in server Sl (if their optional deadlines have not expired), setting the optional deadline timer of task τi to t + ODi − ET(Sl).
• When server Sl is completed, set ET(Sl) = EC(Sl).
• When server Sl is preempted after running for a time interval ti, stop all optional deadline timers of tasks in server Sl (if started) and set ET(Sl) = ET(Sl) + ti.

RMWP-based algorithms other than RUN-RMWP set the optional deadline timer of each task when it becomes ready, because each processor is always running and the optional deadline is fixed against the processor time. However, RUN-RMWP sets the optional deadline timer of each task when its server begins running, because each server is not always running and the optional deadline is not fixed against the processor time.
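The timer bookkeeping rules above can be sketched for a single task. This is an illustrative sketch, not the paper's code; the trace values (OD2 = 4, the server running in [0,3) and again from time 6) follow the paper's τ2 example:

```python
class ODTimer:
    """Optional deadline timer for one task in a server S_l at reduction
    level 0; ET tracks the server's current execution time."""
    def __init__(self, OD):
        self.OD = OD        # relative optional deadline of the task
        self.ET = 0         # server execution time consumed so far
        self.expiry = None  # absolute time the timer will fire, if armed

    def release(self):               # rule: server released
        self.ET = 0

    def begin_running(self, t):      # rule: server begins running at time t
        if self.ET < self.OD:        # only if the optional deadline has not passed
            self.expiry = t + self.OD - self.ET

    def preempt(self, ran):          # rule: server preempted after running `ran`
        self.ET += ran
        self.expiry = None           # stop the timer

timer = ODTimer(OD=4)
timer.release()
timer.begin_running(0)
print(timer.expiry)   # → 4 (would fire at t = 4 if the server kept running)
timer.preempt(ran=3)
timer.begin_running(6)
print(timer.expiry)   # → 7: the optional part is terminated at time 7
```

Because the server was preempted for the interval [3,6), the timer is re-armed relative to the server's consumed execution time, not the wall-clock time.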

Each task executes its optional part without a deadline miss, thanks to its optional deadline timer, and hence RUN-RMWP can achieve optimality and improve the QoS.

Fig. 9. Example of RUN-RMWP scheduling on three processors

TABLE I. TASK SET
Task | mi | oi | wi | ODi | Di | Ti
τ1   |  1 |  2 |  1 |   2 |  5 |  5
τ2   |  2 |  4 |  2 |   4 | 10 | 10
τ3   |  3 |  6 |  3 |   6 | 15 | 15
τ4   |  2 |  4 |  2 |   4 | 10 | 10
τ5   |  1 |  2 |  1 |   2 |  5 |  5

This optional deadline timer can also be applied to other task models that allow a subtask to be released at an offset from the release time [19]¹.

C. Example

Figure 9 shows an example of RUN-RMWP scheduling on three processors using the task set listed in Table I. The utilization of each task is equal to that shown in Figure 5 because Ci = mi + wi and all relative deadlines and periods have the same value. Server scheduling decisions in RUN-RMWP are the same as in RUN; hence this paper uses Figure 6 for reference. The optional deadline of each task is calculated by Theorem 11. Note that the optional deadline timer of each task may be restarted because the utilization of each server is 0.6 (less than 1.0). The management of the optional deadline timer of task τ2 in server S2 proceeds as follows (again using Figure 6 for reference):

• At time 0, server S2 is released and ET(S2) is set to zero. At the same time, server S2 begins running and the optional deadline timer for τ2 is set to t + OD2 − ET(S2) = 0 + 4 − 0 = 4. Note that EC(S2) = D2 · U2srv = 10 · 0.6 = 6.
• At time 3, server S2 is preempted after running for time interval 3; the optional deadline timer for τ2 is stopped and ET(S2) = 3.
• At time 6, server S2 begins running and the optional deadline timer for τ2 is set to t + OD2 − ET(S2) = 6 + 4 − 3 = 7.
• At time 7, the optional deadline timer for task τ2 expires, and hence task τ2 terminates its optional part and starts its wind-up part.
• At time 9, task τ2 completes its wind-up part. At the same time, server S2 is also completed (exhausts its budget) after running for time interval 3, because ET(S2) = 6 (i.e., ET(S2) = EC(S2)).

¹This optional deadline timer is similar to the deadline move timer in [19].
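The bookkeeping in the example above can be traced directly; this sketch (an illustrative assumption, not the paper's implementation) reproduces the arithmetic for task τ2 in server S2:

```python
# Trace of the optional deadline timer of tau_2 in server S_2
# (Table I: OD_2 = 4, D_2 = 10; server utilization U_2^srv = 0.6).
OD2 = 4
D2, U2_srv = 10, 0.6

ET = 0.0              # ET(S_2) = 0: server S_2 released at time 0
EC = D2 * U2_srv      # EC(S_2) = 10 * 0.6 = 6

# time 0: S_2 begins running; timer := t + OD_2 - ET(S_2)
timer = 0 + OD2 - ET
assert timer == 4

# time 3: S_2 preempted after running for interval 3; timer stopped
ET += 3

# time 6: S_2 begins running again; timer restarted
timer = 6 + OD2 - ET
assert timer == 7     # tau_2 terminates its optional part at time 7

# time 9: tau_2 completes its wind-up part; S_2 ran for another interval 3
ET += 3
assert ET == EC       # ET(S_2) = EC(S_2) = 6: server S_2 completed
```

Note how the term −ET(S2) shifts the timer by the processor time the server has already consumed, which is why the timer can be restarted at a later absolute time than its original expiry.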

D. Schedulability Analysis

This paper analyzes the optimality of RUN-RMWP, based on the following theorems.

Theorem 12 (From Theorem IV.3 in [10]). RUN is an optimal multiprocessor real-time scheduling algorithm.

Theorem 13 (From Theorem 8 in [16]). RMWP is an optimal uniprocessor real-time scheduling algorithm with harmonic periodic task sets.

Using these theorems, the optimality of RUN-RMWP with harmonic periodic task sets is analyzed as follows.

Theorem 14 (Optimality of RUN-RMWP). RUN-RMWP is an optimal multiprocessor scheduling algorithm with harmonic periodic task sets.

Proof: RUN-RMWP and RUN generate the same reduction tree and server scheduling, and hence the server scheduling decisions made by RUN-RMWP are optimal: RUN transforms uniprocessor EDF scheduling into multiprocessor scheduling, and since EDF is an optimal uniprocessor real-time scheduling algorithm, RUN is optimal by Theorem 12. In addition, RMWP is an optimal uniprocessor scheduling algorithm with harmonic periodic task sets by Theorem 13, and hence the task scheduling decisions in RUN-RMWP are also optimal. Hence, this theorem holds.

By Theorem 14, RUN-RMWP achieves optimal multiprocessor real-time scheduling and shows how to apply RUN to imprecise computation with harmonic periodic task sets.

VI. SIMULATION STUDIES

A. Simulation Setups

The simulation uses 1,000 task sets at each system utilization. The system utilization U is selected from [0.3, 0.35, 0.4, ..., 1.0]. The RUN-RMWP, RUN, G-RMWP, and P-RMWP algorithms are evaluated, and the number of processors M is 4. Each Ui is selected from [0.02, 0.03, 0.04, ..., 1.0] and, for the imprecise-based algorithms (i.e., RUN-RMWP, G-RMWP, and P-RMWP), is split into two utilizations assigned to mi and wi, respectively: mi is first selected from [0.01, 0.02, ..., Ui − 0.01] and wi is then set to Ui − mi. RUN does not perform this splitting. This paper assumes that the ACET of each task fluctuates and is usually shorter than its WCET; the ratio of ACET/WCET is set to 1.0, [0.75, 1.0], and [0.5, 1.0]. The period Ti of each task τi is selected from [100, 200, 400, 800, 1600]. Each Uio is selected from [0.01, 1.0]; the corresponding results are labeled RUN-RMWP-OP² when the evaluated algorithm is RUN-RMWP. If Uio is always equal to zero, the result is labeled RUN-RMWP. The simulation length of each task set is the hyperperiod of all tasks.

²OP stands for optional part.
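The task-generation procedure described above can be sketched as follows (the RNG, data layout, and helper name are assumptions; packing tasks to hit each target system utilization U is omitted):

```python
import random

def make_task(rng):
    """Generate one imprecise task following the setup described above."""
    # Ui drawn from the grid [0.02, 0.03, ..., 1.0]
    Ui = rng.randrange(2, 101) / 100
    # split: mi's utilization from [0.01, ..., Ui - 0.01], wi gets the rest
    Um = rng.randrange(1, round(Ui * 100)) / 100
    Uw = round(Ui - Um, 2)
    # harmonic period grid
    Ti = rng.choice([100, 200, 400, 800, 1600])
    return {"T": Ti, "m": Um * Ti, "w": Uw * Ti, "U": Ui}

rng = random.Random(0)            # seeded for reproducibility
tasks = [make_task(rng) for _ in range(100)]
```

By construction every task keeps a non-zero wind-up part (Uw ≥ 0.01), matching the splitting rule in the text.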

All tasks are assigned to servers in RUN-RMWP and RUN, or to processors in P-RMWP, by the worst-fit decreasing utilization heuristic. In order to achieve full system utilization, RUN-RMWP and RUN assign idle tasks to each server at reduction level 0 uniformly, as in the task set shown in Figure 5. If the utilization of server Sl at reduction level 0 exceeds one, the overutilization of the server (i.e., Ulsrv − 1) is reassigned to other servers at reduction level 0 uniformly until the remaining utilization of idle tasks becomes zero. The performance metrics are defined by the following equations:

Success Ratio = (# of successfully scheduled task sets) / (# of scheduled task sets)

# of Preemptions per Job = (1/n) Σi (# of preemptions of task τi) / (# of jobs of task τi)

# of Migrations per Job = (1/n) Σi (# of migrations of task τi) / (# of jobs of task τi)
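The worst-fit decreasing utilization assignment mentioned above can be sketched as follows (a standard formulation of the heuristic; the paper's idle-task balancing at reduction level 0 is omitted):

```python
def worst_fit_decreasing(utils, n_bins):
    """Assign task utilizations to n_bins servers/processors:
    sort by decreasing utilization, then always place the next task
    on the currently least-loaded bin (worst fit)."""
    bins = [0.0] * n_bins                  # accumulated utilization per bin
    assign = [[] for _ in range(n_bins)]   # task indices per bin
    for i, u in sorted(enumerate(utils), key=lambda x: -x[1]):
        b = min(range(n_bins), key=lambda j: bins[j])  # emptiest bin
        bins[b] += u
        assign[b].append(i)
    return bins, assign

bins, assign = worst_fit_decreasing([0.6, 0.5, 0.4, 0.3], 2)
```

Worst fit spreads load evenly across bins, which is why it is a common choice for balancing server utilizations before the RUN reduction.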

If the success ratio at a given system utilization is less than one, the results at that system utilization are omitted from the other performance metrics. The success ratio is evaluated in order to measure the improvement of RUN-RMWP over the other multiprocessor semi-fixed-priority scheduling algorithms with respect to schedulability.

B. Simulation Results

Figure 10 shows the success ratio of the simulation results. The success ratios of RUN-RMWP and RUN are always one because both algorithms are optimal. In Figures 10(a), 10(b), and 10(c), G-RMWP drops its success ratio when the system utilization exceeds 0.35, 0.45, and 0.65, respectively. That is to say, G-RMWP can improve the success ratio if the ACET of each task is shorter than its WCET, thanks to its global scheduling. In contrast, P-RMWP generates the same results in all three cases and always drops the success ratio when the system utilization exceeds 0.7, because partitioned scheduling (including P-RMWP) checks schedulability offline when each task is assigned to a processor.

Figure 11 shows the simulation results for the number of preemptions per job. RUN-RMWP has numbers similar to RUN if the optional part of each task is not executed. RUN-RMWP-OP has slightly more preemptions than RUN because the optional part of each task is executed, and hence there is a trade-off between preemption and QoS. The numbers of preemptions per job in RUN-RMWP, RUN-RMWP-OP, and RUN are small: at most 2.82, 3.38, and 2.52, respectively. Therefore, RUN-RMWP and RUN-RMWP-OP inherit the advantage of RUN with respect to a small number of preemptions per job. P-RMWP performs as well as RUN-RMWP when the system utilization does not exceed 0.7; beyond that point, the success ratio of P-RMWP drops.
In Figures 11(a), 11(b), and 11(c), if the ACET of each task is always equal to its WCET, RMWP-based algorithms have numbers of preemptions per job comparable to RUN. In contrast, if the ACET of each task is shorter than its WCET, the remaining processor time available to execute the optional part of each task increases, so that RMWP-based algorithms generate more preemptions per job than RUN because of the trade-off between preemption and QoS. G-RMWP gives results only when the system utilization does not exceed 0.35, 0.45, and 0.65, respectively, and has the largest numbers over all results. When the ratio of ACET/WCET is [0.75, 1.0], the number of preemptions per job in RUN-RMWP-OP is the largest over all ratios of ACET/WCET because there are many opportunities for terminating the optional part of each task, incurring many preemptions. On the other hand, when the ratio of ACET/WCET is [0.5, 1.0], the number of preemptions per job in RUN-RMWP-OP is smaller than when the ratio is [0.75, 1.0] because there are many opportunities to complete the optional part of each task before it is preempted by higher priority tasks.

Figure 12 shows the simulation results for the number of migrations per job. When the system utilization does not exceed 0.65, the numbers of migrations per job in RUN-RMWP, RUN-RMWP-OP, and RUN are zero because assigning tasks to processors succeeds. When the system utilization exceeds 0.65, the numbers of migrations per job are greater than zero but are at most 2.27, 2.24, and 1.24, respectively. Therefore, RUN-RMWP and RUN-RMWP-OP also inherit the advantage of RUN with respect to a small number of migrations per job.

In Figures 12(a), 12(b), and 12(c), G-RMWP and G-RMWP-OP show results only when the system utilization does not exceed 0.35, 0.45, and 0.65, respectively, and have higher numbers of migrations per job than the other algorithms. In particular, G-RMWP-OP is the highest because the optional part of each task is frequently migrated and executed on different processors under global scheduling.

Unlike Figure 11, when the ACET of each task decreases, the number of migrations per job in RUN-RMWP-OP also decreases because there are many opportunities to complete the mandatory and wind-up parts of each task well before they would be migrated. From these results, RUN-RMWP outperforms G-RMWP and P-RMWP over all results. In addition, it has a slightly larger number of preemptions/migrations per job than RUN because it supports the extended imprecise computation model. In real systems, the additional overheads of imprecise computation using G-RMWP and P-RMWP have been investigated and found to be low, comparable to the overheads of G-RM and P-RM, respectively [15]. Therefore, this paper expects that the overhead of RUN-RMWP will be comparable to that of RUN in real systems.

VII. RELATED WORK

Optimal multiprocessor real-time scheduling was first achieved by the Pfair algorithm [6], which keeps execution times close to fluid scheduling. Pfair incurs significant run-time overhead because of its quantum-based scheduling approach. The practicality of Pfair was investigated by Brandenburg, who compared PD² [20], an extension of Pfair that reduces the number of preemptions/migrations, with other non-optimal multiprocessor real-time scheduling algorithms on Intel's 24-core processors [21]. The experimental results show that PD² has the worst schedulability of all evaluated algorithms, and hence Pfair does not work well in real systems.

[Fig. 10. Success ratio of RUN-RMWP, RUN, G-RMWP, and P-RMWP against system utilization; panels (a) ACET/WCET = 1.0, (b) ACET/WCET = [0.75, 1.0], and (c) ACET/WCET = [0.5, 1.0].]

[Fig. 11. Number of preemptions per job of RUN-RMWP, RUN-RMWP-OP, RUN, G-RMWP, G-RMWP-OP, P-RMWP, and P-RMWP-OP against system utilization; panels (a)-(c) as in Fig. 10.]

[Fig. 12. Number of migrations per job of RUN-RMWP, RUN-RMWP-OP, RUN, G-RMWP, and G-RMWP-OP against system utilization; panels (a)-(c) as in Fig. 10.]

There are several other optimal multiprocessor real-time scheduling algorithms: Largest Local Remaining Execution time First (LLREF) [7], EDF with task splitting and K processors in a group (EKG) [8], Deadline Partitioning Wrap (DP-Wrap) [9], and RUN [10]. Simulation results by Regnier et al. show that RUN outperforms LLREF, EKG, and DP-Wrap in the number of preemptions/migrations per job and scales well as the number of tasks/processors increases [10].

There are imprecise-based real-time scheduling algorithms for uniprocessors, including Mandatory-First with Earliest Deadline [22] and Optimization with Least-Utilization [23]. However, they support only the imprecise computation model [12] and not the extended imprecise computation model [13].

Mandatory-First with Wind-up Part [13] and Slack Stealer for Optional Parts [24] were proposed to support the extended imprecise computation model on uniprocessors, but they do not support multiprocessors. G-RMWP [14] and P-RMWP [15] support multiprocessors in the extended imprecise computation model, but they are not optimal. In contrast, RUN-RMWP is optimal and supports the extended imprecise computation model, and hence has an advantage over G-RMWP and P-RMWP.

VIII. CONCLUSION

This paper proposed RUN-RMWP, a new algorithm that achieves both optimal multiprocessor real-time scheduling and practical imprecise computation with harmonic periodic task sets. RUN-RMWP integrates RUN and RMWP to inherit their respective advantages: a small number of preemptions/migrations per job and support for the extended imprecise computation model. RUN-RMWP calculates the optional deadline of each task using RTA-OODH, which exploits RMWP with harmonic periodic task sets. In addition, RUN-RMWP manages an optional deadline timer to terminate the optional part and release the wind-up part of each task in each server. Simulation results show that RUN-RMWP outperforms other non-optimal multiprocessor real-time scheduling algorithms, including G-RMWP and P-RMWP, with respect to schedulability and the number of preemptions/migrations per job. The number of preemptions/migrations per job in RUN-RMWP is slightly larger than that in RUN; however, RUN-RMWP supports the extended imprecise computation model, and hence is well suited to imprecise real-time applications such as humanoid robots.

In future work, RUN-RMWP will be implemented in RT-Est [25], a real-time operating system for semi-fixed-priority scheduling algorithms in the extended imprecise computation model. The overhead-aware schedulability of RUN-RMWP will be analyzed using the preemption-aware interrupt accounting method [21]. The integration of RMWP++ [26] and RUN-RMWP is also of interest. In addition, RUN-RMWP will be adapted to the practical imprecise computation model [27], which supports multiple mandatory parts, and to the parallel-extended imprecise computation model [28].

ACKNOWLEDGEMENTS

The author is thankful to Masayoshi Takasu for many helpful discussions.

REFERENCES

[1] I. Mizuuchi, Y. Nakanishi, Y. Sodeyama, Y. Namiki, T. Nishino, N. Muramatsu, J. Urata, K. Hongo, T. Yoshikai, and M. Inaba, "Advanced Musculoskeletal Humanoid Kojiro," in Proceedings of the 2007 IEEE-RAS International Conference on Humanoid Robots, Nov. 2007, pp. 294–299.
[2] T. Taira, N. Kamata, and N. Yamasaki, "Design and Implementation of Reconfigurable Modular Robot Architecture," in Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Aug. 2005, pp. 3566–3571.
[3] M. Fan and G. Quan, "Harmonic Semi-Partitioned Scheduling For Fixed-Priority Real-Time Tasks On Multi-Core Platform," in 2012 Design, Automation & Test in Europe, Mar. 2012, pp. 503–508.
[4] V. Bonifaci, A. Marchetti-Spaccamela, N. Megow, and A. Wiese, "Polynomial-Time Exact Schedulability Tests for Harmonic Real-Time Tasks," in Proceedings of the 34th IEEE Real-Time Systems Symposium, Dec. 2013, pp. 236–245.
[5] B. Andersson and J. Jonsson, "The Utilization Bounds of Partitioned and Pfair Static-Priority Scheduling on Multiprocessors are 50%," in Proceedings of the 15th Euromicro Conference on Real-Time Systems, Jul. 2003, pp. 33–40.
[6] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel, "Proportionate Progress: A Notion of Fairness in Resource Allocation," Algorithmica, vol. 15, no. 6, pp. 600–625, 1996.
[7] H. Cho, B. Ravindran, and E. D. Jensen, "An Optimal Real-Time Scheduling Algorithm for Multiprocessors," in Proceedings of the 27th IEEE Real-Time Systems Symposium, Dec. 2006, pp. 101–110.
[8] B. Andersson and E. Tovar, "Multiprocessor Scheduling with Few Preemptions," in Proceedings of the 12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Aug. 2006, pp. 322–334.
[9] G. Levin, S. Funk, C. Sadowski, I. Pye, and S. Brandt, "DP-FAIR: A Simple Model for Understanding Optimal Multiprocessor Scheduling," in Proceedings of the 22nd Euromicro Conference on Real-Time Systems, Jul. 2010, pp. 3–13.
[10] P. Regnier, G. Lima, E. Massa, G. Levin, and S. Brandt, "RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor," in Proceedings of the 32nd IEEE Real-Time Systems Symposium, Nov. 2011, pp. 104–115.
[11] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM, vol. 20, no. 1, pp. 46–61, 1973.
[12] K. Lin, S. Natarajan, and J. Liu, "Imprecise Results: Utilizing Partial Computations in Real-Time Systems," in Proceedings of the 8th IEEE Real-Time Systems Symposium, Dec. 1987, pp. 210–217.
[13] H. Kobayashi and N. Yamasaki, "An Integrated Approach for Implementing Imprecise Computations," IEICE Transactions on Information and Systems, vol. 86, no. 10, pp. 2040–2048, 2003.
[14] H. Chishiro and N. Yamasaki, "Global Semi-Fixed-Priority Scheduling on Multiprocessors," in Proceedings of the 17th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Aug. 2011, pp. 218–223.
[15] ——, "Experimental Evaluation of Global and Partitioned Semi-Fixed-Priority Scheduling Algorithms on Multicore Systems," in Proceedings of the 15th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, Apr. 2012, pp. 127–134.
[16] H. Chishiro, A. Takeda, K. Funaoka, and N. Yamasaki, "Semi-Fixed-Priority Scheduling: New Priority Assignment Policy for Practical Imprecise Computation," in Proceedings of the 16th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Aug. 2010, pp. 339–348.
[17] T. P. Baker, "An Analysis of Fixed-Priority Schedulability on a Multiprocessor," Real-Time Systems, vol. 32, no. 1-2, pp. 49–71, 2006.
[18] N. C. Audsley, A. Burns, M. F. Richardson, K. Tindell, and A. J. Wellings, "Applying New Scheduling Theory to Static Priority Preemptive Scheduling," Software Engineering Journal, vol. 8, no. 5, pp. 284–292, Sep. 1993.
[19] J. P. Erickson and J. H. Anderson, "Reducing Tardiness under Global Scheduling by Splitting Jobs," in Proceedings of the 24th Euromicro Conference on Real-Time Systems, Jul. 2013, pp. 14–24.
[20] J. H. Anderson and A. Srinivasan, "Mixed Pfair/ERfair Scheduling of Asynchronous Periodic Tasks," Journal of Computer and System Sciences, vol. 68, no. 1, pp. 157–204, 2004.
[21] B. B. Brandenburg, "Scheduling and Locking in Multiprocessor Real-Time Operating Systems," Ph.D. dissertation, The University of North Carolina at Chapel Hill, 2011.
[22] S. K. Baruah and M. E. Hickey, "Competitive On-line Scheduling of Imprecise Computations," IEEE Transactions on Computers, vol. 47, no. 9, pp. 1027–1032, Sep. 1998.
[23] H. Aydin, R. Melhem, D. Mosse, and P. Mejía-Alvarez, "Optimal Reward-Based Scheduling of Periodic Real-Time Tasks," in Proceedings of the 20th IEEE Real-Time Systems Symposium, Dec. 1999, pp. 79–89.
[24] H. Kobayashi and N. Yamasaki, "RT-Frontier: A Real-Time Operating System for Practical Imprecise Computation," in Proceedings of the 10th IEEE Real-Time and Embedded Technology and Applications Symposium, May 2004, pp. 255–264.
[25] H. Chishiro and N. Yamasaki, "RT-Est: Real-Time Operating System for Semi-Fixed-Priority Scheduling Algorithms," in Proceedings of the 2011 International Symposium on Embedded and Pervasive Systems, Oct. 2011, pp. 358–365.
[26] ——, "Zero-Jitter Semi-Fixed-Priority Scheduling with Harmonic Periodic Task Sets," International Journal of Computers and Their Applications, vol. 22, no. 3, pp. 118–126, Sep. 2015.
[27] ——, "Semi-Fixed-Priority Scheduling with Multiple Mandatory Parts," in Proceedings of the 16th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, Jun. 2013, pp. 1–8.
[28] H. Chishiro, "RT-Seed: Real-Time Middleware for Semi-Fixed-Priority Scheduling," in Proceedings of the 19th IEEE International Symposium on Real-Time Computing, May 2016, pp. 124–133.
