Unavailable Time Aware Scheduling of Hybrid Task on Heterogeneous Distributed System

Resource allocation for tasks in a heterogeneous distributed system is a well-known NP-hard problem. Distributing tasks to suitable processors so that the makespan is minimized is difficult, and the problem becomes even more complex and challenging when the processors have unavailable time and the tasks are of various types. This paper investigates a resource allocation problem for hybrid tasks, comprising both a divisible task and bag-of-tasks (BoT), in a heterogeneous distributed system whose processors have unavailable time. First, a mathematical model that minimizes the makespan of the hybrid tasks under processor unavailability is established. Second, we propose a scheduling algorithm, referred to as the bag-of-tasks allocate-pull and divisible task allocation (BoTAPDTA) algorithm, for handling hybrid tasks on heterogeneous distributed systems. In addition, to solve the optimization model efficiently, a genetic algorithm (GA) is proposed. To reduce the search space and solve the optimization model effectively, a two-step scheduling algorithm (TSGA) is designed, which first allocates the bag-of-tasks using the genetic algorithm and then assigns the divisible task to processors as in BoTAPDTA. Finally, numerical simulation experiments are conducted, and the experimental results indicate the effectiveness of the proposed model and algorithms.


I. INTRODUCTION
Heterogeneous distributed systems have emerged as common platforms for handling large-scale scientific and commercial applications in various fields, such as image processing, signal processing, pattern matching in text, and many scientific computation problems [1], [2], [3]. To improve system performance, many task scheduling algorithms for heterogeneous or homogeneous distributed systems have been proposed in the past decades [4], [5], [6]. Wang [4] proposed a multi-objective bilevel programming model for energy- and locality-aware multi-job scheduling in heterogeneous systems. The work in [5] proposed two algorithms to schedule a bag-of-tasks (BoT) on a heterogeneous system so as to minimize the makespan and the energy consumption. In [7], the reliability cost, defined as the product of the processor failure rate and the task processing time, is incorporated into a scheduling algorithm for tasks with precedence constraints on heterogeneous systems. Lee and Zomaya [8] classified BoT tasks into computation-intensive and data-intensive ones and presented a task scheduling algorithm for each class in Grid computing systems. Anglano et al. [9] evaluated the performance of five knowledge-free task scheduling algorithms for scheduling multiple BoTs in a desktop Grid computing system. In addition, the performance of several BoT scheduling solutions in large-scale distributed systems has been studied in [10]. For BoT, other studies aim to maximize throughput by establishing linear or nonlinear programs [11]; these works focus on steady-state optimization problems and concentrate on numerous bag-of-tasks consisting of independent and similar tasks. To schedule concurrent bag-of-tasks, online and off-line scheduling algorithms are presented by Benoit et al. [11].
In [12], a decentralized scheduling algorithm that minimizes the maximum stretch among user-submitted tasks is designed. Yang et al. [13] take the constraints of time, cost, and security into consideration and design a scheduling algorithm for data-intensive tasks. The work in [14] investigated two problems: optimizing the makespan of the tasks under an energy constraint, and minimizing energy consumption subject to a makespan constraint. However, that paper studied stochastically robust static resource allocation, with respect to makespan and energy, for bag-of-tasks (BoT) on a heterogeneous computing system. A multi-objective optimization model that minimizes makespan and resource cost is established in [15]. To solve the optimization model, a scheduling algorithm based on the ordinal optimization method is designed. However, the scheduling algorithm is inefficient when the number of tasks or processing nodes is large.
In scheduling theory, a fundamental assumption is that all processors participating in the processing of tasks are always available [16]. However, this assumption may be unreasonable: maintenance requirements, breakdowns, or other constraints can make processors unavailable for executing tasks. In [17], the availability of a processor is defined as the ratio of its total available time to the total time during a given interval. Some previous work has investigated task scheduling with processor availability constraints [18]. Adiri et al. [19] investigated the scheduling problem with availability constraints in a single-machine system. To minimize the maximum lateness of $n$ jobs, the work in [20] studied the problem on homogeneous machines under machine availability and eligibility constraints. A branch-and-bound method is proposed in [21] to solve the single-machine scheduling problem with machine availability constraints. To minimize the total flow time, the work in [22] investigated the non-permutation flow shop scheduling problem with learning effects and machine availability constraints. To minimize the makespan, the two-machine permutation flow shop scheduling problem with an availability constraint is investigated in [23]; however, that work assumes that the availability constraint is imposed only on the first machine. The work in [24] developed a Hybrid Heuristic-Ant Colony Optimization (H2ACO) algorithm for multiclass tasks on heterogeneous distributed systems with availability constraints; H2ACO achieves a good trade-off between the availability and the makespan of the tasks. An availability-aware scheduling model is investigated in [25], and an optimization algorithm that increases availability and minimizes the makespan of tasks in heterogeneous systems is proposed.
The work in [26] proposed a quantum-behaved particle swarm optimization algorithm to optimize availability-aware task scheduling on heterogeneous systems. A novel distributed availability-aware adaptive rate-allocation scheduling algorithm for multimedia tasks in heterogeneous wireless networks is proposed in [27].
The divisible task has been studied extensively over the last several decades, resulting in a cohesive theory called Divisible Load Theory (DLT). In our work, we investigate a resource allocation problem for hybrid tasks, comprising both a divisible task and bag-of-tasks, in a heterogeneous distributed system whose processors have unavailable time. The major contributions of this study are summarized as follows:
• To minimize the makespan of the hybrid tasks, a mathematical optimization model that takes the unavailable time constraints of processors into consideration is established.
• We propose an algorithm, referred to as the Bag-of-Tasks Allocate-Pull and Divisible Task Allocation (BoTAPDTA) algorithm, for the hybrid task scheduling problem.
• To solve the optimization model effectively, a genetic algorithm (GA) is proposed.
• To reduce the search space and solve the optimization model effectively, a two-step scheduling algorithm (TSGA) is designed, which first allocates the bag-of-tasks using the genetic algorithm and then assigns the divisible task to processors as in BoTAPDTA.
• The effectiveness of the proposed algorithms is analyzed on two systems of different size, which vary in both the number of processors and the number of tasks.
The rest of this paper is organized as follows. Section II gives the system and task description and establishes the mathematical model. The BoTAPDTA scheduling algorithm is described in Section III. Section IV proposes a genetic algorithm to solve the optimization model effectively. The two-step hybrid task scheduling algorithm is explained in Section V. Section VI presents simulation results to evaluate the algorithms. The paper is concluded with a summary and future work in Section VII.

A. System and Task Description
In our work, the heterogeneous distributed system has $N + 1$ processors: one master processor and $N$ slave processors. $P_0$ denotes the master processor, and the slave processors are denoted by $\{P_1, P_2, \cdots, P_N\}$. Each slave processor $P_i$ $(i = 1, 2, \cdots, N)$ is associated with a speed index $w_i$, which is the time taken to process a unit workload on $P_i$. The slave processor is the basic processing unit in our research. For reasons such as shutdown or maintenance requirements, slave processors have some unavailable time. $[a_i^j, b_i^j]$ $(i = 1, 2, \cdots, N;\ j = 1, 2, \cdots, n_i)$ denotes the $j$-th unavailable segment of processor $P_i$, where $n_i$ is the number of unavailable segments of $P_i$. For convenience, the $j$-th available segment of processor $P_i$ is denoted by $[c_i^j, d_i^j]$ $(i = 1, 2, \cdots, N;\ j = 1, 2, \cdots, m_i)$, where $m_i$ is the number of available segments of $P_i$. For ease of understanding, the system model is shown in the accompanying figure.
In our study, hybrid tasks comprising both a divisible task and bag-of-tasks are investigated. Bag-of-tasks (BoT) is a representative type of workload that consists of numerous independent tasks, which can be processed in parallel without communication. A divisible task (DT) can be partitioned into a large number of load fractions that can be processed independently on the processors in parallel, since there are no precedence relationships among the fractions. That is to say, all the tasks investigated in our work are independent. The workload includes $N_\tau + 1$ independent tasks, and the $i$-th $(0 \le i \le N_\tau)$ task is denoted by $\tau_i$, where $\tau_0$ is the divisible task and $\tau_i$ $(1 \le i \le N_\tau)$ are the bag-of-tasks.
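The relation between the unavailable segments $[a_i^j, b_i^j]$ and the available segments $[c_i^j, d_i^j]$ of one processor can be illustrated with a short Python sketch. It is not taken from the paper: the function name `available_segments`, the time origin 0, and the open-ended last segment are assumptions made for illustration.

```python
import math

def available_segments(unavailable, horizon=math.inf):
    """Return the available segments (c, d) that complement the
    unavailable segments (a, b) on the time axis [0, horizon)."""
    segments = []
    start = 0.0
    for a, b in sorted(unavailable):
        if a > start:              # idle gap before this unavailable segment
            segments.append((start, a))
        start = max(start, b)      # processing can resume once it ends
    if start < horizon:            # the last available segment may be open-ended
        segments.append((start, horizon))
    return segments

# Hypothetical processor with two maintenance windows.
print(available_segments([(10, 15), (30, 40)]))
```

With two unavailable segments the processor has three available ones, the last of which is unbounded, so here $n_i = 2$ and $m_i = 3$.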
Following previous studies [28], [29], we assume that the load size $\tau_i^\sigma$ $(0 \le i \le N_\tau)$ is known once a task arrives, through prediction mechanisms such as code profiling and statistical prediction. As in previous work [30], the bag-of-tasks have different computing requirements, and we assume that each task can only be processed by some specific processors. $\Omega_i$ is the set of processors to which $\tau_i$ can be allocated. We assume that the divisible task $\tau_0$ can be processed on all processors in the heterogeneous distributed system. Similarly, we assume that the bag-of-tasks are computation-intensive, as in prior work [31]. That is to say, the time consumed by input data transmission has little influence on the completion time and hence can be neglected. In our work, the transmission time of the divisible task is neglected as well.

B. Mathematical Modeling
The task scheduling problem investigated in this paper is to schedule all $N_\tau + 1$ tasks onto the $N$ processors of the heterogeneous distributed system so as to minimize the makespan of the tasks. Next, we give the mathematical formulation of this optimization problem.
1) Objective Function: Generally speaking, the makespan is the latest finish time among the processors. If $T_i$ denotes the finish time of processor $P_i$, the makespan $T$ of the tasks is given by Eq. (1).
In our work, the purpose is to minimize the makespan of the hybrid tasks in the heterogeneous distributed system with unavailable time considered. So, the objective function can be described as Eq. (2).
As Eq. (1) shows, the finish time of each processor must be calculated. For a specific processor $P_i$, its processing time diagram is shown in Fig. 2. The diagram shows that the finish time of $P_i$ is determined by the last available time segment that has tasks assigned to it. Let $l$ $(1 \le l \le m_i)$ denote the last segment of $P_i$ with assigned tasks and $\delta_i^l$ the set of BoT tasks assigned to segment $l$. $\alpha_i^l$ is the ratio of the fraction size assigned to segment $l$ on processor $P_i$ to the workload $\tau_0^\sigma$. So, the finish time of processor $P_i$ can be calculated by Eq. (3).
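The displayed equations are not reproduced in this text. Based only on the definitions given above, Eq. (1) to Eq. (3) plausibly take the following form; the exact notation, in particular the summation over $\delta_i^l$, is an assumption rather than the authors' typeset version.

```latex
\begin{align}
T &= \max_{1 \le i \le N} T_i \tag{1}\\
\min\; T &= \min \max_{1 \le i \le N} T_i \tag{2}\\
T_i &= c_i^{l} + w_i \Bigl( \sum_{\tau_k \in \delta_i^{l}} \tau_k^{\sigma}
       + \alpha_i^{l}\, \tau_0^{\sigma} \Bigr) \tag{3}
\end{align}
```

The third line reads as follows: segment $l$ starts at $c_i^l$, and the work placed in it (the BoT tasks in $\delta_i^l$ plus the divisible fraction $\alpha_i^l \tau_0^\sigma$) takes $w_i$ time units per unit of workload.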
Then, we can rewrite Eq. (2) as Eq. (4).
2) Constraint Conditions: (a) All the BoT must be assigned to processors that satisfy their requirements: $\Theta = (\theta_{ij}^k)_{N_\tau \times N \times m_j}$ is a binary matrix, where $\theta_{ij}^k = 1$ if and only if task $\tau_i$ is assigned to the $k$-th $(1 \le k \le m_j)$ available time segment of processor $P_j$ $(1 \le j \le N)$; otherwise $\theta_{ij}^k = 0$. Consequently, if $\theta_{ij}^k = 1$, the processor $P_j$ to which task $\tau_i$ $(1 \le i \le N_\tau)$ is assigned must be in the set $\Omega_i$. That is to say, Eq. (5) is satisfied.
(b) All the BoT must be assigned to available time segments on processors: A crucial principle of the bag-of-tasks scheduling problem is that all the tasks should be allocated to suitable processors. According to the definition of the binary matrix $\Theta$, we obtain Eq. (6) when all the tasks are allocated.
(c) The entire workload of the divisible task must be assigned to suitable processors: $\alpha_i^j$ denotes the ratio of the fraction size assigned to the $j$-th available time segment of processor $P_i$ to the entire workload of the divisible task $\tau_0$. If the divisible task $\tau_0$ is assigned completely, Eq. (7) is obtained.
(d) The execution time of the tasks assigned to an available segment should not exceed the available time of that segment: Since each processor $P_j$ $(1 \le j \le N)$ has unavailable time, the processing time of the tasks assigned to its $k$-th segment should not be greater than the available time of that segment.
Let $\sigma_j^k$ denote the workload assigned to the $k$-th $(1 \le k \le m_j)$ available time segment of processor $P_j$ $(1 \le j \le N)$; then Eq. (8) should be satisfied.
Since both BoT tasks and a fraction $\alpha_j^k$ of the divisible task may be allocated to a segment, the workload $\sigma_j^k$ assigned to the $k$-th available time segment of processor $P_j$ can be calculated by Eq. (9).
Then, we can rewrite Eq. (8) as Eq. (10).
3) Mathematical Modeling: The task scheduling problem investigated in this paper is to schedule all $N_\tau + 1$ tasks onto the $N$ processors of the heterogeneous distributed system so as to minimize the makespan of the tasks. In Section II-B1 and Section II-B2, we gave the mathematical formulation of the objective function and the constraints, respectively. The mathematical optimization model with constraints is then presented in Eq. (11).
IAENG International Journal of Applied Mathematics, 50:1, IJAM_50_1_20 (Volume 50, Issue 1: March 2020)
In this optimization model, constraints (a)-(d) are described in Section II-B2, and constraint (e) gives the range of the parameters $i$, $j$, $k$. To solve this global optimization model, an algorithm referred to as the bag-of-tasks allocate-pull and divisible task allocation (BoTAPDTA) algorithm and a genetic algorithm with a local search strategy are proposed. The BoTAPDTA algorithm is described in Section III. The proposed genetic algorithm and the two-step scheduling algorithm (TSGA) are given in Section IV and Section V, respectively.
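Constraints (a)-(d) can be read as a feasibility test on a candidate schedule. The following Python sketch shows one way to express that test. The data layout (dictionaries keyed by task and processor index, zero-based segment indices) and the name `is_feasible` are assumptions made for illustration and are not part of the original model.

```python
import math

def is_feasible(sizes, size0, assign, alpha, omega, w, seg, tol=1e-9):
    """Check constraints (a)-(d) for one candidate schedule.

    sizes[i]      workload of BoT task i
    size0         workload of the divisible task
    assign[i]     (j, k): task i runs in segment k of processor j
    alpha[(j, k)] fraction of the divisible task placed in segment k of P_j
    omega[i]      processors eligible for task i
    w[j]          time per unit workload on P_j
    seg[j][k]     (c, d) bounds of segment k on P_j
    """
    # (b) every BoT task is assigned to exactly one segment.
    if set(assign) != set(sizes):
        return False
    # (a) each task runs on an eligible processor.
    for i, (j, _) in assign.items():
        if j not in omega[i]:
            return False
    # (c) the divisible task is assigned completely.
    if abs(sum(alpha.values()) - 1.0) > tol:
        return False
    # (d) the workload in a segment must fit into its available time.
    load = {}
    for i, key in assign.items():
        load[key] = load.get(key, 0.0) + sizes[i]
    for key, a in alpha.items():
        load[key] = load.get(key, 0.0) + a * size0
    for (j, k), sigma in load.items():
        c, d = seg[j][k]
        if sigma * w[j] > (d - c) + tol:
            return False
    return True

# Hypothetical two-processor instance.
sizes = {1: 4.0, 2: 2.0}
omega = {1: {1}, 2: {1, 2}}
w = {1: 1.0, 2: 2.0}
seg = {1: [(0, 5), (8, math.inf)], 2: [(0, 6), (9, math.inf)]}
assign = {1: (1, 0), 2: (2, 0)}
alpha = {(1, 1): 0.5, (2, 1): 0.5}
print(is_feasible(sizes, 10.0, assign, alpha, omega, w, seg))
```

A checker of this kind is useful for validating the output of any of the proposed algorithms, since all of them must respect the same constraint set.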

III. PROPOSED BOTAPDTA ALGORITHM
In our work, the scheduling problem of hybrid tasks comprising bag-of-tasks (BoT) and a divisible task is investigated, with unavailable time taken into account. Since the divisible task can be divided into arbitrary fractions and processed on any processor in the heterogeneous system, it offers great flexibility. We therefore first allocate the BoT to suitable processors, and then assign the divisible task to the time slots before $T_{BoT}$ and to the available time after $T_{BoT}$, as shown in the corresponding figure. Based on this analysis, the scheduling algorithm can be divided into two procedures: (1) allocating the BoT to suitable processors; (2) allocating the divisible task to suitable time slots and available time segments. To convey the overall structure, the framework of the scheduling algorithm is presented before its detailed steps. The bag-of-tasks allocate-pull and divisible task allocation (BoTAPDTA) algorithm is shown in Algorithm 1.
Step 1 sorts all the tasks $\tau_i$ $(i = 1, \cdots, N_\tau)$ in descending order of their workload $\tau_i^\sigma$. Steps 3 to 8 allocate the tasks to available time segments on processors. To minimize the makespan, steps 10 to 25 pull tasks into available time segments that precede the segment to which they are currently allocated. Steps 28 to 33 allocate the divisible task to the processors.

Algorithm 1: The algorithm framework of BoTAP
Input:
Output: a schedule scheme;
1 Put all the tasks into a task queue $TQ$, and sort them in descending order according to the workload $\tau_i^\sigma$ of tasks $\tau_i$ $(i = 1, \cdots, N_\tau)$;
2 Allocation:
3 while $TQ$ is not empty do
4     Take out the first task in $TQ$, and denote it as $\tau_{head}$;
5     Select a processor $P_{head}$ in $\Omega_{head}$ that minimizes the current finish time;
6     Select an available time segment $[c_{head}^s, d_{head}^s]$ on processor $P_{head}$ and assign task $\tau_{head}$ to it;
7     Update available time, $c_{head}^s = c_{head}^s + \tau_{head}^\sigma w_{head}$;
8 end
9 Pull:
10 for $i = 1$ to $N$ do
11     while $l > 1$ do
12         % $l$ is the last available time segment that has tasks allocated to it.
13         Put all the tasks in the $l$-th segment into a task queue $subTQ_l$, and sort them in descending order according to the workload;
14         while $subTQ_l$ is not empty do
15             Take out the first task and denote it as $\tau_{subhead}$, $flag = 0$, $k = 1$;
...
            Update available time, ...
28 if ... $\ge \tau_0^\sigma$ then
29     % $\phi_j^k$ and $\mu_j$ are the $k$-th time slot and the number of time slots before $T_{BoT}$ on processor $P_j$;
...
In the BoTAP algorithm, we first sort the tasks in descending order of workload. The reason for choosing descending order is as follows.
Case 1: As shown in Fig. 4, suppose two tasks $\tau_i$ and $\tau_j$ are both allocated to processor $P_k$, and their workloads satisfy $\tau_i^\sigma < \tau_j^\sigma$. In addition, each of $\tau_i$ and $\tau_j$ can be executed in the first available time segment on $P_k$ on its own, but the two cannot both be executed in the first segment. As shown in Fig. 4(b), if $\tau_i$ is executed before $\tau_j$, we must allocate $\tau_i$ to the first available time segment and $\tau_j$ to the second segment, so the makespan of the two tasks is determined by the completion of $\tau_j$ in the second segment. If the larger task $\tau_j$ is allocated first instead, only the smaller task $\tau_i$ runs in the second segment, and the makespan is shorter.
Case 2: As shown in Fig. 4(d) and Fig. 4(e), tasks $\tau_i$ and $\tau_j$ can both be executed in the first available time segment on processor $P_k$. Whether $\tau_i$ is executed before $\tau_j$, as shown in Fig. 4(d), or after it, as shown in Fig. 4(e), both tasks complete in the first segment; the two makespans are denoted $T_k^a$ and $T_k^b$, respectively.
So, we obtain $T_k^a = T_k^b$. From the above, we can see that the allocation order affects the makespan of the tasks: if the task with the larger workload is allocated first, the makespan is equal to or shorter than that obtained when the task with the smaller workload is allocated first.
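The two cases can be reproduced numerically with a small Python sketch. It uses a simplified first-fit placement on a single processor, which is an assumption made only to illustrate the effect of the ordering and is not the segment selection rule of the paper. The segment bounds and task sizes are hypothetical.

```python
import math

def finish_time(order, segments, w=1.0):
    """Place tasks (in the given order) into the first segment that can
    still hold them and return the latest completion time."""
    free = [c for c, _ in segments]           # next free instant per segment
    finish = 0.0
    for size in order:
        for k, (_, d) in enumerate(segments):
            if free[k] + size * w <= d:       # the task fits in segment k
                free[k] += size * w
                finish = max(finish, free[k])
                break
    return finish

# Case 1: each task fits in the first segment alone, but not both together.
case1 = [(0, 5), (8, math.inf)]
print(finish_time([3, 4], case1), finish_time([4, 3], case1))

# Case 2: both tasks fit in the first segment, so the order does not matter.
case2 = [(0, 10), (12, math.inf)]
print(finish_time([3, 4], case2), finish_time([4, 3], case2))
```

In the first instance, placing the larger task first leaves only the smaller task for the later segment, which shortens the finish time; in the second instance both orders give the same result.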

B. Processor Selection in Allocation
When the task queue $TQ$ is not empty, the first task is taken out and allocated to a processor. Since the objective is to minimize the makespan of the tasks, we must allocate the task to the processor that yields the minimum makespan. In our work, Eq. (12) and Eq. (13) are used to determine the processor to which task $\tau_i$ should be allocated.
Here, $P_{np}$ is a processor in the set $\Phi_i$ of processors to which task $\tau_i$ can be allocated, and $np_{proper}$ is the processor finally selected. Eq. (12) finds the set $\Phi_i$ within $\Omega_i$. The processors in $\Phi_i$ must satisfy two conditions: (1) the processor is in the set $\Omega_i$; (2) it has at least one available time segment that can complete task $\tau_i$ in time.

C. Segment Selection in Allocation
After a proper processor has been determined, we allocate task $\tau_i$ to the most suitable available time segment on processor $P_{np}$. A good strategy for choosing the segment helps to minimize the makespan of the tasks. We use Eq. (14) and Eq. (15) to determine the segment to which task $\tau_i$ is allocated. Eq. (14) finds the set $NS$ of available time segments on processor $P_{np}$ that can complete task $\tau_i$ in time. To reduce time debris, that is, leftover time that cannot complete any task, Eq. (15) selects the shortest segment in $NS$; in other words, $ns_{proper}$ is the $\arg\min$ of the segment length over $NS$. Consider the example in Fig. 5: two tasks $\tau_1$ and $\tau_2$ are allocated to processor $P_k$, with $\tau_1^\sigma > \tau_2^\sigma$. $\tau_1$ can be completed in segments 1 and 2, and $\tau_2$ can be completed in segments 2 and 3. Because the tasks in the task queue $TQ$ are sorted in descending order of workload, $\tau_1$ is allocated before $\tau_2$. For $\tau_1$, since $d_k^1 - c_k^1 < d_k^2 - c_k^2$, $\tau_1$ is allocated to segment 1 according to Eq. (15), as shown in Fig. 5(b). Although segment 2 can complete $\tau_2$ in time and precedes segment 3, Fig. 5(c) shows that $\tau_2$ is allocated to segment 3. This strategy helps to decrease time debris and increase the utilization of the available time segments.
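The shortest-fitting-segment rule described above can be sketched in Python as follows. The function names and the concrete segment bounds are hypothetical; the instance is chosen so that it mirrors the Fig. 5 example (the larger task fits only in segments 1 and 2, the smaller one only in segments 2 and 3).

```python
def select_segment(size, w, segments):
    """Return the index of the shortest available segment that can still
    complete a task of the given size, or None if no segment fits."""
    need = size * w                           # processing time on this processor
    fitting = [k for k, (c, d) in enumerate(segments) if d - c >= need]
    if not fitting:
        return None
    return min(fitting, key=lambda k: segments[k][1] - segments[k][0])

def allocate(size, w, segments):
    """Place the task and consume time from the chosen segment
    (c <- c + size * w), as in the allocation step."""
    k = select_segment(size, w, segments)
    if k is not None:
        c, d = segments[k]
        segments[k] = (c + size * w, d)
    return k

# Hypothetical segments of one processor (zero-based indices).
segs = [(0, 6), (10, 20), (25, 29)]
first = allocate(5, 1.0, segs)    # larger task
second = allocate(3, 1.0, segs)   # smaller task
print(first, second, segs)
```

The smaller task ends up in the last segment even though an earlier one could hold it, which is exactly the situation that the pull strategy of the next subsection is designed to repair.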

D. The Strategy of Pull
After the allocation phase, all the tasks are allocated to available time segments on processors. However, because of the segment selection strategy described in Section III-C, some earlier available time segments may remain unused. As shown in Fig. 6(a), $\tau_i$ is allocated to the $(j+1)$-th available time segment of processor $P_k$, so the finish time of $P_k$ is $T_k = c_k^{j+1} + \tau_i^\sigma w_k$. Since the $j$-th available time segment can complete task $\tau_i$ in time, we can pull $\tau_i$ from the $(j+1)$-th segment to the $j$-th segment, as shown in Fig. 6(b). The finish time of $P_k$ then becomes $T'_k = c_k^j + \tau_i^\sigma w_k$, and $T'_k < T_k$ follows directly because $c_k^j < c_k^{j+1}$. So, pulling tasks to an earlier available time segment can decrease the finish time of processors. To decrease the finish time as much as possible, two problems must be solved: (1) which task should be pulled to the objective segment? (2) which segment should be selected as the objective segment? These two issues are tackled in Section III-D1 and Section III-D2.
1) Selection of Objective Task: In this paper, the objective of the scheduling algorithm is to minimize the makespan of the tasks. The makespan is determined by the finish times of all processors in the heterogeneous computing system. From Eq. (3), the finish time $T_i$ $(1 \le i \le N)$ depends on the completion time of the tasks in the last available time segment $l$ that has tasks allocated on processor $P_i$. Suppose $SubQ_l$ is the set of tasks allocated to the last available time segment $l$. Eq. (16) is used to determine the task that should be pulled to another available time segment.
2) Selection of Objective Segment: To decrease the finish time of processors, the pull strategy moves a task to another segment, and two problems must be solved. The selection of the objective task has been tackled in Section III-D1; this section addresses the selection of the objective segment. To guarantee that task $\tau_{nt}$ is completed in time, the segment $ns_{pro}$ is selected according to Eq. (17).
The pull strategy decreases the finish time as much as possible. First, the task $\tau_{nt_{pro}}$ with the largest workload is selected according to Eq. (16), and an available time segment $ns_{pro}$ is selected according to Eq. (17). If $ns_{pro} = \emptyset$, let $SubQ_l = SubQ_l \setminus \{\tau_{nt_{pro}}\}$, and another task $\tau_{nt_{pro}}$ is selected according to Eq. (16). If $ns_{pro} \ne \emptyset$, we pull the task $\tau_{nt_{pro}}$ from segment $l$ to segment $ns_{pro}$.
An example is presented in Fig. 7. Tasks $\tau_a$ and $\tau_b$ are allocated to the $l$-th available time segment, and $\tau_a^\sigma > \tau_b^\sigma$. First, $\tau_a$ and $\tau_b$ are put into $SubQ_l$. According to Eq. (16), task $\tau_a$ is selected as the objective task. The $i$-th segment is selected as the objective segment according to Eq. (17). Then, we pull $\tau_a$ to the $i$-th segment and update $c_k^i = c_k^i + \tau_a^\sigma w_k$. Since $ns_{pro} \ne \emptyset$, we next select task $\tau_b$ and the $j$-th segment as the objective task and objective segment, respectively. Then, task $\tau_b$ is pulled to the $j$-th segment and $c_k^j$ is updated to $c_k^j + \tau_b^\sigma w_k$. Let $l = l - 1$; a new round of pulling is conducted, until no objective segment can be found.
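The pull procedure for a single processor can be sketched in Python as below. Because Eq. (17) is not reproduced in the text, the sketch assumes that the objective segment is the earliest preceding segment with enough spare time; this rule, the function name `pull`, and the data layout are illustrative assumptions.

```python
import math

def pull(segments, placed, w):
    """Move tasks from the last occupied segment into earlier segments.

    segments  list of (c, d) available segments of one processor
    placed    placed[k] is the list of task sizes in segment k
              (at least one task must be placed)
    Returns the finish time of the processor after pulling."""
    def spare(k):
        c, d = segments[k]
        return (d - c) - w * sum(placed[k])

    last = max(k for k in range(len(segments)) if placed[k])
    while last > 0:
        for size in sorted(placed[last], reverse=True):   # largest task first
            # Assumed rule: earliest preceding segment that can hold the task.
            target = next((k for k in range(last) if spare(k) >= w * size), None)
            if target is not None:
                placed[last].remove(size)
                placed[target].append(size)
        if placed[last]:          # segment still occupied: it fixes the finish time
            break
        last = max(k for k in range(last) if placed[k])
    return segments[last][0] + w * sum(placed[last])

# Hypothetical processor: two tasks sit in the third segment after allocation.
segments = [(0, 4), (6, 9), (12, math.inf)]
placed = [[3], [], [2, 1]]
print(pull(segments, placed, 1.0), placed)
```

In this instance the finish time drops from 15 (both tasks in the third segment) to 8, because both tasks can be pulled into spare time of the first two segments.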

E. Allocation Divisible Task
After allocating the BoT to the processors, we allocate the divisible task to the time slots and available time segments so that the makespan of the hybrid tasks is minimized. As Algorithm 1 shows, the scheduling of the divisible task falls into two situations. If Eq. (18) is satisfied, the divisible task is called a large divisible task; otherwise, it is called a small divisible task. The small or large divisible task is then scheduled according to the corresponding strategy, described in Section III-E1 and Section III-E2.
Here, $\phi_j^k$ and $\mu_j$ are the $k$-th time slot and the number of time slots before $T_{BoT}$ on processor $P_j$.
1) Small Divisible Task Scheduling: If the divisible task is regarded as a small task, the divisible task $\tau_0$ can be completed in the time slots before $T_{BoT}$. So, we can allocate the divisible task to these time slots, as shown in Fig. 8(a). In this case, two problems must be solved: (1) to which processors should the divisible task be allocated? (2) to which time slots on those processors should it be allocated? We propose the following method. First, the values $w_j \left(\sum_{k=1}^{\mu_j} \phi_j^k\right)$ $(j = 1, 2, \cdots, N)$ are sorted in descending order and put into an array $\Gamma$. Then, the processors are determined by Eq. (19).
If $\sum_{j=1}^{np} \Gamma_j = \tau_0^\sigma$, we assign the divisible task to the time slots on the processors before $\Gamma_{np}$. Otherwise, we assign the divisible task to the time slots on the processors before $\Gamma_{np-1}$, and the remaining workload $(\tau_0^\sigma - \sum_{j=1}^{np-1} \Gamma_j)$ is allocated to processor $\Gamma_{np-1}$. In this case, the makespan of the hybrid tasks is $T_{BoT}$.
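A minimal Python sketch of this greedy assignment is given below. Since Eq. (18) and Eq. (19) are not reproduced in the text, the sketch makes one explicit assumption: the workload a processor can absorb before $T_{BoT}$ is taken as its total slot length divided by $w_j$ (time per unit workload), by analogy with the fraction $(mean\_time - s_i)/w_i$ used later for population initialization. The function name and data layout are hypothetical.

```python
def schedule_small_divisible(size0, slots, w):
    """Greedy assignment of a small divisible task to idle slots before T_BoT.

    slots[j]  list of idle slot lengths on processor j (time units)
    w[j]      time per unit workload on processor j
    Returns {j: workload given to processor j}, or None when the slots
    cannot hold the whole task (the 'large' case)."""
    # Assumption: capacity in workload units = slot time / w_j.
    capacity = {j: sum(slots[j]) / w[j] for j in slots}
    if sum(capacity.values()) < size0:
        return None
    plan, remaining = {}, size0
    for j in sorted(capacity, key=capacity.get, reverse=True):
        if remaining <= 0:
            break
        share = min(capacity[j], remaining)   # last processor gets the remainder
        plan[j] = share
        remaining -= share
    return plan

# Hypothetical idle slots on three processors.
slots = {1: [2, 2], 2: [6], 3: [1]}
w = {1: 1.0, 2: 2.0, 3: 1.0}
print(schedule_small_divisible(6, slots, w))
print(schedule_small_divisible(9, slots, w))
```

Filling the processors with the largest capacity first keeps the number of participating processors small, and the makespan stays at $T_{BoT}$ because only idle time before $T_{BoT}$ is used.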
2) Large Divisible Task Scheduling: If the divisible task is regarded as a large task, the divisible task $\tau_0$ cannot be completed in the time slots before $T_{BoT}$. In this case, we first fill all the time slots before $T_{BoT}$. Then, the remaining workload is allocated to the available time after $T_{BoT}$, as shown in Fig. 8(b). An essential condition used in related work on Divisible Load Theory (DLT) to derive the optimal solution is as follows: to obtain the optimal processing time, it is necessary and sufficient that all the processors participating in the computation finish computing at the same time. This condition yields Eq. (20).
The makespan of the hybrid tasks can be calculated easily through Eq. (20). $s_i$ denotes the unavailable time of processor $P_i$ between $T_{BoT}$ and $T_{DT}$; since $s_i$ depends on $T_{DT}$, it cannot be determined before $T_{DT}$ is known. To search for $T_{DT}$, a binary search algorithm is proposed, and its pseudocode is shown in Algorithm 2.
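The idea behind the binary search can be sketched in Python. Algorithm 2 itself is not reproduced in the text, so the following is an illustration under stated assumptions: every processor works from $T_{BoT}$ until a common finish time $T_{DT}$, its usable time is $T_{DT} - T_{BoT} - s_i$, and the sketch looks for the smallest $T_{DT}$ whose total capacity covers the remaining workload. All function and parameter names are hypothetical.

```python
def unavailable_time(intervals, t0, t1):
    """Total length of the unavailable intervals that falls inside [t0, t1]."""
    return sum(max(0.0, min(b, t1) - max(a, t0)) for a, b in intervals)

def capacity(t, t_bot, unavailable, w):
    """Workload that all processors together can process between t_bot and t."""
    return sum((t - t_bot - unavailable_time(unavailable[i], t_bot, t)) / w[i]
               for i in w)

def find_t_dt(remaining, t_bot, unavailable, w, eps=1e-6):
    """Bisection for the smallest T_DT whose capacity covers the remaining load.
    Assumes the capacity keeps growing with t (finite unavailable intervals)."""
    lo, hi = t_bot, t_bot + 1.0
    while capacity(hi, t_bot, unavailable, w) < remaining:   # grow the bracket
        hi = t_bot + 2.0 * (hi - t_bot)
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        if capacity(mid, t_bot, unavailable, w) >= remaining:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical instance: P_1 is unavailable during [12, 14].
w = {1: 1.0, 2: 2.0}
unavailable = {1: [(12, 14)], 2: []}
t_dt = find_t_dt(6.0, 10.0, unavailable, w)
print(round(t_dt, 3))
```

Bisection applies here because the capacity is a non-decreasing function of the candidate finish time, even though $s_i$ changes with it.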

IV. GA FOR HYBRID TASKS SCHEDULING
Task scheduling is an NP-hard problem and is among the hardest well-known combinatorial optimization problems. The genetic algorithm (GA) is employed to solve the task scheduling optimization model proposed in Section II-B. GA is an efficient technique for many real-world application problems, such as control and decision, image processing, and machine learning [32], [33].

A. Encoding and Population Initialization
Generally speaking, a suitable encoding scheme, which encodes the solutions in the problem domain into chromosomes, is highly significant. A good encoding scheme makes the search easier by limiting the search space and helps the algorithm converge to the global optimal solution rapidly. Based on the characteristics of this optimization model for the hybrid task scheduling problem, an integer encoding scheme is adopted for the bag-of-tasks, stored as an array.
Algorithm 3 (excerpt):
...
Calculate the unavailable time $s_i$ $(i = 1, 2, \cdots, N)$ between $T_{BoT}$ and $T_{BoT} + mean\_time$ for every $P_i$;
6 The fraction $\beta_i = (mean\_time - s_i)/w_i$ that can be allocated to $P_i$ is calculated;
...
We can obtain the initial populations $Pop_{BoT}$ and $Pop_{DT}$ of the genetic algorithm according to Algorithm 3.
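The integer encoding for the bag-of-tasks can be illustrated with a Python sketch. It assumes the $2 \times N_\tau$ layout used for the chromosome $C$ in the operators of this section, with one row holding the processor index $np$ and the other the segment index $ns$. The random initialization shown here ignores the capacity constraint (d) and is not the authors' Algorithm 3; the function name and the data layout are hypothetical.

```python
import random

def random_chromosome(omega, m, rng=random):
    """Build a 2 x N_tau integer chromosome: row 0 holds the processor
    index np of each task, row 1 the segment index ns on that processor."""
    procs, segs = [], []
    for i in sorted(omega):                     # tasks tau_1 ... tau_N_tau
        np_ = rng.choice(sorted(omega[i]))      # only eligible processors
        procs.append(np_)
        segs.append(rng.randint(1, m[np_]))     # segment index in 1..m_np
    return [procs, segs]

# Hypothetical instance: three tasks, three processors.
omega = {1: {1, 2}, 2: {2}, 3: {1, 3}}
m = {1: 5, 2: 7, 3: 6}
print(random_chromosome(omega, m, random.Random(0)))
```

Restricting the first row to $\Omega_i$ means constraint (a) holds by construction, so the genetic operators only need to repair violations of the capacity constraint.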

B. Crossover Operator
To increase the diversity of the individuals in the population, a crossover operator for the divisible task (DT) coding is presented in Algorithm 4. To decrease the makespan of the tasks, the divisible load should be assigned to the earlier segments of the processors. So, two weight coefficients, which decrease gradually, are used in steps 11 and 12.

C. Mutation Operator
Mutation is a genetic operator that changes some gene values of a parent individual to a new state. A good mutation operator can produce entirely novel offspring and improve the diversity of the population. With these new individuals, the genetic algorithm may reach a better solution than was previously possible. Mutation is an essential operator in the genetic algorithm, and it helps the population escape local optima. Suppose that the chromosome is $C = (c_{ij})_{2 \times N_\tau}$, where $np$ is the index of the processor to which a task is allocated and $ns$ is the available time segment on processor $P_{np}$. Eq. (21) is used to search for the best processor $P_{np_{best}}$ and available time segment $ns_{better}$ on $P_{np_{best}}$ for task $\tau_i$, so that the maximum finish time of the processors in $\Omega_i$ is minimized.
Algorithm (excerpt):
...
An individual $C_2$ is selected in the neighborhood $neiber(C_1)$ of individual $C_1$;
3 else
4     An individual $C_2$ is selected from the population outside $neiber(C_1)$;
5 end
6 $O_1 = C_1$, $O_2 = C_2$;
7 for $i = 1$ to $N$ do
8     $m_i$ numbers are generated randomly and put into $W$ in descending order.
...

D. Local Search Operator
Local search is an important operator in the genetic algorithm, and it helps the search escape local optima. In this paper, a local search operator is designed that accelerates convergence and enhances the searching ability of the proposed algorithm. If the local search operator is applied to the chromosomes $C = (c_{ij})_{2 \times N_\tau}$ and $C^D = (c^D_{ij})_{N \times N_g}$, the offspring $C' = (c'_{ij})_{2 \times N_\tau}$ and $C^{D'} = (c^{D'}_{ij})_{N \times N_g}$ are obtained as shown in Algorithm 7.
Here, $np$ is the index of the processor to which the task is allocated. Eq. (22) is used to search for a better available time segment $ns_{better}$ on processor $P_{np}$.

E. Modify Operator
To accelerate the convergence of the genetic algorithm and minimize the makespan of the tasks, a modify operator is designed. The pseudocode of the modify operator is shown in Algorithm 8. Chromosomes $C = (c_{ij})_{2 \times N_\tau}$ and $C^D = (c^D_{ij})_{N \times N_g}$ are modified into $C' = (c'_{ij})_{2 \times N_\tau}$ and $C^{D'} = (c^{D'}_{ij})_{N \times N_g}$ by the modify operator.
Algorithm 7: Local search operator (excerpt)
Two integers $j_1$, $j_2$ and a random number $0 < R < 1$ are generated randomly, with $1 \le j_1 < j_2 \le m_i$;
...

V. TSGA FOR HYBRID TASKS SCHEDULING
To solve the optimization model, a genetic algorithm is designed in Section IV. In GA, the allocation schemes of the BoT and of the divisible task are determined together, so the search space is large. We therefore propose a two-step scheduling algorithm to minimize the makespan of the hybrid tasks. We first schedule the bag-of-tasks using the GA proposed in Section IV, and then schedule the divisible task as in BoTAPDTA.
Algorithm (excerpt):
... is the set of tasks allocated to the $j$-th available time segment of processor $P_i$, with the tasks sorted in descending order of workload;
...

A. Parameters Value
1) Tasks parameters: In this paper, we investigate the hybrid task scheduling problem in a heterogeneous distributed system. In the experiments, the workload of each BoT task ranges from 1000 to 7000. In the simulation system, the number of tasks $N_\tau$ varies between 50 and 500. In addition, the processor set $\Omega_i$ $(1 \le i \le N_\tau)$, which can process task $\tau_i$, is generated as follows: $n_r$ is a random number in $(0, 1]$ and $n_{pro} = \lfloor n_r N \rfloor$; then $n_{pro}$ processors are selected from $P$ and put into $\Omega_i$. In our experiments, the workload of the divisible task is generated randomly in [ .
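The generation of the eligible processor sets $\Omega_i$ can be sketched in Python as follows. The sketch adds one assumption that the text does not state: when $\lfloor n_r N \rfloor$ evaluates to 0, at least one processor is kept so that every task remains schedulable. The function name is hypothetical.

```python
import math
import random

def eligible_processors(n, rng=random):
    """Draw the set Omega_i for one task in a system with n slave processors."""
    n_r = 1.0 - rng.random()                 # random() is in [0, 1), so n_r is in (0, 1]
    n_pro = max(1, math.floor(n_r * n))      # assumption: never an empty set
    return set(rng.sample(range(1, n + 1), n_pro))

# Example draw for a 20-processor system, seeded for repeatability.
print(sorted(eligible_processors(20, random.Random(1))))
```

On average a task is eligible for about half of the processors under this scheme, so the eligibility constraint meaningfully restricts the allocation.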

2) System parameters:
In this section, we give the parameters of the simulation system. The simulation system has 20 heterogeneous slave processors, and the time consumed per unit workload $w^s_i$ $(1 \le i \le N_s)$ on $P_i$ $(1 \le i \le N_s)$ follows Shang [34], where $N_s$ denotes the number of processors in the simulation system. The number of available time segments $m_i$ on processor $P_i$ is generated randomly in $[5, 12]$. The length . If $k = m_i$, $d^k_i$ can equal $+\infty$; that is, processor $P_i$ can process tasks indefinitely in its $m_i$-th available time segment.

3) Genetic algorithm parameters:
In this paper, GA denotes the genetic algorithm without the local search and modify operators, and GALM denotes the genetic algorithm with both operators. Similarly, GAL and GAM denote the genetic algorithm with the local search operator and with the modify operator proposed in section IV, respectively. TSGA denotes the algorithm proposed in section V without either operator, and TSGAL, TSGAM and TSGALM denote the two step scheduling algorithm with the local search operator, the modify operator, or both. For GA, GAL, GAM, GALM, TSGA, TSGAL, TSGAM and TSGALM, the following parameters are chosen: population size $Pop_{size} = 100$, crossover probability $p_c = 0.8$, mutation probability $p_m = 0.05$, elitist number $E = 5$ and maximum iterations $G_{max} = N_\tau$. Since the concept of the neighborhood of an individual is employed, the neighbor size is $T = 10$ in our experiments.
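The parameter values and algorithm variants listed above can be collected in one configuration object, as in the sketch below. The class names `GAParams` and `Variant` and the dictionary `VARIANTS` are hypothetical. The mapping of the suffixes L and M to the local search and modify operators follows the naming described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GAParams:
    num_tasks: int                 # N_tau
    pop_size: int = 100            # population size
    crossover_prob: float = 0.8    # p_c
    mutation_prob: float = 0.05    # p_m
    elite_count: int = 5           # E
    neighbor_size: int = 10        # T

    @property
    def max_iterations(self) -> int:
        """G_max = N_tau, as stated for the makespan experiments."""
        return self.num_tasks


@dataclass(frozen=True)
class Variant:
    two_step: bool       # True for the TSGA family (section V)
    local_search: bool   # suffix L
    modify: bool         # suffix M


VARIANTS = {
    "GA": Variant(False, False, False),
    "GAL": Variant(False, True, False),
    "GAM": Variant(False, False, True),
    "GALM": Variant(False, True, True),
    "TSGA": Variant(True, False, False),
    "TSGAL": Variant(True, True, False),
    "TSGAM": Variant(True, False, True),
    "TSGALM": Variant(True, True, True),
}

if __name__ == "__main__":
    params = GAParams(num_tasks=50)
    print(params, params.max_iterations)
```

Keeping the operators as flags means all eight variants can share one parameter set, so differences in the results are attributable to the operators rather than to tuning.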

B. Simulation Results
Since no algorithm is available in the literature for scheduling hybrid tasks in heterogeneous systems with unavailable time segment constraints, we evaluate the effectiveness of the proposed scheduling algorithms through a performance study in the simulation system. First, we evaluate the nine algorithms (BoTAPDTA, GA, GAL, GAM, GALM, TSGA, TSGAL, TSGAM and TSGALM) on makespan for various task numbers ($N_\tau$) and workload sizes; the resulting makespans are shown in Fig.9 and Fig.10.
To evaluate the convergence of the eight algorithms (GA, GAL, GAM, GALM, TSGA, TSGAL, TSGAM and TSGALM), the convergence performance is shown in Fig.11 and Fig.12. In this experiment, a specific number of tasks is selected, $N_\tau = 50$, and the maximum number of iterations is $G_{max} = 1000$ in every group of experiments.
Furthermore, we evaluate the robustness of the eight algorithms (GA, GAL, GAM, GALM, TSGA, TSGAL, TSGAM and TSGALM). In these experiments, every algorithm is executed 30 times independently. The workloads of the bag-of-tasks are uniformly distributed in $[1000, 5000]$, and the number of tasks ranges from 50 to 500. The statistical results are shown in Fig.13 to Fig.16.
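The robustness protocol above (30 independent runs per algorithm) can be sketched as a small harness. The function `summarize_runs` is hypothetical, and `placeholder_solver` is only a stand-in that returns a random number in place of a real scheduler's makespan. The choice of summary statistics is also an assumption, since the text does not list which statistics the figures report.

```python
import random
import statistics
from typing import Callable, Dict


def summarize_runs(run_once: Callable[[random.Random], float],
                   runs: int = 30, base_seed: int = 0) -> Dict[str, float]:
    """Execute run_once with independent seeds and summarize the makespans."""
    results = [run_once(random.Random(base_seed + k)) for k in range(runs)]
    return {
        "best": min(results),
        "worst": max(results),
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),   # sample standard deviation
    }


def placeholder_solver(rng: random.Random) -> float:
    """Stand-in for one scheduler run; NOT a real algorithm or real data."""
    return rng.uniform(0.0, 1.0)


if __name__ == "__main__":
    print(summarize_runs(placeholder_solver))
```

Seeding each run separately keeps the runs independent while letting the whole experiment be repeated exactly.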

C. Experimental Analysis
The makespans obtained by the algorithms are shown in Fig.9 and Fig.10. The proposed algorithms can obtain a better scheduling strategy because they use all the information about the tasks and the state of the processors. Since the local search operator and the modify operator are tailor-made, both of them help to increase the diversity of solutions and to find a local optimal solution in the search space. So, TSGAL and GAM can converge to a better solution than TSGA. That is to say, the makespans obtained by TSGAL and TSGAM are both smaller than that obtained by GA. TSGALM is the algorithm in which both the local search operator and the modify operator are added to TSGA. So, the makespan obtained by GALM is the smallest. However, TSGAL is hard to distinguish from TSGAM, because the local search operator and the modify operator both search for a local optimal solution by changing the scheduling scheme of a task. As shown in Fig.9 and Fig.10, the makespan obtained by TSGALM is the smallest, and the makespan obtained by TSGA is the largest among the eight algorithms. The makespan obtained by TSGAL is smaller than that obtained by GAM in some cases, but the opposite holds in other cases.
In addition, the convergence of the proposed eight algorithms is investigated in the simulation system. In this paper, we design a local search operator and a modify operator, and the two resulting optimization algorithms are referred to as TSGAL and GAM. The local search operator and the modify operator help to increase the diversity of the solutions and to find a local optimal solution in the search space. On the one hand, the local search operator can generate a better offspring than its parent individual by changing the value of a gene. On the other hand, the modify operator can decrease the processing finish time of a processor as much as possible. For a specific generation, the offspring obtained by the local search operator and the modify operator will have better fitness than their parents. So, TSGAL and TSGAM can converge to a global optimal solution quickly. That is to say, TSGAL and TSGAM have a higher convergence speed than GA. However, we cannot say which of GAL and GAM is better. As the experimental results show, TSGAL is better than TSGAM in some cases, and TSGAM is better than TSGAL in other cases, because the local search operator and the modify operator both search for a local optimal solution by changing the scheduling scheme of a task. GALM is an algorithm that comprises TSGA, the local search operator and the modify operator. So, it can converge to a global optimal solution as fast as possible. As shown in Fig.11 and Fig.12, TSGALM has the highest convergence speed among the four algorithms (TSGA, TSGAL, TSGAM and TSGALM), and the convergence speed of TSGA is the lowest.
Furthermore, the robustness of the four algorithms is investigated in the simulation system. Fig.13 to Fig.16 give the robustness of TSGA, TSGAL, TSGAM and TSGALM in the simulation system for various task numbers. From the figures, we can see that the four algorithms are highly robust for various task numbers. As the numbers of tasks and processors increase, many more local optimal solutions exist. So, the robustness of the four algorithms is much higher in the simulation system for the same task number.

VII. CONCLUSION
In this paper, we investigate the scheduling of hybrid tasks, comprising both bag-of-tasks (BoT) and divisible tasks, in heterogeneous distributed systems whose processors have unavailable time. To minimize the makespan of the tasks, a mathematical optimization model with the unavailable time constraint is established. Hybrid scheduling algorithms with the local search operator, the modify operator, or both are designed. The makespans obtained by the proposed algorithms are evaluated for various numbers of tasks and workloads. In addition, the convergence and robustness of the algorithms TSGA, TSGAL, TSGAM and TSGALM are evaluated in the simulation system. Experimental results show that the proposed algorithms are effective.