Journal of Xidian University, 2025, Vol. 52, Issue (1): 181-195. doi: 10.19665/j.issn1001-2400.20241005

• Computer Science and Technology & Cyberspace Security •

Multi-workflow fault-tolerant scheduling strategy for WaaS platforms

ZHI Wentao1,2, ZHAO Hui1,2,3, MENG Fanxin1, WANG Jing1, WAN Bo1,2,3, WANG Quan1,3

  1. School of Computer Science and Technology, Xidian University, Xi’an 710071, China
  2. Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, China
  3. Shaanxi Province Key Laboratory of Smart Human-Computer Interaction and Wearable Technology, Xi’an 710071, China
  • Received: 2024-01-06 Online: 2024-10-24 Published: 2024-10-24
  • Contact: WANG Jing E-mail:22031212306@stu.xidian.edu.cn;hzhao@mail.xidian.edu.cn;20031211543@stu.xidian.edu.cn;wangjing@mail.xidian.edu.cn;wanbo@xidian.edu.cn;qwang@xidian.edu.cn

Abstract:

As scientific computation grows more complex, workflows have become an essential model for automating it. Workflow as a Service (WaaS) platforms rent virtual machines from Infrastructure as a Service (IaaS) providers in order to run scientific workflows for users. However, existing research on workflow scheduling in WaaS platforms considers neither the task failures that virtual machine downtime can cause nor the delays in virtual machine provisioning. To address this issue, this paper proposes a multi-workflow fault-tolerant scheduling strategy for WaaS platforms. First, considering that WaaS platforms do not schedule hardware resources but operate at the level of virtual machines and containers, we establish a workflow scheduling model suitable for WaaS platforms that takes into account the impact of virtual machine provisioning delays on scheduling. Second, we propose a multi-workflow fault-tolerant scheduling strategy consisting of four parts: preprocessing, fault-tolerance selection, task scheduling, and resource adjustment. Specifically, we design an improved deadline division algorithm to determine the scheduling order; a fault-tolerance selection algorithm that combines replication and resubmission; a virtual machine selection and task allocation method that considers task attributes and virtual machine provisioning delays; and a resource adjustment algorithm that deploys resources in advance for upcoming tasks, so that they do not have to wait for virtual machines or containers to be provisioned. Finally, we compare the proposed strategy with other algorithms under different virtual machine downtime probabilities, workloads, and deadlines, and the results demonstrate its effectiveness for WaaS platforms.
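To make the idea of combining replication and resubmission concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a simple rule: if a task could fail once, wait for a new virtual machine to be provisioned, and still rerun before its sub-deadline, resubmission is chosen; otherwise the task is replicated up front. The names `Task`, `Strategy`, `choose_strategy`, and all fields and parameters are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    RESUBMISSION = "resubmission"  # rerun the task only after a failure
    REPLICATION = "replication"    # run a backup copy in parallel


@dataclass(frozen=True)
class Task:
    # All fields are illustrative assumptions, not the paper's model.
    name: str
    runtime: float       # estimated execution time, in seconds
    start: float         # planned start time, in seconds
    sub_deadline: float  # deadline assigned to this task, in seconds


def choose_strategy(task: Task, provisioning_delay: float) -> Strategy:
    """Pick a fault-tolerance method for one task (assumed heuristic)."""
    first_finish = task.start + task.runtime
    # Worst case for resubmission: the failure is noticed when the first
    # attempt would have finished, and a new VM must be provisioned
    # before the task can run again.
    retry_finish = first_finish + provisioning_delay + task.runtime
    if retry_finish <= task.sub_deadline:
        return Strategy.RESUBMISSION
    return Strategy.REPLICATION


if __name__ == "__main__":
    loose = Task("loose", runtime=10.0, start=0.0, sub_deadline=100.0)
    tight = Task("tight", runtime=10.0, start=0.0, sub_deadline=40.0)
    print(choose_strategy(loose, provisioning_delay=30.0).value)
    print(choose_strategy(tight, provisioning_delay=30.0).value)
```

Under this assumed rule, a longer provisioning delay pushes more tasks toward replication, which is one way the provisioning delay mentioned in the abstract could influence the choice of fault-tolerance method.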

Key words: multi-workflow, fault tolerance scheduling algorithm, WaaS platforms, resource provisioning delay

CLC Number: 

  • TP301.6