Serverless computing has transformed cloud-based and event-driven applications through the Function-as-a-Service (FaaS) paradigm, which abstracts away infrastructure, simplifies administration, enables flexible pay-as-you-go billing, and provides automatic scaling and resource optimization. However, the dynamic nature of serverless workloads poses significant challenges for resource provisioning, chiefly because workload demands are unpredictable. Key issues include managing containers through automatic resource scaling, scheduling requests, and determining the idle container window. Addressing these challenges in a distributed environment with limited resources and unpredictable workloads is complex and calls for intelligent solutions and optimization strategies. Recent research has shown that machine-learning-based approaches to automatic resource allocation in dynamic environments often outperform traditional methods.

Motivated by this, in this thesis we propose an effective and efficient container management mechanism for dynamic resource allocation in serverless computing environments. Its primary objective is to achieve high Quality of Service (QoS) for users while optimizing resource utilization for providers. The proposed mechanism comprises three components: (1) a Gated Recurrent Unit (GRU) deep learning model that analyzes historical function invocation patterns to accurately predict workload demands; based on these predictions and a service-quality-aware strategy driven by metrics such as CPU utilization and the number of rejected requests, the mechanism dynamically adjusts the number of container instances. (2) A warmest-instance heuristic that schedules requests among active instances, prioritizing the most recently used instances that are still warm, thereby improving resource utilization and reducing latency. (3) A method for determining the optimal idle container window, that is, how long to keep a container warm after it finishes processing requests; by selecting an appropriate retention duration, the system balances the trade-off between minimizing cold starts and conserving computational resources.

We evaluate our approach on two real-world datasets provided by Microsoft Azure Functions, comparing the proposed mechanism against several fixed instance configurations, the default Kubernetes Horizontal Pod Autoscaler (HPA), and the Prediction-Based Autoscaler (PBA) algorithm. The experimental results show that the proposed approach significantly outperforms the baselines: it reduces the number of cold starts, increases CPU utilization, lowers memory usage costs, minimizes rejected requests, reduces energy consumption, and improves response times. These findings underscore the effectiveness of the proposed approach in managing dynamic workloads and improving the efficiency and responsiveness of serverless computing environments, and demonstrate its potential for deployment in real-world dynamic settings.
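To make component (1) concrete, the following is a minimal illustrative sketch of a GRU forecaster for per-minute invocation counts, coupled with a simple QoS-aware scaling rule. The window size, layer width, per-instance capacity, and thresholds are assumptions for illustration, not the tuned configuration used in the thesis.

```python
# Sketch only: GRU demand forecaster plus a capacity-based scale decision.
# All numeric parameters below are hypothetical.
import numpy as np
import tensorflow as tf

WINDOW = 60  # minutes of invocation history per prediction (assumed)

def build_gru_forecaster() -> tf.keras.Model:
    """A small GRU that maps a window of past invocation counts to the
    predicted count for the next interval."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(WINDOW, 1)),
        tf.keras.layers.GRU(64),   # single GRU layer; width is assumed
        tf.keras.layers.Dense(1),  # next-interval invocation count
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def desired_instances(predicted_demand: float,
                      capacity_per_instance: float = 50.0) -> int:
    """Translate the forecast into a container count; the per-container
    capacity figure is an assumption."""
    return max(1, int(np.ceil(predicted_demand / capacity_per_instance)))

def adjust_for_qos(base: int, cpu_util: float, rejected: int,
                   cpu_high: float = 0.8) -> int:
    """QoS guardrail: add headroom when CPU is saturated or requests were
    rejected. Thresholds are illustrative."""
    return base + 1 if (cpu_util > cpu_high or rejected > 0) else base
```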
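Component (2), the warmest-instance heuristic, can be sketched as follows. The class, the per-container concurrency limit, and the capacity model are hypothetical; returning `None` stands in for triggering a cold start of a new container.

```python
# Sketch only: route each request to the most recently used warm container
# that still has spare capacity ("warmest instance first").
import time
from dataclasses import dataclass

@dataclass
class Instance:
    instance_id: str
    last_used: float = 0.0     # timestamp of the last dispatched request
    in_flight: int = 0
    max_concurrency: int = 10  # assumed per-container request limit

class WarmestFirstScheduler:
    def __init__(self, instances: list[Instance]):
        self.instances = instances

    def dispatch(self) -> Instance | None:
        # Among warm instances with spare capacity, pick the warmest,
        # i.e., the one used most recently; None means a cold start.
        candidates = [i for i in self.instances
                      if i.in_flight < i.max_concurrency]
        if not candidates:
            return None
        target = max(candidates, key=lambda i: i.last_used)
        target.last_used = time.time()
        target.in_flight += 1
        return target
```

Concentrating load on the warmest instances keeps their caches and runtimes hot while letting the coldest instances drain and expire, which is how the heuristic serves both latency and utilization.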
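Component (3) can be illustrated with a simple cost-based selection of the idle container window over observed gaps between consecutive invocations. The one-penalty-per-cold-start cost model is a deliberate simplification of the trade-off described above, not the thesis's actual formulation.

```python
# Sketch only: pick the keep-alive window that minimizes estimated cost
# over a trace of observed idle gaps. The cost model is an assumption:
# holding a warm container costs hold_cost_per_s per second, and each
# cold start incurs a fixed penalty.
def choose_window(idle_gaps: list[float],
                  candidate_windows: list[float],
                  cold_start_cost: float,
                  hold_cost_per_s: float) -> float:
    def cost(window: float) -> float:
        total = 0.0
        for gap in idle_gaps:
            if gap <= window:
                total += hold_cost_per_s * gap           # reused while warm
            else:
                total += hold_cost_per_s * window + cold_start_cost
        return total
    return min(candidate_windows, key=cost)
```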