Homepage: https://acmsocc.org/2023/
Paper list: https://acmsocc.org/2023/accepted-papers.html
- Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud [Paper]
- UofT
- Adaptively configure resource parameters
- Built on contextual bandit techniques
- Balance between performance and resource cost
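As a hedged illustration of the contextual-bandit idea (not the paper's actual algorithm; all class and parameter names are invented here), an epsilon-greedy tuner over discrete resource configurations could look like:

```python
import random

class EpsilonGreedyConfigTuner:
    """Toy contextual bandit: pick a resource config per workload context."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = arms                  # candidate configs, e.g. CPU limits
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = {}                  # (context, arm) -> (reward_sum, count)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best
        # average reward seen for this context so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        best, best_avg = self.arms[0], float("-inf")
        for arm in self.arms:
            s, n = self.totals.get((context, arm), (0.0, 0))
            avg = s / n if n else 0.0
            if avg > best_avg:
                best, best_avg = arm, avg
        return best

    def update(self, context, arm, reward):
        s, n = self.totals.get((context, arm), (0.0, 0))
        self.totals[(context, arm)] = (s + reward, n + 1)
```

The reward would encode the performance/cost balance, e.g. `-(latency + alpha * resource_cost)`.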
- Not All Resources are Visible: Exploiting Fragmented Shadow Resources in Shared-State Scheduler Architecture [Paper]
- SJTU & Huawei
- Shared-state schedulers: A central state view periodically propagates the global cluster status to distributed schedulers
- Shadow resources: Resources invisible to shared-state schedulers until the next view update
- Resource Miner (RMiner) includes a shadow resource manager to manage shadow resources, an RM filter to select suitable tasks as RM tasks, an RM scheduler to allocate shadow resources to RM tasks
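A toy sketch of the RMiner pipeline described above (names and data layout are assumptions, not the paper's code): track resources freed between state-view syncs, filter short preemptible tasks, and place them on shadow resources:

```python
class ShadowResourceManager:
    """Toy model: track resources freed between state-view syncs."""

    def __init__(self):
        self.shadow = {}   # node -> CPU cores freed since last sync

    def release(self, node, cores):
        # A task finished: its cores are invisible to the shared-state
        # schedulers until the next view update.
        self.shadow[node] = self.shadow.get(node, 0) + cores

    def sync(self):
        # View update: shadow resources become globally visible again.
        self.shadow.clear()

def rm_filter(tasks, max_runtime):
    # Select short-lived, preemptible tasks as RM tasks: they can safely
    # run on resources that will be reclaimed at the next sync.
    return [t for t in tasks if t["runtime"] <= max_runtime and t["preemptible"]]

def rm_schedule(manager, rm_tasks):
    # First-fit placement of RM tasks onto shadow resources.
    placed = []
    for task in rm_tasks:
        for node, cores in manager.shadow.items():
            if cores >= task["cores"]:
                manager.shadow[node] = cores - task["cores"]
                placed.append((task["name"], node))
                break
    return placed
```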
- Gödel: Unified large-scale resource management and scheduling at ByteDance [Paper]
- ByteDance & UVA
- Industry Paper
- A unified infrastructure for all business groups to run their diverse workloads
- Built upon Kubernetes
- Anticipatory Resource Allocation for ML Training Clusters [Paper]
- Microsoft Research & UW
- Schedule based on predictions of future job arrivals and durations
- Deal with prediction errors
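To illustrate anticipatory scheduling (a minimal sketch, not the paper's policy), an admission check can hold back a job when forecast arrivals would need its GPUs; the slack factor here is a hypothetical knob for absorbing prediction error:

```python
def admit_now(free_gpus, job_gpus, job_duration, predicted_arrivals, slack=1.0):
    """Toy anticipatory admission: start a job only if it is predicted to
    finish before forecast arrivals need the GPUs it would occupy."""
    for eta, gpus_needed in predicted_arrivals:
        # Would the running job still hold GPUs the forecast job needs?
        if job_duration * slack > eta and free_gpus - job_gpus < gpus_needed:
            return False
    return free_gpus >= job_gpus
```

Raising `slack` makes the scheduler more conservative when duration predictions tend to underestimate.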
- tf.data service: A Case for Disaggregating ML Input Data Processing [Paper]
- Google & ETH
- Industry Paper
- A disaggregated input data processing service built on top of tf.data in TensorFlow
- Horizontally scale out to right-size host resources (CPU/RAM) for data processing in each job
- Share ephemeral preprocessed data results across jobs
- Coordinated reads to avoid stragglers
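A toy model of the disaggregation idea (this is not tf.data service's API; names are invented): shard preprocessing across remote workers and release batches in lockstep so no consumer runs ahead of the stragglers:

```python
from collections import deque

class InputService:
    """Toy disaggregated input-processing service."""

    def __init__(self, num_workers):
        self.queues = [deque() for _ in range(num_workers)]

    def preprocess(self, records, worker_fn):
        # Shard records round-robin across workers; each worker applies
        # the (CPU-heavy) transformation off the trainer host.
        for i, rec in enumerate(records):
            self.queues[i % len(self.queues)].append(worker_fn(rec))

    def coordinated_read(self):
        # A batch is released only when every worker has an element
        # ready, so all consumers advance together (no stragglers).
        if all(self.queues):
            return [q.popleft() for q in self.queues]
        return None
```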
- Is Machine Learning Necessary for Cloud Resource Usage Forecasting? [Paper]
- IMDEA Software Institute
- Vision Paper
- Question: Are complex machine learning models really necessary for cloud resource usage forecasting?
- Proposal: Practical memory management systems need to first identify the extent to which simple solutions can be effective.
- Golgi: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing [Paper]
- HKUST & WeBank
- Best Paper Award!
- A scheduling system for serverless functions to minimize resource provisioning costs while meeting the function latency requirements
- Overcommit functions based on their past resource usage; Identify nine low-level metrics (e.g., request load, resource allocation, contention on shared resources); Use the Mondrian Forest to predict the function performance
- Employ a conservative exploration-exploitation strategy for request routing; By default, route requests to non-overcommitted instances; Explore to use overcommitted instances
- Vertical scaling to dynamically adjust the concurrency of overcommitted instances
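A hedged sketch of the conservative exploration-exploitation routing (the predictor callback stands in for the paper's Mondrian-forest model; all names are invented):

```python
import random

def route(request, safe_pool, overcommitted_pool, predict_latency,
          slo_ms, explore_p=0.05, rng=None):
    """Toy Golgi-style router: prefer non-overcommitted instances, and
    only explore an overcommitted instance when a performance model
    predicts the request will still meet its latency SLO."""
    rng = rng or random.Random()
    if overcommitted_pool and rng.random() < explore_p:
        candidate = rng.choice(overcommitted_pool)
        if predict_latency(candidate, request) <= slo_ms:
            return candidate
    # Default: route to a non-overcommitted (safe) instance.
    return rng.choice(safe_pool)
```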
- Parrotfish: Parametric Regression for Optimizing Serverless Functions [Paper]
- UBC & UTokyo & INSAT
- Find optimal configurations through an online learning process
- Use parametric regression to choose the right memory configurations for serverless functions
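One way to picture parametric regression here (an illustration, not Parrotfish's code) is to assume runtime follows runtime(mem) ≈ a/mem + b for CPU-bound functions, since the memory knob also scales allocated CPU, then pick the cheapest configuration meeting a latency target:

```python
def fit_runtime_model(samples):
    """Least-squares fit of runtime(mem) = a / mem + b, which is linear
    in x = 1 / mem, from (memory_mb, runtime_ms) samples."""
    xs = [1.0 / m for m, _ in samples]
    ys = [t for _, t in samples]
    n = len(samples)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def best_memory(candidates, samples, max_runtime_ms):
    a, b = fit_runtime_model(samples)
    feasible = [m for m in candidates if a / m + b <= max_runtime_ms]
    # With duration-based billing, cost ~ mem * runtime = a + b * mem
    # grows with memory, so the cheapest feasible config is the
    # smallest one that still meets the latency target.
    return min(feasible, key=lambda m: m * (a / m + b))
```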
- AsyFunc: A High-Performance and Resource-Efficient Serverless Inference System via Asymmetric Functions [Paper] [Code]
- HUST & Huawei & Peng Cheng Laboratory
- Problem: The time-consuming and resource-hungry model-loading process when scaling out function instances
- Observation: The sensitivity of each layer to the computing resources is mostly anti-correlated with its memory resource usage
- Asymmetric Functions
- The original Body Function loads a complete model to meet stable demands
- The proposed lightweight Shadow Function loads only the resource-sensitive layers, so it can handle sudden demand spikes cheaply
- AsyFunc: an inference serving system with an auto-scaling and scheduling engine; built on top of Knative
- Chitu: Accelerating Serverless Workflows with Asynchronous State Replication Pipeline [Paper] [Code]
- ISCAS & ICT, CAS
- Asynchronous State Replication Pipelines (ASRP) to speed up serverless workflows for general applications
- Three insights
- Provide differentiable data types (DDT) at the programming model level to support incremental state sharing and computation
- Continuously deliver changes of DDT objects in real-time
- Direct communication and change propagation
- Built atop OpenFaaS
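A purely illustrative differentiable data type (not Chitu's implementation): an object that records changes so downstream functions can pull deltas instead of full state:

```python
class DeltaDict:
    """Toy DDT: a dict that logs changes for incremental state sharing."""

    def __init__(self):
        self.state = {}
        self.log = []                 # change log since last drain

    def __setitem__(self, key, value):
        self.state[key] = value
        self.log.append(("set", key, value))

    def drain(self):
        # Chitu streams changes continuously; here the consumer just
        # pulls everything recorded since the previous drain.
        deltas, self.log = self.log, []
        return deltas

def apply_deltas(replica, deltas):
    # Replay the change log on a downstream replica.
    for op, key, value in deltas:
        if op == "set":
            replica[key] = value
    return replica
```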
- How Does It Function? Characterizing Long-term Trends in Production Serverless Workloads [Paper] [Trace]
- Huawei
- Industry Paper
- Two new serverless traces in Huawei Cloud
- The first trace: Huawei's internal workloads; Per-second statistics for 200 functions
- The second trace: Huawei's public FaaS platform; Per-minute arrival rates for over 5000 functions
- Characterize resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity.
- Findings
- Requests vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day
- Scheduling time, execution time and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails
- Function invocation counts demonstrate strong periodicity for many individual functions and on an aggregate level
- The need for further research in estimating resource reservations and time-series prediction
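For instance, periodicity in invocation counts can be spotted with a simple autocorrelation peak search (illustrative only; not the paper's methodology):

```python
def autocorr_peak(series, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest
    autocorrelation; a strong peak at lag 1440 in per-minute arrival
    counts, for example, would suggest daily periodicity."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)

    def ac(lag):
        return sum((series[i] - mean) * (series[i + lag] - mean)
                   for i in range(n - lag)) / var

    return max(range(min_lag, max_lag + 1), key=ac)
```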
- Function as a Function [Paper]
- ETH
- Vision Paper
- Dandelion: a clean-slate FaaS system; treat serverless functions as pure functions; explicitly separate computation and I/O; hardware acceleration; enable dataflow-aware function orchestration
- The Gap Between Serverless Research and Real-world Systems [Paper]
- SJTU & Huawei Cloud
- Vision Paper
- Five open challenges
- Optimize cold-start latency: Most existing works only consider synchronous starts, while asynchronous starts are common in industry
- Declarative approach: Is Kubernetes the right system for serverless computing?
- Scheduling cost
- Balance different scheduling policies within a serverless system
- Costs of sidecar
- Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale [Paper]
- MIT & NEU
- Significant decreases in both temperature and power draw, reducing power consumption and potentially improving hardware life-span, with minimal impact on job performance