
[RFC] User-space memory cache for preheating jobs #3742

Open
SouthWest7 opened this issue Jan 2, 2025 · 2 comments
Labels: enhancement (New feature or request)

SouthWest7 commented Jan 2, 2025

Feature request:

Currently, Dragonfly writes data directly to disk when processing preheat tasks. To improve performance and reduce latency, I propose introducing a caching mechanism that writes downloaded data to both memory and disk: pieces can then be served quickly from memory while the disk copy ensures persistence. Specifically, the caching mechanism works as follows:

  • During preheat tasks: while downloading data to disk, the content is also written to the in-memory cache, enabling faster access in later operations.
  • During regular uploads: the system first checks the cache for the requested content and falls back to disk storage on a cache miss. Serving from the cache is faster than reading from disk, because the first read of freshly downloaded content is not yet in the page cache.

This approach aims to reduce disk IO, improve overall system efficiency, and significantly lower the time spent retrieving data from remote peers during preheat tasks.


Scope:

  • The caching mechanism will only affect whether piece content is read/written from the cache or disk during downloads and uploads. Other functional modules are not impacted.

Design

Write to Cache

  • Goal: Store downloaded piece content into the local cache after retrieving it from a remote peer.


Read from Cache

  • Goal: Retrieve piece content from the cache. If the cache does not contain the data, fall back to reading from local storage.

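The read-with-fallback goal above can be sketched as follows. `Cache`, `Disk`, and their `read_piece` methods here are simplified, synchronous stand-ins for the real dragonfly-client types (which are async and stream-based); only the control flow is the point.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the in-memory piece cache.
struct Cache {
    pieces: HashMap<String, Vec<u8>>,
}

/// Simplified stand-in for local disk storage.
struct Disk {
    pieces: HashMap<String, Vec<u8>>,
}

impl Cache {
    fn read_piece(&self, piece_id: &str) -> Option<Vec<u8>> {
        self.pieces.get(piece_id).cloned()
    }
}

impl Disk {
    fn read_piece(&self, piece_id: &str) -> Option<Vec<u8>> {
        self.pieces.get(piece_id).cloned()
    }
}

/// Read a piece, preferring the in-memory cache and falling back to disk
/// on a cache miss, as described in the design above.
fn read_piece(cache: &Cache, disk: &Disk, piece_id: &str) -> Option<Vec<u8>> {
    if let Some(content) = cache.read_piece(piece_id) {
        return Some(content); // cache hit: no disk IO
    }
    disk.read_piece(piece_id) // cache miss: fall back to local storage
}

fn main() {
    let mut cache = Cache { pieces: HashMap::new() };
    let mut disk = Disk { pieces: HashMap::new() };
    cache.pieces.insert("task1-0".into(), b"hot piece".to_vec());
    disk.pieces.insert("task1-1".into(), b"cold piece".to_vec());

    assert_eq!(read_piece(&cache, &disk, "task1-0"), Some(b"hot piece".to_vec()));
    assert_eq!(read_piece(&cache, &disk, "task1-1"), Some(b"cold piece".to_vec()));
    assert!(read_piece(&cache, &disk, "task1-2").is_none());
    println!("fallback ok");
}
```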

Definition

pub struct Cache {
    /// pieces stores the pieces with their piece id and content.
    pieces: Arc<Mutex<LruCache<String, bytes::Bytes>>>,
}

/// Cache implements the cache for storing piece content by LRU algorithm.
impl Cache {
    /// new creates a new cache with the specified capacity.
    pub fn new(config: Arc<Config>) -> Result<Self>

    /// read_piece reads the piece from the cache.
    pub async fn read_piece(
        &self,
        piece_id: &str,
        offset: u64,
        length: u64,
        range: Option<Range>,
    ) -> Result<impl AsyncRead>

    /// write_piece writes the piece content to the cache.
    pub async fn write_piece<R: AsyncRead + Unpin + ?Sized>(
        &self,
        piece_id: &str,
        reader: &mut R,
    ) -> Result<()>

    /// is_empty checks if the cache is empty.
    pub fn is_empty(&self) -> bool

    /// contains_piece checks whether the piece exists in the cache.
    pub fn contains_piece(&self, id: &str) -> bool
}
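The `Cache` above delegates LRU bookkeeping to `lru::LruCache` behind an `Arc<Mutex<…>>`. As a dependency-free illustration of the eviction behavior the RFC relies on, here is a minimal std-only sketch; the `LruSketch` name and its synchronous, `Vec<u8>`-based API are simplifications for illustration, not the proposed implementation.

```rust
use std::collections::{HashMap, VecDeque};

/// A std-only sketch of LRU piece caching: a map for content plus an
/// access-order queue, with the least recently used entry at the front.
struct LruSketch {
    capacity: usize,
    pieces: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
}

impl LruSketch {
    fn new(capacity: usize) -> Self {
        Self { capacity, pieces: HashMap::new(), order: VecDeque::new() }
    }

    /// Move the entry to the most-recently-used position.
    fn touch(&mut self, id: &str) {
        if let Some(pos) = self.order.iter().position(|k| k == id) {
            self.order.remove(pos);
        }
        self.order.push_back(id.to_string());
    }

    /// Insert piece content, evicting the LRU entry when the cache is full.
    fn write_piece(&mut self, id: &str, content: Vec<u8>) {
        if !self.pieces.contains_key(id) && self.pieces.len() == self.capacity {
            if let Some(lru) = self.order.pop_front() {
                self.pieces.remove(&lru); // evict least recently used
            }
        }
        self.pieces.insert(id.to_string(), content);
        self.touch(id);
    }

    /// Return piece content and mark the entry most recently used.
    fn read_piece(&mut self, id: &str) -> Option<Vec<u8>> {
        let content = self.pieces.get(id).cloned()?;
        self.touch(id);
        Some(content)
    }

    fn contains_piece(&self, id: &str) -> bool {
        self.pieces.contains_key(id)
    }
}

fn main() {
    let mut cache = LruSketch::new(2);
    cache.write_piece("a", vec![1]);
    cache.write_piece("b", vec![2]);
    cache.read_piece("a");           // "a" becomes most recently used
    cache.write_piece("c", vec![3]); // evicts "b", the LRU entry
    assert!(cache.contains_piece("a"));
    assert!(!cache.contains_piece("b"));
    assert!(cache.contains_piece("c"));
    println!("lru ok");
}
```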

Configuration

storage:
  # capacity: Specifies the maximum number of entries the cache can hold. The default value is 100 entries.
  # Adjust this value based on the expected number of piece content entries for preheat tasks that need to be cached.
  cache_capacity: 100

API Definition

message Download {
  // load_to_cache indicates whether the downloaded content will be stored in the storage cache.
  // The storage cache is designed to store piece content downloaded by preheat tasks,
  // allowing other peers to access the content from memory instead of disk.
  bool load_to_cache = 21;
}
SouthWest7 commented Jan 14, 2025

Actions:

  • protocol definition & configuration: w1
  • implementation: w1
  • full process: w2
  • unit tests, E2E, performance testing: w3

@SouthWest7 SouthWest7 reopened this Jan 14, 2025
@SouthWest7 SouthWest7 changed the title [RFC] Memory cache for preheat tasks [RFC] User-space memory cache for preheating jobs Jan 16, 2025
@SouthWest7 SouthWest7 reopened this Jan 17, 2025
SouthWest7 commented Jan 17, 2025

Test

Description

I tested the new feature locally by running dfget on the seed peer to download the same file under two conditions:

  • Without cache: dfget downloads the file on the seed peer with no cache populated.
  • With cache: a preheat job is started first, so the seed peer loads part of the content into the cache.

Settings

Peers:

  • A Seed Peer

Target File:

  • Name: Llama-3.2-3B-Instruct/model-00002-of-00002.safetensors
  • Size: 1.46GB
  • Piece Number: 348

Test Cases

Case 1

  • Preheat the piece content into the seed peer's cache.
  • Measure the time taken by dfget on the seed peer to download the file from local storage.

Case 2

  • Use dfget to store the piece content on the seed peer's disk.
  • Measure the time taken by dfget on the seed peer to download the file from local storage.

Result

With Cache

| task   | task start | task succeed | total (s) | hit cache start | hit cache end | cache total (s) |
|--------|-----------:|-------------:|----------:|----------------:|--------------:|----------------:|
| test_1 | 34.5651    | 39.7431      | 5.1780    | 38.0111         | 39.7426       | 1.7315          |
| test_2 | 03.2694    | 11.9960      | 8.7266    | 10.1520         | 11.9954       | 1.8434          |

Without Cache

| task   | task start | task succeed | total (s) |
|--------|-----------:|-------------:|----------:|
| test_1 | 36.7624    | 44.2836      | 7.5212    |
| test_2 | 28.9720    | 40.2407      | 11.2687   |

Note: During test_1, the SSD read speed was very fast, while during test_2, the SSD read speed was relatively slow. As a result, the overall performance of test_2 is slower than that of test_1.
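Comparing the totals in the two tables, the cache saves roughly 2.34 s (about 31%) in test_1 and roughly 2.54 s (about 23%) in test_2. A quick sanity check of those figures:

```rust
fn main() {
    // (task, total without cache, total with cache), in seconds, from the tables above
    let cases = [("test_1", 7.5212_f64, 5.1780_f64), ("test_2", 11.2687, 8.7266)];
    for (name, without, with_cache) in cases {
        let saved = without - with_cache;
        let pct = saved / without * 100.0;
        println!("{name}: saved {saved:.4}s ({pct:.1}% faster)");
    }
}
```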

Video links

With Cache:
https://pan.baidu.com/s/1klKHjzX4dqzVRuI2Crq1Lg?pwd=wang
Without Cache:
https://pan.baidu.com/s/1SG-OBtKic96lBDthmiEhQA?pwd=js1f
