Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Leveled Compaction in Concourse #527

Open
jtnelson opened this issue Feb 17, 2025 · 0 comments
Open

Implement Leveled Compaction in Concourse #527

jtnelson opened this issue Feb 17, 2025 · 0 comments

Comments

@jtnelson
Copy link
Member

Title:
Implement Leveled Compaction in Concourse

Description:
We need to add a new compaction strategy—Leveled Compaction—to Concourse. The goal is to merge contiguous segments that share the same "level" and bump the level appropriately when merging. A sample implementation is provided below. However, one key challenge is that not all compaction strategies require level metadata, so we may not want to add a getLevel() method directly to the core Segment class.

Proposed Implementation:
Introduce a new compactor class, LeveledCompactor, that follows the existing compaction framework. The sample implementation is as follows:

/*
 * Copyright (c) 2013-2025 Cinchapi Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.cinchapi.concourse.server.storage.db.compaction.leveled;

import java.util.List;

import com.cinchapi.concourse.server.storage.db.SegmentStorageSystem;
import com.cinchapi.concourse.server.storage.db.compaction.Compactor;
import com.cinchapi.concourse.server.storage.db.kernel.Segment;
import com.google.common.collect.ImmutableList;

/**
 * A {@link Compactor} that implements a simplified leveled compaction.
 * <p>
 * In this strategy, contiguous {@link Segment Segments} that belong to the
 * same level are merged. For segments in level 0 (which typically represent
 * the unsorted or “raw” data), merging bumps the resulting segment to level 1.
 * For segments in higher levels the level remains the same.
 * </p>
 * <p>
 * <strong>Note:</strong> This implementation assumes that {@link Segment} has
 * been extended (or wrapped) to support level metadata via a {@code getLevel()}
 * (and optionally {@code setLevel(int)}) method. If not, the implementation
 * defaults to treating segments as level 0.
 * </p>
 *
 * @author Jeff Nelson
 */
public class LeveledCompactor extends Compactor {

    /**
     * The minimum number of segments to compact.
     */
    private final int minSegmentsToCompact = 2;

    /**
     * Construct a new instance.
     * 
     * @param storage the underlying {@link SegmentStorageSystem}
     */
    public LeveledCompactor(SegmentStorageSystem storage) {
        super(storage);
    }

    @Override
    protected List<Segment> compact(Segment... segments) {
        // Ensure there is something to compact.
        if (segments.length < minSegmentsToCompact) {
            return null;
        }

        // Check that all segments share the same level.
        int level = getLevel(segments[0]);
        for (int i = 1; i < segments.length; i++) {
            if (getLevel(segments[i]) != level) {
                // Do not compact segments from different levels.
                return null;
            }
        }

        // Compute total size and count for the merged segment.
        long totalLength = 0;
        int totalCount = 0;
        for (Segment seg : segments) {
            totalLength += seg.length();
            totalCount += seg.count();
        }

        // Ensure there is sufficient disk space.
        if (storage().availableDiskSpace() < totalLength) {
            return null;
        }

        // Determine the level for the merged segment:
        // bump level from 0 to 1; if already above 0, keep the same level.
        int newLevel = (level == 0) ? 1 : level;

        // Create a new segment with the appropriate capacity and level.
        Segment merged = createLeveledSegment(totalCount, newLevel);

        // Merge the writes from all segments in order.
        // (In a real-world scenario, you may need to preserve ordering semantics.)
        for (Segment seg : segments) {
            for (Object write : seg.writes()) {
                merged.acquire(write);
            }
        }

        return ImmutableList.of(merged);
    }

    /**
     * Helper method to create a new leveled segment.
     * <p>
     * This method assumes that the {@link Segment} class (or its subclass) has
     * been extended to support level metadata. For example, there might be a
     * factory method such as {@code Segment.create(int capacity, int level)}.
     * </p>
     * 
     * @param capacity the estimated number of writes
     * @param level the level to assign to the new segment
     * @return a new {@link Segment} with the specified capacity and level
     */
    private Segment createLeveledSegment(int capacity, int level) {
        // For demonstration, we call Segment.create(capacity).
        // In a real implementation, you would have a factory method that accepts level info.
        Segment seg = Segment.create(capacity);
        // If available, set the level on the new segment:
        // seg.setLevel(level);
        return seg;
    }

    /**
     * Helper method to retrieve the level of a segment.
     * <p>
     * This method assumes that the {@link Segment} class has been extended to include
     * a level attribute via a {@code getLevel()} method. If such a method is not available,
     * the segment is assumed to be level 0.
     * </p>
     * 
     * @param seg the {@link Segment} whose level is to be determined
     * @return the level of the segment, or 0 if not available
     */
    private int getLevel(Segment seg) {
        try {
            // If Segment is extended to include level information:
            return seg.getLevel();
        }
        catch (NoSuchMethodError e) {
            // Default to level 0 if no level information is present.
            return 0;
        }
    }
}

Key Considerations & Alternatives:

  • Avoiding Changes to the Core Segment Class:
    Not every compaction strategy will need to deal with levels. Instead of adding getLevel() (and related setters) directly to the Segment class, we could consider:

    1. Subclassing:
      Create a subclass (e.g., LeveledSegment) that extends Segment and adds the level metadata. This keeps the core Segment class unchanged while allowing leveled compaction to work on instances of LeveledSegment.

    2. Adapter/Wrapper Pattern:
      Introduce a wrapper around Segment that provides level metadata for the purposes of leveled compaction. This would let us leave the original Segment class intact.

    3. External Metadata Mapping:
      Maintain an external mapping (perhaps within the SegmentStorageSystem) that associates segment identifiers with level metadata. The compactor would consult this mapping when merging segments.

    4. Interface Implementation:
      Define an interface (e.g., ILeveledSegment) with getLevel() and setLevel() methods. Only segments that need level functionality would implement this interface, leaving the others unaffected.

  • Testing & Validation:
    Ensure that the new leveled compaction logic integrates well with existing strategies. Comprehensive unit and integration tests should be added.

Acceptance Criteria:

  • A new LeveledCompactor is implemented and integrated within the compaction framework.
  • The design decision regarding handling level metadata (subclassing, adapter, or external mapping) is finalized.
  • Unit tests cover the leveled compaction logic and its interaction with other strategies.
  • Documentation is updated to explain the leveled compaction strategy and the chosen approach for managing level metadata.

Next Steps:

  • Discuss and decide on the preferred approach for managing level metadata without modifying the core Segment class.
  • Implement the chosen approach and integrate the leveled compactor.
  • Run tests and review performance impacts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant