Merging Two Sorted Sequences, Attempt 2 #184

Open: wants to merge 6 commits into base: main


CTMacUser (Contributor)

Description

This is an adaptation of the std::merge function from the C++ language into Swift. This is the second version I've done, after [#43]. The main changes are:

  • The merge is now invoked through free functions instead of methods extending Sequence and LazySequenceProtocol. This necessitates a separate name for the lazy version (now called lazilyMerge), since a simple overload would prevent creating an eagerly generated merger from lazy arguments. This is in line with other functions in this library that were changed into free functions.
  • Collection-specific support has been dropped. Indexing support adds a new layer of indirection, and there are two ways to handle it: either rewrite the merging code inside the Collection index methods and hope the copies don't fall out of sync during feature changes or bug fixes, or share one code base by generalizing it with an even bigger layer of indirection. The previous version took the second approach, which was hard to implement. Giving up on it and directing users to call the eager version instead is safer.
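
As a rough illustration of the free-function shape described above (not the code in this PR), an eager std::merge-style merger can be sketched as follows. The name sketchMerge is hypothetical, and the sketch keeps every element from both sources, which corresponds to the default .sum behaviour only by assumption.

```swift
/// Hypothetical helper, not the PR's implementation: merges two sorted
/// sequences into a sorted array, keeping every element from both.
func sketchMerge<S1: Sequence, S2: Sequence>(
    _ first: S1, _ second: S2
) -> [S1.Element] where S1.Element: Comparable, S1.Element == S2.Element {
    var result: [S1.Element] = []
    var iterator1 = first.makeIterator()
    var iterator2 = second.makeIterator()
    var head1 = iterator1.next()
    var head2 = iterator2.next()
    // Repeatedly emit the smaller head; ties go to the first source,
    // which keeps the merge stable.
    while let a = head1, let b = head2 {
        if b < a {
            result.append(b)
            head2 = iterator2.next()
        } else {
            result.append(a)
            head1 = iterator1.next()
        }
    }
    // At most one source still has elements; drain it.
    while let a = head1 {
        result.append(a)
        head1 = iterator1.next()
    }
    while let b = head2 {
        result.append(b)
        head2 = iterator2.next()
    }
    return result
}

// A lazy argument still produces an eagerly built array, because the
// eager entry point is not overloaded on LazySequenceProtocol.
let eagerFromLazy = sketchMerge([1, 3].lazy.map { $0 * 2 }, [4, 5])
```

Because the eager function has its own name, passing a lazy sequence cannot silently select a lazy overload, which is the situation the first bullet is avoiding.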

Detailed Design

The task is implemented as a set of overloaded free functions; a configuration type to reuse the algorithm for set-union, -intersection, etc.; a lazy sequence wrapper; and an iterator that contains the core code.

/// Binary (multi-)set operations, using combinations of keeping or removing
/// shared and/or disjoint elements.
public enum SetOperation: UInt, CaseIterable {
  case none, firstWithoutSecond, secondWithoutFirst, symmetricDifference,
       intersection, first, second, union, sum
}
extension SetOperation {
    @inlinable public var usesExclusivesFromFirst: Bool { get }
    @inlinable public var usesExclusivesFromSecond: Bool { get }
    @inlinable public var usesShared: Bool { get }
    @inlinable public var duplicatesShared: Bool { get }
    @inlinable public init(
        keepExclusivesToFirst: Bool,
        keepExclusivesToSecond: Bool,
        keepShared: Bool
    )
}

/// Merges the two given sorted sequences into a sorted array, but retaining
/// only the given subset of elements from the merger.
@inlinable
public func merge<Base1, Base2>(
    _ first: Base1, _ second: Base2, keeping operation: SetOperation = .sum
) -> [Base1.Element]
where Base1 : Sequence, Base2 : Sequence, Base1.Element : Comparable,
      Base1.Element == Base2.Element
/// Merges the two given sorted collections into a new sorted collection, but
/// retaining only the given subset of elements from the merger.
@inlinable
public func merge<Base>(
    _ first: Base, _ second: Base, keeping operation: SetOperation = .sum
) -> Base
where Base : RangeReplaceableCollection, Base.Element : Comparable
/// Merges the two given sequences, each sorted using the given predicate as the
/// comparison between elements, into a sorted array, but retaining only the
/// given subset of elements from the merger.
@inlinable
public func merge<Base1, Base2>(
    _ first: Base1, _ second: Base2, keeping operation: SetOperation = .sum,
    along areInIncreasingOrder: (Base1.Element, Base2.Element) throws -> Bool
) rethrows -> [Base1.Element]
where Base1 : Sequence, Base2 : Sequence, Base1.Element == Base2.Element
/// Merges the two given collections, each sorted using the given predicate as
/// the comparison between elements, into a sorted collection, but retaining
/// only the given subset of elements from the merger.
@inlinable
public func merge<Base>(
    _ first: Base, _ second: Base, keeping operation: SetOperation = .sum,
    along areInIncreasingOrder: (Base.Element, Base.Element) throws -> Bool
) rethrows -> Base
where Base : RangeReplaceableCollection

/// Lazily merges the two given sorted lazy sequences into a new sorted lazy
/// sequence, where only the given subset of merged elements is retained.
@inlinable
public func lazilyMerge<Base1, Base2>(
    _ first: Base1, _ second: Base2, keeping operation: SetOperation = .sum
) -> Merged2Sequence<Base1.Elements, Base2.Elements>
where Base1 : LazySequenceProtocol, Base2 : LazySequenceProtocol,
      Base1.Element : Comparable, Base1.Element == Base2.Element
/// Lazily merges the two given lazy sequences, each sorted using the given
/// predicate as the comparison between elements, into a new sorted lazy
/// sequence, where only the given subset of merged elements is retained.
@inlinable
public func lazilyMerge<Base1, Base2>(
    _ first: Base1, _ second: Base2, keeping operation: SetOperation = .sum,
    along areInIncreasingOrder: @escaping (Base1.Element, Base2.Element) -> Bool
) -> Merged2Sequence<Base1.Elements, Base2.Elements>
where Base1 : LazySequenceProtocol, Base2 : LazySequenceProtocol,
      Base1.Element == Base2.Element

/// A sequence vending the sorted merger of its source sequences.
public struct Merged2Sequence<Base1, Base2>
where Base1 : Sequence, Base2 : Sequence, Base1.Element == Base2.Element {
}
extension Merged2Sequence : LazySequenceProtocol {
    public typealias Element = Base1.Element
    public typealias Iterator = Merged2Iterator<Base1.Iterator, Base2.Iterator>
    public func makeIterator() -> Iterator
    public var underestimatedCount: Int { get }
    public func withContiguousStorageIfAvailable<R>(
        _ body: (UnsafeBufferPointer<Element>) throws -> R
    ) rethrows -> R?

    public func _customContainsEquatableElement(_ element: Element) -> Bool?
}

/// An iterator vending the sorted merger of its source iterators.
public struct Merged2Iterator<Base1, Base2>
where Base1 : IteratorProtocol, Base2 : IteratorProtocol,
      Base1.Element == Base2.Element {
}
extension Merged2Iterator : IteratorProtocol {
    public mutating func next() -> Base1.Element?
}
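
To make the role of the configuration type more concrete, here is a minimal sketch of how three keep/drop flags plus a "duplicate shared" flag could select set-union, intersection, and so on during a merge. SketchFlags and sketchFilteredMerge are hypothetical stand-ins; the actual Merged2Iterator logic is not shown in this description and may differ.

```swift
/// Hypothetical stand-in for the flags that SetOperation exposes.
struct SketchFlags {
    var exclusivesFromFirst: Bool
    var exclusivesFromSecond: Bool
    var shared: Bool
    var duplicatesShared = false
}

/// Hypothetical helper: merges two sorted arrays, keeping only the
/// categories of elements that `flags` asks for.
func sketchFilteredMerge<T: Comparable>(
    _ first: [T], _ second: [T], keeping flags: SketchFlags
) -> [T] {
    var result: [T] = []
    var i = first.startIndex, j = second.startIndex
    while i < first.endIndex, j < second.endIndex {
        if first[i] < second[j] {
            // Element currently unmatched in `second`: exclusive to `first`.
            if flags.exclusivesFromFirst { result.append(first[i]) }
            i += 1
        } else if second[j] < first[i] {
            // Element currently unmatched in `first`: exclusive to `second`.
            if flags.exclusivesFromSecond { result.append(second[j]) }
            j += 1
        } else {
            // Matched pair: emit once, or twice when duplicating shared values.
            if flags.shared {
                result.append(first[i])
                if flags.duplicatesShared { result.append(second[j]) }
            }
            i += 1
            j += 1
        }
    }
    // Whatever remains in either source has no partner left to match.
    if flags.exclusivesFromFirst { result.append(contentsOf: first[i...]) }
    if flags.exclusivesFromSecond { result.append(contentsOf: second[j...]) }
    return result
}

let a = [1, 2, 4], b = [2, 3, 4, 4]
let unionLike = sketchFilteredMerge(a, b, keeping: SketchFlags(
    exclusivesFromFirst: true, exclusivesFromSecond: true, shared: true))
let intersectionLike = sketchFilteredMerge(a, b, keeping: SketchFlags(
    exclusivesFromFirst: false, exclusivesFromSecond: false, shared: true))
let sumLike = sketchFilteredMerge(a, b, keeping: SketchFlags(
    exclusivesFromFirst: true, exclusivesFromSecond: true, shared: true,
    duplicatesShared: true))
```

With these inputs the three results are [1, 2, 3, 4, 4], [2, 4], and [1, 2, 2, 3, 4, 4, 4] respectively, which matches the usual multiset meanings of union, intersection, and sum.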

Documentation Plan

There is a new Guide page provided.

Test Plan

There is a new test code file provided.

Source Impact

The change is additive. It does take two names out of the global space.

Checklist

  • I've added at least one test that validates that my change is working, if appropriate
  • I've followed the code style of the rest of the project
  • I've read the Contribution Guidelines
  • I've updated the documentation if necessary

Add top-level functions that take two sorted sequences and return their also-sorted merger. There are eager and lazy variants.
Add test case for mergers that use a custom ordering predicate. Said predicate doesn't use all of its operands' data, so the priority given to each source for a shared value is detectable.
Take the giant test case for the operations of the (lazy) merger sequence that don't involve the iterator, and split it into test cases for each operation separately. Use two tests for element-containment support, branching on how well the source sequences support element-containment tests.
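
The custom-predicate test idea from the commits above can be illustrated with a small sketch. It assumes a stable merge in which ties go to the first source; sketchMergeByPredicate and the (key, tag) element shape are hypothetical and are not taken from the PR's test file.

```swift
/// Hypothetical helper: merges two arrays that are each sorted by
/// `areInIncreasingOrder`, preferring `first` whenever elements tie.
func sketchMergeByPredicate<T>(
    _ first: [T], _ second: [T],
    along areInIncreasingOrder: (T, T) -> Bool
) -> [T] {
    var result: [T] = []
    result.reserveCapacity(first.count + second.count)
    var i = 0, j = 0
    while i < first.count, j < second.count {
        // Take from `second` only when it is strictly ordered before `first`.
        if areInIncreasingOrder(second[j], first[i]) {
            result.append(second[j])
            j += 1
        } else {
            result.append(first[i])
            i += 1
        }
    }
    result.append(contentsOf: first[i...])
    result.append(contentsOf: second[j...])
    return result
}

// The predicate looks only at `key`, so `tag` records which source won.
let left = [(key: 1, tag: "L"), (key: 3, tag: "L")]
let right = [(key: 1, tag: "R"), (key: 2, tag: "R")]
let merged = sketchMergeByPredicate(left, right) { $0.key < $1.key }
let tags = merged.map { "\($0.key)\($0.tag)" }
```

Here tags comes out as ["1L", "1R", "2R", "3L"]: the two elements with key 1 compare as equivalent, and the tag shows that the one from the first source was emitted first.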
CTMacUser mentioned this pull request Mar 11, 2022