Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-46799: refactor DatasetRecordStorageManager #1095

Merged
merged 10 commits into from
Oct 16, 2024
Merged

Commits on Oct 16, 2024

  1. Configuration menu
    Copy the full SHA
    d10be56 View commit details
    Browse the repository at this point in the history
  2. Minor doc fixes.

    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    22c558c View commit details
    Browse the repository at this point in the history
  3. Configuration menu
    Copy the full SHA
    832e843 View commit details
    Browse the repository at this point in the history
  4. Configuration menu
    Copy the full SHA
    8b861aa View commit details
    Browse the repository at this point in the history
  5. Remove vestiges of autoincrement integer dataset ID support.

    The lone concrete dataset storage manager class and its intermediate
    base class have been merged, and many function parameters (all
    internal, of course) that plumb the SQL dataset ID type through many
    layers have been dropped.
    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    349d1ac View commit details
    Browse the repository at this point in the history
  6. Move methods from DatasetRecordStorage to its manager.

    This is a mostly-mechanical change; for each method:
    
    - move a method to the manager class;
    
    - give it a dataset type argument, and have it look up the storage
      object internally;
    
    - change calling code (SqlRegistry) to stop looking up the storage
      object before calling it.
    
    The ultimate goal is to get rid of DimensionRecordStorageManager, at
    least as a public interface.  But the steps after this one won't be
    nearly as mechanical, so it's useful to separate them.
    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    a6a36e8 View commit details
    Browse the repository at this point in the history
  7. Configuration menu
    Copy the full SHA
    781bb32 View commit details
    Browse the repository at this point in the history
  8. Rework dataset manager and dataset type caching.

    The dataset type cache now holds just the dataset type definition and
    dataset ID by name, with the SQLAlchemy table objects instead cached
    by DimensionGroup so they can be shared by multiple dataset types.
    
    DatasetRecordStorage has been removed (both the base class and its
    sole implementation) - its methods had already been moved to
    DatasetRecordStorageManager, and it no longer works as the opaque
    thing to put in the cache.  Instead there's a new subpackage-private
    DynamicTables class that is cached by DimensionGroup (this is where
    the lazy loading of SQLAlchemy table objects now happens), and a
    module-private _DatasetRecordStorage struct that just combines that
    with the dataset type and its ID, to make it more convenient to pass
    all of that around.
    
    I also threw in some changes to the insert/import implementations
    because I started trying to reduce the degree that
    DatasetRecordStorage was being passed things that were either totally
    unused or assumed (without checking) have some value.  I quickly
    realized that this problem is ubiquitous (especially with storage
    classes) and should be a separate ticket, but I've kept what I already
    did since I think it's a step in the right direction.
    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    9a12c91 View commit details
    Browse the repository at this point in the history
  9. Ban multiple-dataset-type results in general find-first queries.

    Supporting these would be extra complexity I don't think we need.
    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    0c3e4ca View commit details
    Browse the repository at this point in the history
  10. Switch back to returning given DatasetType in registry imports.

    When the given dataset type differs from the registered dataset type
    in imports, it's not clear what the ideal behavior is, but the right
    choice for *this* ticket is clearly to not change that behavior.
    TallJimbo committed Oct 16, 2024
    Configuration menu
    Copy the full SHA
    79f0f65 View commit details
    Browse the repository at this point in the history