Merge branch 'main' into rails_hyrax_upgrade
* main:
  🎁 Add Transaction for Cleaning Up Split Pages
  ♻️ Extract direct ActiveFedora calls to adapter
jeremyf committed Jan 19, 2024
2 parents 8066b7e + acc40d2 commit f06554f
Showing 21 changed files with 378 additions and 53 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -115,6 +115,10 @@ uv = createUV('#uv', {
 ## Configuration to enable IiifPrint features
 **NOTE: WorkTypes and models are used synonymously here.**
 
+### Persistence Layer Adapter
+
+We created IiifPrint with an assumption of ActiveFedora. However, as Hyrax now supports Valkyrie, we need an alternate approach. We introduced `IiifPrint::Configuration#persistence_adapter` as a configuration option. By default it uses the ActiveFedora adapter on Hyrax versions before 6.0.0 and the Valkyrie adapter on Hyrax 6.0.0 and later; you can also set the adapter explicitly. (See `IiifPrint::PersistenceLayer` for more details.)
+
 ### IIIF URL configuration
 
 If you set EXTERNAL_IIIF_URL in your environment, then IiifPrint will use that URL as the root for your IIIF URLs. It will also switch from using the file set ID to using the SHA1 of the file as the identifier. This enables using serverless_iiif or Cantaloupe (referred to as the service) by pointing the service to the same S3 bucket that FCREPO writes the uploaded files to. With that setup the service does not need to connect to FCREPO or Hyrax at all, because both serverless_iiif and Cantaloupe natively support reading their data from an S3 bucket.
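
The persistence adapter option described in the README change can be pictured with a small stand-in. This is a minimal sketch, not the gem's code: `SketchIiifPrint` and everything inside it are hypothetical names that only mimic the shape of `IiifPrint.config` and the adapter setting, so the block runs without Hyrax.

```ruby
# Hypothetical stand-in for the gem's configuration; names are illustrative only.
module SketchIiifPrint
  module PersistenceLayer
    class ActiveFedoraAdapter; end
    class ValkyrieAdapter; end
  end

  class Configuration
    # The writer and the reader share one instance variable.
    attr_writer :persistence_adapter

    def persistence_adapter
      @persistence_adapter || PersistenceLayer::ActiveFedoraAdapter
    end
  end

  # Yields the configuration when a block is given, and always returns it.
  def self.config
    @config ||= Configuration.new
    yield @config if block_given?
    @config
  end
end

# Before any explicit choice, the sketch falls back to its default adapter.
default_adapter = SketchIiifPrint.config.persistence_adapter

# An application initializer could switch adapters like this.
SketchIiifPrint.config do |config|
  config.persistence_adapter = SketchIiifPrint::PersistenceLayer::ValkyrieAdapter
end

chosen_adapter = SketchIiifPrint.config.persistence_adapter
```

In a real application the equivalent assignment would presumably live in an initializer and use the adapters under `IiifPrint::PersistenceLayer`.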
2 changes: 1 addition & 1 deletion app/actors/iiif_print/actors/file_set_actor_decorator.rb
@@ -48,7 +48,7 @@ def destroy
         # we destroy the children before the file_set, because we need the parent relationship
         IiifPrint::SplitPdfs::DestroyPdfChildWorksService.conditionally_destroy_spawned_children_of(
           file_set: file_set,
-          work: file_set.parent
+          work: IiifPrint.parent_for(file_set)
         )
         # and now back to your regularly scheduled programming
         super
2 changes: 1 addition & 1 deletion app/models/concerns/iiif_print/solr/document.rb
@@ -39,7 +39,7 @@ def digest_sha1
 
       def method_missing(method_name, *args, &block)
         super unless iiif_print_solr_field_names.include? method_name.to_s
-        self[::ActiveFedora.index_field_mapper.solr_name(method_name.to_s)]
+        self[IiifPrint.solr_name(method_name.to_s)]
       end
 
       def respond_to_missing?(method_name, include_private = false)
@@ -142,7 +142,7 @@ def get_solr_hits(ids)
         results = []
         ids.each_slice(SOLR_QUERY_PAGE_SIZE) do |paged_ids|
           query = "id:(#{paged_ids.join(' OR ')})"
-          results += ActiveFedora::SolrService.query(
+          results += IiifPrint.solr_query(
             query,
             { fq: "-has_model_ssim:FileSet", rows: paged_ids.size, method: :post }
           )
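
The paging in `get_solr_hits` can be sketched in isolation. The block below only builds the query strings and does not talk to Solr; `SKETCH_PAGE_SIZE` and `paged_id_queries` are hypothetical names, and the page size of 2 is chosen just to keep the example short.

```ruby
# Hypothetical page size; the real constant is SOLR_QUERY_PAGE_SIZE.
SKETCH_PAGE_SIZE = 2

# Split the ids into pages and build one "id:(a OR b)" query per page.
def paged_id_queries(ids, page_size: SKETCH_PAGE_SIZE)
  ids.each_slice(page_size).map do |paged_ids|
    "id:(#{paged_ids.join(' OR ')})"
  end
end

queries = paged_id_queries(%w[abc def ghi])
```

Each resulting string would be passed to the adapter's `solr_query` together with the `fq`, `rows`, and `method` options shown in the hunk.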
@@ -0,0 +1,32 @@
+module Hyrax
+  module Transactions
+    ##
+    # This decorator does the following:
+    #
+    # - Prepend the {Steps::ConditionallyDestroyChildrenFromSplit} step to the "file_set.destroy"
+    #   transaction. This mirrors the behavior of
+    #   {IiifPrint::Actors::FileSetActorDecorator#destroy}
+    #
+    # For more information about adjusting transactions, see
+    # [Transitioning workshop solution for adding transaction](https://github.com/samvera-labs/transitioning-to-valkyrie-workshop/commit/bcab2bb8f65078e88395c68f72be00e7ffad57ec)
+    #
+    # @see https://github.com/samvera/hyrax/blob/f875d61dc87229cf1f05eb2bb6d414b5ef314616/lib/hyrax/transactions/container.rb
+    class IiifPrintContainerDecorator
+      extend Dry::Container::Mixin
+
+      namespace 'file_set' do |ops|
+        ops.register 'iiif_print_conditionally_destroy_spawned_children' do
+          Steps::ConditionallyDestroyChildrenFromSplit.new
+        end
+        ops.register 'destroy' do
+          Hyrax::Transactions::FileSetDestroy.new(
+            steps: (['file_set.iiif_print_conditionally_destroy_spawned_children'] +
+                    Hyrax::Transactions::FileSetDestroy::DEFAULT_STEPS)
+          )
+        end
+      end
+    end
+  end
+end
+
+Hyrax::Transactions::Container.merge(Hyrax::Transactions::IiifPrintContainerDecorator)
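
The idea of prepending a step to an existing list of transaction steps can be shown without dry-container or Hyrax. The following is a rough pure-Ruby sketch under that assumption: `SketchResult`, `SKETCH_REGISTRY`, `SKETCH_DEFAULT_STEPS`, `run_steps`, and the `sketch.*` step names are all hypothetical and are not Hyrax's real steps.

```ruby
# Minimal success/failure value, standing in for a monadic result.
SketchResult = Struct.new(:success, :value) do
  def success?
    success
  end
end

# Hypothetical step registry: each step takes a value and returns a SketchResult.
SKETCH_REGISTRY = {
  "sketch.destroy_children" => ->(log) { SketchResult.new(true, log + [:children_destroyed]) },
  "sketch.delete"           => ->(log) { SketchResult.new(true, log + [:deleted]) },
  "sketch.not_persisted"    => ->(_log) { SketchResult.new(false, :resource_not_persisted) }
}.freeze

SKETCH_DEFAULT_STEPS = ["sketch.delete"].freeze

# Run the steps in order; the first failure stops the run and is returned.
def run_steps(step_names, value)
  step_names.reduce(SketchResult.new(true, value)) do |result, name|
    break result unless result.success?
    SKETCH_REGISTRY.fetch(name).call(result.value)
  end
end

# Prepending a step means it runs before every default step.
outcome = run_steps(["sketch.destroy_children"] + SKETCH_DEFAULT_STEPS, [])
failed  = run_steps(["sketch.not_persisted"] + SKETCH_DEFAULT_STEPS, [])
```

Because each step receives the value of the previous success, the order of the array decides what later steps can still see, which is why the child cleanup goes first.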
@@ -0,0 +1,31 @@
+module Hyrax
+  module Transactions
+    module Steps
+      ##
+      # For a FileSet that is a PDF, we need to delete any works and file_sets that are the result of
+      # splitting that PDF into constituent images of each page of the PDF.  This is responsible for
+      # that work.
+      class ConditionallyDestroyChildrenFromSplit
+        include Dry::Monads[:result]
+
+        ##
+        # @param resource [Hyrax::FileSet]
+        def call(resource)
+          return Failure(:resource_not_persisted) unless resource.persisted?
+
+          parent = IiifPrint.persistence_adapter.parent_for(resource)
+          return Success(true) unless parent
+
+          # We do not care about the results of this call; as it is conditionally looking for things
+          # to destroy.
+          IiifPrint::SplitPdfs::DestroyPdfChildWorksService.conditionally_destroy_spawned_children_of(
+            file_set: resource,
+            work: parent
+          )
+
+          Success(true)
+        end
+      end
+    end
+  end
+end
45 changes: 15 additions & 30 deletions lib/iiif_print.rb
@@ -44,37 +44,22 @@ def self.config(&block)
   end
 
   class << self
-    delegate :skip_splitting_pdf_files_that_end_with_these_texts, to: :config
-  end
-
-  ##
-  # Return the immediate parent of the given :file_set.
-  #
-  # @param file_set [FileSet]
-  # @return [#work?, Hydra::PCDM::Work]
-  # @return [NilClass] when no parent is found.
-  def self.parent_for(file_set)
-    # fallback to Fedora-stored relationships if work's aggregation of
-    #   file set is not indexed in Solr
-    file_set.parent || file_set.member_of.find(&:work?)
-  end
+    delegate(
+      :persistence_adapter,
+      :skip_splitting_pdf_files_that_end_with_these_texts,
+      to: :config
+    )
 
-  ##
-  # Return the parent's parent of the given :file_set.
-  #
-  # @param file_set [FileSet]
-  # @return [#work?, Hydra::PCDM::Work]
-  # @return [NilClass] when no grand parent is found.
-  def self.grandparent_for(file_set)
-    parent_of_file_set = parent_for(file_set)
-    # HACK: This is an assumption about the file_set structure, namely that an image page split from
-    # a PDF is part of a file set that is a child of a work that is a child of a single work.  That
-    # is, it only has one grand parent.  Which is a reasonable assumption for IIIF Print but is not
-    # valid when extended beyond IIIF Print.  That is GenericWork does not have a parent method but
-    # does have a parents method.
-    parent_of_file_set.try(:parent_works).try(:first) ||
-      parent_of_file_set.try(:parents).try(:first) ||
-      parent_of_file_set&.member_of&.find(&:work?)
+    delegate(
+      :clean_for_tests!,
+      :destroy_children_split_from,
+      :grandparent_for,
+      :solr_construct_query,
+      :solr_name,
+      :solr_query,
+      :parent_for,
+      to: :persistence_adapter
+    )
   end
 
   def self.use_valkyrie?(obj)
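
The two-level delegation in this hunk (module to config, module to adapter) uses ActiveSupport's `delegate`. A comparable shape can be sketched with Ruby's standard `Forwardable`, which is a swapped-in technique here so that the block runs without Rails. `SketchAdapter`, `SketchFacade`, and the `_tesim` suffix are hypothetical.

```ruby
require "forwardable"

# Hypothetical adapter standing in for a persistence-layer adapter.
class SketchAdapter
  def self.solr_name(field_name)
    "#{field_name}_tesim" # illustrative suffix only
  end

  def self.parent_for(file_set)
    file_set[:parent]
  end
end

module SketchFacade
  class << self
    extend Forwardable

    attr_writer :persistence_adapter

    def persistence_adapter
      @persistence_adapter || SketchAdapter
    end

    # Forward module-level calls to whichever adapter is currently configured.
    def_delegators :persistence_adapter, :solr_name, :parent_for
  end
end

solr_field = SketchFacade.solr_name(:title)
parent     = SketchFacade.parent_for({ parent: "work-1" })
```

Callers keep using one module-level entry point while the adapter behind it can be swapped.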
4 changes: 2 additions & 2 deletions lib/iiif_print/catalog_search_builder.rb
@@ -25,9 +25,9 @@ class CatalogSearchBuilder < Hyrax::CatalogSearchBuilder
     # rubocop:enable Naming/PredicateName
     def show_parents_only(solr_parameters)
       query = if blacklight_params["include_child_works"] == 'true'
-                ActiveFedora::SolrQueryBuilder.construct_query(is_child_bsi: 'true')
+                IiifPrint.solr_construct_query(is_child_bsi: 'true')
               else
-                ActiveFedora::SolrQueryBuilder.construct_query(is_child_bsi: nil)
+                IiifPrint.solr_construct_query(is_child_bsi: nil)
               end
       solr_parameters[:fq] += [query]
     end
16 changes: 16 additions & 0 deletions lib/iiif_print/configuration.rb
@@ -3,6 +3,22 @@ module IiifPrint
   class Configuration
     attr_writer :after_create_fileset_handler
 
+    attr_writer :persistence_adapter
+    def persistence_adapter
+      @persistence_adapter || default_persistence_adapter
+    end
+
+    def default_persistence_adapter
+      # There's probably some configuration of Hyrax we could use to better refine this; but it's
+      # likely a reasonable guess.  The main goal is to not break existing implementations and
+      # maintain an upgrade path.
+      if Gem::Version.new(Hyrax::VERSION) >= Gem::Version.new('6.0.0')
+        IiifPrint::PersistenceLayer::ValkyrieAdapter
+      else
+        IiifPrint::PersistenceLayer::ActiveFedoraAdapter
+      end
+    end
+
     # @param file_set [FileSet]
     # @param user [User]
     def handle_after_create_fileset(file_set, user)
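
The version check in `default_persistence_adapter` relies on `Gem::Version`, which compares version segments numerically rather than as strings. A small sketch of that comparison follows; `default_adapter_name` and the symbols it returns are hypothetical stand-ins for the adapter classes.

```ruby
require "rubygems"

# Pick an adapter name from a Hyrax version string (symbols stand in for the classes).
def default_adapter_name(hyrax_version)
  if Gem::Version.new(hyrax_version) >= Gem::Version.new("6.0.0")
    :valkyrie
  else
    :active_fedora
  end
end

older      = default_adapter_name("5.0.1")
exact      = default_adapter_name("6.0.0")
double     = default_adapter_name("10.0.0")
prerelease = default_adapter_name("6.0.0.rc1")
```

Two consequences are worth noting: a plain string comparison would sort "10.0.0" before "6.0.0", which `Gem::Version` avoids, and a prerelease such as "6.0.0.rc1" sorts before "6.0.0", so it would still select the ActiveFedora branch.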
2 changes: 1 addition & 1 deletion lib/iiif_print/data/work_derivatives.rb
@@ -239,7 +239,7 @@ def primary_file_path
           # of the first assigned file path for single-file work.
           work_file = parent
           return if work_file.nil?
-          work_files = work_file.parent
+          work_files = IiifPrint.parent_for(work_file)
           return if work_files.nil?
           work_files.assigned[0]
         else
9 changes: 9 additions & 0 deletions lib/iiif_print/engine.rb
@@ -11,6 +11,15 @@ module IiifPrint
   class Engine < ::Rails::Engine
     isolate_namespace IiifPrint
 
+    config.eager_load_paths += %W[#{config.root}/app/transactions]
+
+    initializer 'requires' do
+      require 'hyrax/transactions/iiif_print_container_decorator'
+      require 'iiif_print/persistence_layer'
+      require 'iiif_print/persistence_layer/active_fedora_adapter' if defined?(ActiveFedora)
+      require 'iiif_print/persistence_layer/valkyrie_adapter' if defined?(Valkyrie)
+    end
+
     # rubocop:disable Metrics/BlockLength
     config.to_prepare do
       require "iiif_print/jobs/create_relationships_job"
4 changes: 2 additions & 2 deletions lib/iiif_print/homepage_search_builder.rb
@@ -7,9 +7,9 @@ class HomepageSearchBuilder < Hyrax::HomepageSearchBuilder
 
     def show_parents_only(solr_parameters)
       query = if blacklight_params["include_child_works"] == 'true'
-                ActiveFedora::SolrQueryBuilder.construct_query(is_child_bsi: 'true')
+                IiifPrint.solr_construct_query(is_child_bsi: 'true')
               else
-                ActiveFedora::SolrQueryBuilder.construct_query(is_child_bsi: nil)
+                IiifPrint.solr_construct_query(is_child_bsi: nil)
               end
       solr_parameters[:fq] += [query]
     end
58 changes: 58 additions & 0 deletions lib/iiif_print/persistence_layer.rb
@@ -0,0 +1,58 @@
+module IiifPrint
+  ##
+  # The PersistenceLayer module provides the namespace for other adapters:
+  #
+  # - {IiifPrint::PersistenceLayer::ActiveFedoraAdapter}
+  # - {IiifPrint::PersistenceLayer::ValkyrieAdapter}
+  #
+  # And the defining interface in the {IiifPrint::PersistenceLayer::AbstractAdapter}
+  module PersistenceLayer
+    # @abstract
+    class AbstractAdapter
+      ##
+      # @param file_set [Object]
+      # @param work [Object]
+      # @param model [Class] The class name for which we'll split children.
+      def self.destroy_children_split_from(file_set:, work:, model:)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+
+      ##
+      # @abstract
+      def self.parent_for(*)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+
+      ##
+      # @abstract
+      def self.grandparent_for(*)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+
+      ##
+      # @abstract
+      def self.solr_construct_query(*)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+
+      ##
+      # @abstract
+      def self.clean_for_tests!
+        return false unless Rails.env.test?
+        yield
+      end
+
+      ##
+      # @abstract
+      def self.solr_query(*args)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+
+      ##
+      # @abstract
+      def self.solr_name(*args)
+        raise NotImplementedError, "#{self}.#{__method__}"
+      end
+    end
+  end
+end
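
The abstract-adapter pattern can be exercised with a tiny stand-in. This is a sketch under the assumption that a concrete adapter only needs to override the class methods it supports; `SketchAbstractAdapter`, `SketchInMemoryAdapter`, and `SketchIncompleteAdapter` are hypothetical and not part of the gem.

```ruby
# Hypothetical abstract base: each method names itself in the error message.
class SketchAbstractAdapter
  def self.parent_for(*)
    raise NotImplementedError, "#{self}.#{__method__}"
  end
end

# A concrete adapter overrides the method with a real lookup.
class SketchInMemoryAdapter < SketchAbstractAdapter
  PARENTS = { "fs-1" => "work-1" }.freeze

  def self.parent_for(file_set_id)
    PARENTS[file_set_id]
  end
end

# An adapter that forgets to override inherits the raising implementation.
class SketchIncompleteAdapter < SketchAbstractAdapter; end

found = SketchInMemoryAdapter.parent_for("fs-1")

message =
  begin
    SketchIncompleteAdapter.parent_for("fs-1")
  rescue NotImplementedError => e
    # NotImplementedError is not a StandardError, so it must be rescued by name.
    e.message
  end
```

Interpolating both `self` and `__method__` makes the message point at the concrete subclass and the method that still needs an implementation.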
83 changes: 83 additions & 0 deletions lib/iiif_print/persistence_layer/active_fedora_adapter.rb
@@ -0,0 +1,83 @@
+module IiifPrint
+  module PersistenceLayer
+    class ActiveFedoraAdapter < AbstractAdapter
+      ##
+      # Return the immediate parent of the given :file_set.
+      #
+      # @param file_set [FileSet]
+      # @return [#work?, Hydra::PCDM::Work]
+      # @return [NilClass] when no parent is found.
+      def self.parent_for(file_set)
+        # fallback to Fedora-stored relationships if work's aggregation of
+        #   file set is not indexed in Solr
+        file_set.parent || file_set.member_of.find(&:work?)
+      end
+
+      ##
+      # Return the parent's parent of the given :file_set.
+      #
+      # @param file_set [FileSet]
+      # @return [#work?, Hydra::PCDM::Work]
+      # @return [NilClass] when no grandparent is found.
+      def self.grandparent_for(file_set)
+        parent_of_file_set = parent_for(file_set)
+        # HACK: This is an assumption about the file_set structure, namely that an image page split from
+        # a PDF is part of a file set that is a child of a work that is a child of a single work.  That
+        # is, it only has one grandparent.  Which is a reasonable assumption for IIIF Print but is not
+        # valid when extended beyond IIIF Print.  That is GenericWork does not have a parent method but
+        # does have a parents method.
+        parent_of_file_set.try(:parent_works).try(:first) ||
+          parent_of_file_set.try(:parents).try(:first) ||
+          parent_of_file_set&.member_of&.find(&:work?)
+      end
+
+      def self.solr_construct_query(*args)
+        if defined?(Hyrax::SolrQueryBuilderService)
+          Hyrax::SolrQueryBuilderService.construct_query(*args)
+        else
+          ActiveFedora::SolrQueryBuilder.construct_query(*args)
+        end
+      end
+
+      def self.clean_for_tests!
+        super do
+          ActiveFedora::Cleaner.clean!
+        end
+      end
+
+      def self.solr_query(*args)
+        if defined?(Hyrax::SolrService)
+          Hyrax::SolrService.query(*args)
+        else
+          ActiveFedora::SolrService.query(*args)
+        end
+      end
+
+      def self.solr_name(field_name)
+        if defined?(Hyrax) && Hyrax.config.respond_to?(:index_field_mapper)
+          Hyrax.config.index_field_mapper.solr_name(field_name.to_s)
+        else
+          ::ActiveFedora.index_field_mapper.solr_name(field_name.to_s)
+        end
+      end
+
+      ##
+      # @param file_set [Object]
+      # @param work [Object]
+      # @param model [Class] The class name for which we'll split children.
+      def self.destroy_children_split_from(file_set:, work:, model:)
+        # look first for children by the file set id they were split from
+        children = model.where(split_from_pdf_id: file_set.id)
+        if children.blank?
+          # find works where file name and work `to_param` are both in the title
+          children = model.where(title: file_set.label).where(title: work.to_param)
+        end
+        return if children.blank?
+        children.each do |rcd|
+          rcd.destroy(eradicate: true)
+        end
+        true
+      end
+    end
+  end
+end