Skip to content

Commit

Permalink
♻️ Preparing Splitter Service Swap for DerivativeRodeo
Browse files Browse the repository at this point in the history
This commit is a refactor in-place.  The primary goal is to allow for
passing the file_set to the child works; something that is ideal for the
derivative rodeo's interface.

This is intended to be a swap-in-place change.  That is to say, if we
deploy this change and have already enqueued jobs, nothing will fail nor
break.  The past enqueued jobs (with a work) will use the work based
logic but future enqueueings will use file_set.

In using the file_set, we also avoid the issue of having passing a `nil`
as the parent, and thus creating an infinite rescheduling cycle.

Related to:

- #220
  • Loading branch information
jeremyf committed May 30, 2023
1 parent 3b92d13 commit 70b17e1
Show file tree
Hide file tree
Showing 6 changed files with 47 additions and 21 deletions.
8 changes: 1 addition & 7 deletions app/services/iiif_print/pluggable_derivative_service.rb
Original file line number Diff line number Diff line change
Expand Up @@ -105,16 +105,10 @@ def impending_derivative?(name)
# set would use. That "possibility" is based on the work. Later, we will check the plugin's
# "valid?" which would now look at the specific file_set for validity.
def plugins_for(file_set)
parent = parent_for(file_set)
parent = IiifPrint.parent_for(file_set)
return Array(default_plugins) if parent.nil?
return Array(default_plugins) unless parent.respond_to?(:iiif_print_config)

(parent.iiif_print_config.derivative_service_plugins + Array(default_plugins)).flatten.compact.uniq
end

def parent_for(file_set)
# fallback to Fedora-stored relationships if work's aggregation of
# file set is not indexed in Solr
file_set.parent || file_set.member_of.find(&:work?)
end
end
9 changes: 9 additions & 0 deletions lib/iiif_print.rb
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,15 @@ def self.config(&block)
@config
end

##
# @param file_set [FileSet]
# @return [#work?, Hydra::PCDM::Work]
def self.parent_for(file_set)
# fallback to Fedora-stored relationships if work's aggregation of
# file set is not indexed in Solr
file_set.parent || file_set.member_of.find(&:work?)
end

DEFAULT_MODEL_CONFIGURATION = {
# Split a PDF into individual page images and create a new child work for each image.
pdf_splitter_job: IiifPrint::Jobs::ChildWorksFromPdfJob,
Expand Down
23 changes: 17 additions & 6 deletions lib/iiif_print/jobs/child_works_from_pdf_job.rb
Original file line number Diff line number Diff line change
@@ -1,15 +1,25 @@
module IiifPrint
module Jobs
# @deprecated
class ChildWorksFromPdfJob < IiifPrint::Jobs::ApplicationJob
##
# Break a pdf into individual pages
# @param parent_work
#
# @param candidate_for_parency [FileSet, Hydra::PCDM::Work]
# @param pdf_paths: [<Array => String>] paths to pdfs
# @param user: [User]
# @param admin_set_id: [<String>]
#
# @todo Deprecate the _count parameter; it was once used but not necessary.
def perform(parent_work, pdf_paths, user, admin_set_id, _count)
@parent_work = parent_work
# rubocop:disable Metrics/MethodLength
def perform(candidate_for_parency, pdf_paths, user, admin_set_id, *)
##
# We know that we have cases where parent_work is nil, this will definitely raise an
# exception; which is fine because we were going to do it later anyway.
@parent_work = if candidate_for_parency.work?
candidate_for_parency
else
# We likely have a file set
IiifPrint.parent_for(candidate_for_parency)
end
@child_admin_set_id = admin_set_id
child_model = @parent_work.iiif_print_config.pdf_split_child_model

Expand All @@ -32,12 +42,13 @@ def perform(parent_work, pdf_paths, user, admin_set_id, _count)

# TODO: clean up image_files and pdf_paths
end
# rubocop:enable Metrics/MethodLength

private

# rubocop:disable Metrics/ParameterLists
def split_pdf(original_pdf_path, user, child_model)
image_files = @parent_work.iiif_print_config.pdf_splitter_service.new(original_pdf_path).to_a
image_files = @parent_work.iiif_print_config.pdf_splitter_service.call(original_pdf_path)
return if image_files.blank?

prepare_import_data(original_pdf_path, image_files, user)
Expand Down
16 changes: 15 additions & 1 deletion lib/iiif_print/split_pdfs/base_splitter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,22 @@ module SplitPdfs
#
# The purpose of this class is to split the PDF into constituent image files.
#
# @see #each
# @see .call
class BaseSplitter
##
# @api public
#
# @param path [String] local path to the PDF that we will split.
# @return [Enumerable]
#
# @see #each
#
# @note We're including the ** args to provide method conformity; other services require
# additional information (such as the FileSet)
def self.call(path, **)
new(path).to_a
end

class_attribute :image_extension
class_attribute :compression, default: nil
class_attribute :quality, default: nil
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ class ChildWorkCreationFromPdfService
# @return [TrueClass] when we actually enqueue the job underlying job.
# rubocop:disable Metrics/MethodLength
def self.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: nil)
work ||= parent_for(file_set: file_set)
work ||= IiifPrint.parent_for(file_set)

return :no_split_for_parent unless iiif_print_split?(work: work)
return :no_pdfs_for_import_url if import_url && !pdfs?(paths: [import_url])
Expand All @@ -35,7 +35,7 @@ def self.conditionally_enqueue(file_set:, file:, user:, import_url: nil, work: n
return :no_pdfs if file_locations.empty?

work.iiif_print_config.pdf_splitter_job.perform_later(
work,
file_set,
file_locations,
user,
admin_set_id,
Expand Down Expand Up @@ -114,11 +114,6 @@ def self.pdfs_only_for(paths)

##
# @api private
def self.parent_for(file_set:)
# fallback to Fedora-stored relationships if work's aggregation of
# file set is not indexed in Solr
file_set.parent || file_set.member_of.find(&:work?)
end
end
end
end
3 changes: 3 additions & 0 deletions spec/iiif_print/split_pdfs/base_splitter_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,9 @@
RSpec.describe IiifPrint::SplitPdfs::BaseSplitter do
let(:path) { __FILE__ }
let(:splitter) { described_class.new(path) }
subject { described_class }

it { is_expected.to respond_to(:call) }

describe "instance" do
subject { splitter }
Expand Down

0 comments on commit 70b17e1

Please sign in to comment.