Add PDF page limit for IiifPrint splitter, PdfPage as pdf_split_child_model for BookContribution (#545)
cziaarm authored May 22, 2024
1 parent 90202b7 commit f83cc25
Showing 2 changed files with 42 additions and 1 deletion.
2 changes: 1 addition & 1 deletion app/models/book_contribution.rb
@@ -14,7 +14,7 @@ class BookContribution < ActiveFedora::Base
# Adds behaviors for DataCite DOIs via hyrax-doi plugin.
include Hyrax::DOI::DataCiteDOIBehavior
include IiifPrint.model_configuration(
- pdf_split_child_model: self
+ pdf_split_child_model: PdfPage
)

self.indexer = BookContributionIndexer
41 changes: 41 additions & 0 deletions config/initializers/iiif_print.rb
@@ -1,3 +1,15 @@
# Override IiifPrint::Configuration to allow a config item to limit splitting PDFs by page count (IiifPrint 1.0.0 8fdf56e)
IiifPrint::Configuration.class_eval do
  attr_writer :split_pdf_page_limit

  # @api private
  # @todo To move this to an `@api public` state, we need to consider what a proper configuration looks like.
  # @return [Integer] page limit for splitting; a PDF is only split when its page count is below this value (default 100)
  def split_pdf_page_limit
    @split_pdf_page_limit ||= 100
  end
end

IiifPrint.config do |config|
# NOTE: WorkTypes and models are used synonymously here.
# Add models to be excluded from search so the user
@@ -54,6 +66,9 @@
add_info: {},
collection: {}
}

config.split_pdf_page_limit = 100

end
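
The configuration override above follows a common pattern: reopen a class with `class_eval`, add a writer, and give the reader a memoized default. A minimal sketch of that pattern, using a hypothetical stand-in class (`SketchConfiguration` is not part of IiifPrint) so it runs without Rails or the gem:

```ruby
# Hypothetical stand-in for IiifPrint::Configuration, for illustration only.
class SketchConfiguration; end

SketchConfiguration.class_eval do
  attr_writer :split_pdf_page_limit

  # Fall back to 100 when no limit has been assigned.
  def split_pdf_page_limit
    @split_pdf_page_limit ||= 100
  end
end

config = SketchConfiguration.new
default_limit = config.split_pdf_page_limit   # falls back to the default

config.split_pdf_page_limit = 250
custom_limit = config.split_pdf_page_limit    # returns the assigned value
```

One consequence of `||=` worth keeping in mind: assigning `nil` or `false` does not disable the limit, because the reader falls back to 100 again.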

# Override Hyrax::WorkShowPresenter.authorized_item_ids to disallow "Pdf Page" work type from showing as members
@@ -71,3 +86,29 @@ def authorized_item_ids
end
end

# Override IiifPrint::SplitPdfs::ChildWorkCreationFromPdfService (IiifPrint 1.0.0 8fdf56e)
# IiifPrint rendering does not cope well with PDFs that have many pages,
# so enforce a page limit: IiifPrint only splits a PDF into child works
# with an image for each page when its page count is below the limit.
# pagecount duplicates the logic in IiifPrint::SplitPdfs::BaseSplitter.
require 'open3'

IiifPrint::SplitPdfs::ChildWorkCreationFromPdfService.class_eval do
  PAGE_COUNT_REGEXP = %r{^Pages: +(\d+)$}.freeze

  def self.pagecount(pdfpath)
    # Default to a value that will avoid
    # IiifPrint splitting from happening
    pagecount = IiifPrint.config.split_pdf_page_limit + 1
    # Pass the path as a separate argument so that spaces or shell
    # metacharacters in the file name cannot break the command
    Open3.popen3('pdfinfo', pdfpath.to_s) do |_stdin, stdout, _stderr, _wait_thr|
      match = PAGE_COUNT_REGEXP.match(stdout.read)
      # Keep the default when pdfinfo prints no "Pages:" line
      pagecount = match[1].to_i if match
    end
    pagecount
  end

  def self.pdfs_only_for(paths)
    paths.select { |path| path.end_with?('.pdf', '.PDF') && pagecount(path) < IiifPrint.config.split_pdf_page_limit }
  end
end
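
The filtering logic in this override can be exercised without IiifPrint or `pdfinfo` installed. The following is only a sketch under stated assumptions: `PageLimitSketch`, `page_count_from`, and `splittable` are hypothetical names that are not part of IiifPrint or this commit, and a hash of page counts stands in for the `pdfinfo` call. It shows the fallback when no `Pages:` line is found and the strict `<` comparison against the limit.

```ruby
# Hypothetical stand-alone sketch; none of these names exist in IiifPrint.
module PageLimitSketch
  PAGE_COUNT_REGEXP = /^Pages: +(\d+)$/.freeze

  # Extract the page count from text shaped like `pdfinfo` output.
  # Returns `default` when no "Pages:" line is present.
  def self.page_count_from(pdfinfo_output, default)
    match = PAGE_COUNT_REGEXP.match(pdfinfo_output.to_s)
    match ? match[1].to_i : default
  end

  # Select the PDF paths whose page count is strictly below `limit`.
  # `page_counts` maps path => page count and stands in for running pdfinfo;
  # a path with no known count is treated as over the limit.
  def self.splittable(paths, page_counts, limit)
    paths.select do |path|
      path.end_with?('.pdf', '.PDF') && page_counts.fetch(path, limit + 1) < limit
    end
  end
end

sample_output = "Title:          Example\nPages:          12\nEncrypted:      no\n"
parsed = PageLimitSketch.page_count_from(sample_output, 101)

counts = { 'a.pdf' => 99, 'b.PDF' => 100, 'c.txt' => 1 }
selected = PageLimitSketch.splittable(['a.pdf', 'b.PDF', 'c.txt', 'd.pdf'], counts, 100)
```

With a limit of 100, a 99-page PDF is selected while a 100-page PDF is not, because the comparison is strict. A file whose page count cannot be determined is also skipped.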