🎁 Adjust Hyrax::DerivativeService.services to Leverage Derivative::Rodeo #219

Open
3 tasks
Tracked by #421 ...
jeremyf opened this issue Apr 14, 2023 · 0 comments

jeremyf commented Apr 14, 2023

From scientist-softserv/adventist_knapsack#406, building on #218.

Example: Derivatives were pre-processed by the Derivative::Rodeo
Given a <FileSet> (and therefore a Parent Identifier and Original File Name)
And the Derivative::Rodeo has processed the given <FileSet>
When we look for a valid Hyrax::DerivativeService to delegate create_derivatives
Then we should provide a service that can use the files from the rodeo
Example: Derivatives were not pre-processed by the Derivative::Rodeo
Given a <FileSet> (and therefore a Parent Identifier and Original File Name)
And the Derivative::Rodeo has not processed the given <FileSet>
When we look for a valid Hyrax::DerivativeService to delegate create_derivatives
Then we should skip the service that can use the files from the rodeo
  • Create a new IiifPrint::DerivativeRodeoService class that conforms to the Hyrax::DerivativeService interface (e.g. #valid?, #create_derivatives, etc).
  • The new service’s #valid? method will return true when the associated FileSet has files in the rodeo.
  • The new service’s create_derivatives method will take the files in the rodeo and add them as derivatives to the FileSet.
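
As a rough illustration of the three tasks above, here is a minimal sketch of what such a service could look like. It is not the implementation: `SketchFileSet`, `SketchRodeoStorage`, the `storage` accessor, and the file path are all hypothetical stand-ins so that the example runs without Hyrax or the rodeo, and the real pathing convention for rodeo output is not assumed here.

```ruby
# Hypothetical stand-in for a Hyrax FileSet. It carries only what the
# scenarios mention: a parent identifier and an original file name.
SketchFileSet = Struct.new(:parent_identifier, :original_file_name, :derivatives)

# Hypothetical stand-in for wherever the rodeo writes its output. Keys are
# [parent_identifier, original_file_name]; values are lists of file paths.
class SketchRodeoStorage
  def initialize(files = {})
    @files = files
  end

  def files_for(file_set)
    @files.fetch([file_set.parent_identifier, file_set.original_file_name], [])
  end
end

module IiifPrint
  # Sketch of the proposed service. Method names follow the
  # Hyrax::DerivativeService interface (#valid?, #create_derivatives,
  # #cleanup_derivatives); the bodies are assumptions.
  class DerivativeRodeoService
    class << self
      attr_accessor :storage # hypothetical configuration point
    end
    self.storage = SketchRodeoStorage.new

    attr_reader :file_set

    def initialize(file_set)
      @file_set = file_set
    end

    # True only when the rodeo already holds files for this FileSet.
    def valid?
      rodeo_files.any?
    end

    # Attach the pre-processed files instead of generating new ones.
    # The filename argument is accepted to match the interface but unused.
    def create_derivatives(_filename)
      file_set.derivatives.concat(rodeo_files)
    end

    def cleanup_derivatives
      file_set.derivatives.clear
    end

    private

    def rodeo_files
      self.class.storage.files_for(file_set)
    end
  end
end

# Usage: one FileSet the rodeo has processed, one it has not.
IiifPrint::DerivativeRodeoService.storage = SketchRodeoStorage.new(
  { ["work-1", "paper.pdf"] => ["work-1/paper/page-1.tiff"] } # made-up path
)
processed = SketchFileSet.new("work-1", "paper.pdf", [])
unprocessed = SketchFileSet.new("work-2", "other.pdf", [])

service = IiifPrint::DerivativeRodeoService.new(processed)
service.create_derivatives("paper.pdf") if service.valid?
```

Keeping the lookup behind a single private method means that #valid? and #create_derivatives agree on what "has files in the rodeo" means.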

Discussion

As of <2023-04-14 Fri>, the IIIF Print gem is responsible for some of the derivative generation, and the remaining derivatives are generated via Hyrax’s Hyrax::FileSetDerivativesService. This is orchestrated by configuring Hyrax::DerivativeService.services. That configuration is done in IIIF Print’s engine configuration and in the implementation logic of the IiifPrint::PluggableDerivativeService.
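
To make the orchestration concrete, here is a small sketch of how an ordered list of services can drive the two scenarios in the issue body. It assumes the lookup instantiates each configured service in order and delegates to the first one whose #valid? returns true, which is my understanding of how Hyrax resolves a service. Every class name below is a hypothetical stand-in, not Hyrax or IIIF Print code, and a plain Hash stands in for the FileSet.

```ruby
# Hypothetical registry that mimics a first-valid-wins lookup.
class SketchDerivativeService
  class << self
    attr_accessor :services

    # Build each configured service for the file set and pick the first
    # valid one; fall back to this base class when none is valid.
    def for(file_set)
      services.map { |service| service.new(file_set) }.find(&:valid?) || new(file_set)
    end
  end
  self.services = []

  attr_reader :file_set

  def initialize(file_set)
    @file_set = file_set
  end

  def valid?
    true
  end
end

# Stand-in for the rodeo-backed service: valid only when the rodeo has
# already processed the file set.
class SketchRodeoService < SketchDerivativeService
  def valid?
    file_set.fetch(:in_rodeo, false)
  end
end

# Stand-in for the default service that generates derivatives itself.
class SketchDefaultService < SketchDerivativeService; end

# Order matters: the rodeo service is consulted before the default.
SketchDerivativeService.services = [SketchRodeoService, SketchDefaultService]

chosen_when_processed = SketchDerivativeService.for({ in_rodeo: true })
chosen_when_unprocessed = SketchDerivativeService.for({ in_rodeo: false })
```

Because the rodeo service reports itself as invalid for unprocessed file sets, the lookup skips it and existing behavior is preserved without a separate feature flag.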

By design, the Derivative::Rodeo is envisioned as being able to find and/or generate all of the desired derivatives. As of <2023-04-14 Fri> we’re focusing derivative generation on PDFs, in part because that’s the larger group of assets we have for the Adventist ingest.

Those derivatives are:

  • Splitting PDF pages so that there’s one image per page
  • Creating derivatives of each of those per-page images
  • Perhaps more, but for now this is the focal concern

The default Hyrax service creates the derivatives and assigns them, all as part of a single derivative process. However, when the rodeo service is valid (that is, the rodeo has already produced the files), we will need to know which derivatives to associate with the FileSet.

Note: the current IiifPrint::PluggableDerivativeService may no longer make sense as implemented. However, given that we have implementations that rely on the existing IiifPrint::PluggableDerivativeService, we should create a separate derivative service that we can develop independently and that can eventually replace the PluggableDerivativeService in the default configuration.

@jeremyf jeremyf changed the title from "Adjust Hyrax::DerivativeService.services to Leverage Derivative::Rodeo" to "🎁 Adjust Hyrax::DerivativeService.services to Leverage Derivative::Rodeo" May 22, 2023
jeremyf added a commit that referenced this issue May 24, 2023
Why as a development dependency?  Because the DerivativeRodeo introduces
a dependency on Faraday >= 1.  And the Valkyrie and ActiveFedora
versions which Hyrax 2 and 3 depend on have a Faraday dependency of < 1.

I am pushing this up so that I can begin development on the ingest
aspect of the Derivative Rodeo.  Also to see how this resolves in our CI
setup and to see the impact, if any, on downstream implementations of
IIIF Print (e.g. Adventist, British Library, ATLA, PALNI/PALCI, UTK, and
others).

The plan is to determine if we want to have this Faraday conflict setup
or if we want to swap out something else in the underlying
DerivativeRodeo.

Related to:

- https://github.com/scientist-softserv/adventist-dl/issues/330
- #219
- #220
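
The Faraday conflict described in the commit message above can be checked with RubyGems' own requirement classes. This is only a sketch: the two constraints come from the message (`>= 1` and `< 1`), while the candidate version numbers are illustrative and not taken from any lockfile.

```ruby
require "rubygems"

# Constraints as stated in the commit message.
rodeo_needs = Gem::Requirement.new(">= 1")
hyrax_stack_needs = Gem::Requirement.new("< 1")

# Illustrative Faraday version numbers on both sides of 1.0.
candidates = %w[0.17.6 1.0.0 2.7.4].map { |v| Gem::Version.new(v) }

# A version is usable only if it satisfies both requirements at once.
satisfying_both = candidates.select do |version|
  rodeo_needs.satisfied_by?(version) && hyrax_stack_needs.satisfied_by?(version)
end

puts "versions satisfying both: #{satisfying_both.map(&:to_s).inspect}"
```

No version can be both below 1 and at least 1, so the list is always empty. Declaring the rodeo as a development dependency keeps that constraint out of what downstream applications must resolve at runtime.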
jeremyf added a commit that referenced this issue May 24, 2023
As written, this is a draft commit.  Its purpose is to share the
implementation details and how we might incorporate the DerivativeRodeo
into the derivative generation.

This draft will break the build as it does not include adding the
DerivativeRodeo as a dependency.  An ongoing challenge is that Valkyrie
and ActiveFedora's Faraday requirements conflict with the
DerivativeRodeo.

I put this code forward with lots of comments and the beginnings of
tests to ensure that we're on the right path.

Related to:

- #219
- #243
@jeremyf jeremyf self-assigned this May 24, 2023
jeremyf added a commit to scientist-softserv/derivative_rodeo that referenced this issue May 25, 2023
As part of working on #219, I was looking into "What is the pathing
convention for the split images?"  As I was reading the code to get to
that answer, I realized that I wanted to see that information in the
`DerivativeRodeo::Generators::PdfSplitGenerator`.

Further, given that the PdfSplitGenerator is part of the more visible
"interface" of the DerivativeRodeo, I think it is important for the
filename to be present.

Thus I moved towards this refactor, to better expose that and remove
additional chatter in the code.  There are still a few things remaining
but this has helped reinforce my understanding of the PDF Splitting.

I could further "compress" the basename template concept into the fully
qualified filename; but I think that would confuse concepts.

Related to:

- scientist-softserv/iiif_print#219
jeremyf added a commit that referenced this issue May 30, 2023
As written, this is a draft commit.  Its purpose is to share the
implementation details and how we might incorporate the DerivativeRodeo
into the derivative generation.

This draft will break the build as it does not include adding the
DerivativeRodeo as a dependency.  An ongoing challenge is that Valkyrie
and ActiveFedora's Faraday requirements conflict with the
DerivativeRodeo.

I put this code forward with lots of comments and the beginnings of
tests to ensure that we're on the right path.

Related to:

- #219
- #243
jeremyf added a commit that referenced this issue May 31, 2023
As written, this is a draft commit.  Its purpose is to share the
implementation details and how we might incorporate the DerivativeRodeo
into the derivative generation.

This draft will break the build as it does not include adding the
DerivativeRodeo as a dependency.  An ongoing challenge is that Valkyrie
and ActiveFedora's Faraday requirements conflict with the
DerivativeRodeo.

I put this code forward with lots of comments and the beginnings of
tests to ensure that we're on the right path.

Related to:

- #219
- #243
jeremyf added a commit that referenced this issue May 31, 2023
Adding a class that implements the Hyrax::DerivativeService interface and makes
the necessary assumptions about how we are creating derivatives using
the rodeo.

There are lots of comments and TODOs that mark where I took short-cuts.

Related to:

- #219
- #243
@jeremyf jeremyf removed their assignment May 30, 2024