Bulkrax importing work #1970

Merged 22 commits on Sep 18, 2023
Changes from 18 commits
7 changes: 5 additions & 2 deletions .env
@@ -1,12 +1,14 @@
CHROME_HOSTNAME=chrome
COMPOSE_DOCKER_CLI_BUILD=1
Review comment from the PR author:
@aprilrieger this is the secret to eternal youth. or maybe it was faster builds. it was one of those two things

DB_ADAPTER=postgresql
DB_HOST=db
DB_NAME=hyku
DB_PASSWORD=DatabaseFTW
DB_PORT=5432
DB_TEST_NAME=hyku_test
DB_USER=postgres
DB_HOST=db
DB_PORT=5432
DOCKER_BUILDKIT=1
FCREPO_BASE_PATH=/hykudemo
FCREPO_HOST=fcrepo
FCREPO_PORT=8080
@@ -15,6 +17,7 @@ [email protected]
INITIAL_ADMIN_PASSWORD=testing123
JAVA_OPTS=-Xmx4g -Xms1g
IN_DOCKER=true
JAVA_OPTS=
LD_LIBRARY_PATH=/opt/fits/tools/mediainfo/linux
PASSENGER_APP_ENV=development
RAILS_LOG_TO_STDOUT=true
2 changes: 1 addition & 1 deletion .github/workflows/base.yaml
@@ -14,6 +14,6 @@ jobs:
uses: scientist-softserv/actions/.github/workflows/[email protected]
secrets: inherit
with:
platforms: "linux/amd64" # "linux/amd64,linux/arm64"
platforms: "linux/amd64,linux/arm64"
target: hyku-base
image_name: samvera/hyku/base
4 changes: 3 additions & 1 deletion Dockerfile
@@ -1,4 +1,5 @@
ARG HYRAX_IMAGE_VERSION=hyrax-v4.0.0.rc1
ARG RUBY_VERSION=2.7.7
FROM ghcr.io/samvera/hyrax/hyrax-base:$HYRAX_IMAGE_VERSION as hyku-base

USER root
@@ -84,7 +85,8 @@ RUN ln -sf /usr/lib/libmediainfo.so.0 /app/fits/tools/mediainfo/linux/libmediain
ONBUILD COPY --chown=1001:101 $APP_PATH/bin/db-migrate-seed.sh /app/samvera/

ONBUILD COPY --chown=1001:101 $APP_PATH/Gemfile* /app/samvera/hyrax-webapp/
ONBUILD RUN bundle install --jobs "$(nproc)"
ONBUILD RUN git config --global --add safe.directory /app/samvera && \
bundle install --jobs "$(nproc)"

ONBUILD COPY --chown=1001:101 $APP_PATH /app/samvera/hyrax-webapp

7 changes: 6 additions & 1 deletion Gemfile
@@ -16,7 +16,7 @@ gem 'blacklight', '~> 6.7'
gem 'blacklight_oai_provider', '~> 6.1', '>= 6.1.1'
gem 'bolognese', '>= 1.9.10'
gem 'bootstrap-datepicker-rails'
gem 'bulkrax', '~> 5.0'
gem 'bulkrax', '~> 5.3'
gem 'byebug', group: %i[development test]
gem 'capybara', group: %i[test]
gem 'capybara-screenshot', '~> 1.0', group: %i[test]
@@ -90,4 +90,9 @@ gem 'turbolinks', '~> 5'
gem 'web-console', '>= 3.3.0', group: %i[development] # <%= console %> in views
gem 'webdrivers', '~> 4.7.0', group: %i[test]
gem 'webmock', group: %i[test]

# This gem does nothing by default, but is instead a tool to ease developer flow
# and place overrides, themes and deployment code.
gem 'hyku_knapsack', github: 'samvera-labs/hyku_knapsack', branch: 'upstream_main'

# rubocop:enable Metrics/LineLength
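To make the effect of moving the bulkrax constraint from `'~> 5.0'` to `'~> 5.3'` concrete, here is a small sketch using RubyGems' own requirement classes. The constant names and the list of candidate versions are assumptions chosen for illustration; they are not part of the PR.

```ruby
# Illustrative sketch only: compares the old and new Gemfile constraints.
# OLD_CONSTRAINT, NEW_CONSTRAINT and the candidate versions are made up here.
require 'rubygems'

OLD_CONSTRAINT = Gem::Requirement.new('~> 5.0')
NEW_CONSTRAINT = Gem::Requirement.new('~> 5.3')

# Returns true when the given version string satisfies the constraint.
def allowed?(constraint, version_string)
  constraint.satisfied_by?(Gem::Version.new(version_string))
end

%w[5.2.1 5.3.0 5.9.9 6.0.0].each do |candidate|
  puts format('%-6s old: %-5s new: %s',
              candidate,
              allowed?(OLD_CONSTRAINT, candidate),
              allowed?(NEW_CONSTRAINT, candidate))
end
```

Both constraints stay below 6.0, but only the new one rules out releases older than 5.3, such as 5.2.1.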
18 changes: 14 additions & 4 deletions Gemfile.lock
@@ -7,6 +7,14 @@ GIT
aws-sdk-sqs (~> 1)
rails (>= 4.2)

GIT
remote: https://github.com/samvera-labs/hyku_knapsack.git
revision: 47bbb4a95ff7b94c06e6ee1700c47fac1e728a0f
branch: upstream_main
specs:
hyku_knapsack (0.0.1)
rails (>= 5.2.0)

GIT
remote: https://github.com/samvera-labs/hyrax-doi.git
revision: d494a50ef8ce3eae594c7ed7148c33b3c977d4a7
@@ -164,7 +172,7 @@ GEM
babel-transpiler (0.7.0)
babel-source (>= 4.0, < 6)
execjs (~> 2.0)
bagit (0.4.5)
bagit (0.4.6)
docopt (~> 0.5.0)
validatable (~> 1.6)
bcp47 (0.3.3)
@@ -250,9 +258,10 @@ GEM
signet (~> 0.8)
typhoeus
builder (3.2.4)
bulkrax (5.2.1)
bulkrax (5.3.0)
bagit (~> 0.4)
coderay
dry-monads (~> 1.4.0)
iso8601 (~> 0.9.0)
kaminari
language_list (~> 1.2, >= 1.2.1)
@@ -1290,7 +1299,7 @@ DEPENDENCIES
blacklight_oai_provider (~> 6.1, >= 6.1.1)
bolognese (>= 1.9.10)
bootstrap-datepicker-rails
bulkrax (~> 5.0)
bulkrax (~> 5.3)
byebug
capybara
capybara-screenshot (~> 1.0)
@@ -1309,6 +1318,7 @@ DEPENDENCIES
fcrepo_wrapper (~> 0.4)
flipflop (~> 2.6.0)
flutie
hyku_knapsack!
hyrax (~> 3.5.0)
hyrax-doi!
hyrax-iiif_av!
@@ -1367,4 +1377,4 @@ DEPENDENCIES
webmock

BUNDLED WITH
2.1.4
2.4.14
18 changes: 9 additions & 9 deletions config/authorities/licenses.yml
@@ -25,28 +25,28 @@ terms:
active: true
- id: http://creativecommons.org/licenses/by/3.0/us/
term: Attribution 3.0 United States
active: false
active: true
- id: http://creativecommons.org/licenses/by-sa/3.0/us/
term: Attribution-ShareAlike 3.0 United States
active: false
active: true
- id: http://creativecommons.org/licenses/by-nc/3.0/us/
term: Attribution-NonCommercial 3.0 United States
active: false
active: true
- id: http://creativecommons.org/licenses/by-nd/3.0/us/
term: Attribution-NoDerivs 3.0 United States
active: false
active: true
- id: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
term: Attribution-NonCommercial-NoDerivs 3.0 United States
active: false
active: true
- id: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
term: Attribution-NonCommercial-ShareAlike 3.0 United States
active: false
active: true
- id: http://creativecommons.org/publicdomain/mark/1.0/
term: Public Domain Mark 1.0
active: false
active: true
- id: http://creativecommons.org/publicdomain/zero/1.0/
term: CC0 1.0 Universal
active: false
active: true
- id: http://www.europeana.eu/portal/rights/rr-r.html
term: All rights reserved
active: false
active: true
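The change above only flips `active` flags. As a rough sketch of why that matters, assuming the flag is what decides whether a term is offered as a valid license, the snippet below filters a two-entry excerpt shaped like `config/authorities/licenses.yml`. The helper name is hypothetical, the `active` values are deliberately mixed to show both states (they are not the values in this PR), and Hyrax's own authority services are not used.

```ruby
require 'yaml'

# Excerpt in the same shape as config/authorities/licenses.yml.
# The active values are illustrative, not copied from the PR.
LICENSES_YAML = <<~YML
  terms:
    - id: http://creativecommons.org/licenses/by/3.0/us/
      term: Attribution 3.0 United States
      active: true
    - id: http://www.europeana.eu/portal/rights/rr-r.html
      term: All rights reserved
      active: false
YML

# Hypothetical helper: returns the labels of the terms flagged active.
def active_license_terms(yaml_text)
  YAML.safe_load(yaml_text)
      .fetch('terms')
      .select { |entry| entry['active'] }
      .map { |entry| entry['term'] }
end

puts active_license_terms(LICENSES_YAML)
```

With the excerpt as written, only the first term survives the filter; setting the second entry to `active: true` would return both.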
3 changes: 2 additions & 1 deletion config/database.yml
@@ -8,7 +8,8 @@ login: &login
database: <%= ENV['DB_NAME'] || 'hyku' %>
pool: 50
timeout: 5000

prepared_statements: <%= ENV.fetch('DB_PREPARED_STATEMENTS', true) %>
advisory_locks: <%= ENV.fetch('DB_ADVISORY_LOCKS', true) %>

development:
<<: *login
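The two new `database.yml` settings default to `true` and can be overridden from the environment. The sketch below renders the same two ERB lines against a plain hash standing in for `ENV`, then parses the result as YAML, to show how a string such as `"false"` ends up as a boolean. The `TEMPLATE` constant and `render_db_settings` helper are assumptions made for this illustration. Switching both settings off is a common requirement when connecting through a transaction-pooling proxy, though the PR does not state its motivation.

```ruby
require 'erb'
require 'yaml'

# Same two lines as the diff, but reading from a local `env` hash instead of
# the real ENV so the sketch has no side effects.
TEMPLATE = <<~YML
  prepared_statements: <%= env.fetch('DB_PREPARED_STATEMENTS', true) %>
  advisory_locks: <%= env.fetch('DB_ADVISORY_LOCKS', true) %>
YML

# Hypothetical helper: render the ERB, then parse the resulting YAML.
def render_db_settings(env)
  YAML.safe_load(ERB.new(TEMPLATE).result_with_hash(env: env))
end

# Nothing set: both settings fall back to the default of true.
p render_db_settings({})

# One variable set to the string "false": YAML parses it as boolean false.
p render_db_settings({ 'DB_PREPARED_STATEMENTS' => 'false' })
```

The first call yields both settings as `true`; the second yields `prepared_statements` as `false` and leaves `advisory_locks` at `true`.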
27 changes: 26 additions & 1 deletion config/initializers/bulkrax.rb
@@ -53,9 +53,34 @@
# e.g. to exclude date
# config.field_mappings["Bulkrax::OaiDcParser"]["date"] = { from: ["date"], excluded: true }


default_field_mapping = {
'abstract' => { from: ['abstract'], split: true },
'alternative_title' => { from: ['alternative_title'], split: /\s*[|]\s*/ },
'based_near' => { from: ['based_near'], split: true },
'bibliographic_citation' => { from: ['bibliographic_citation'], split: true },
'contributor' => { from: ['contributor'], split: true },
'create_date' => { from: ['create_date'], split: true },
'children' => { from: ['children'], related_children_field_mapping: true },
'creator' => { from: ['creator'], split: true },
'date_created' => { from: ['date_created'], split: true },
'description' => { from: ['description'], split: true },
'extent' => { from: ['extent'], split: true },
'file' => { from: ['file'], split: /\s*[|]\s*/ },
'identifier' => { from: ['identifier'], split: true },
'keyword' => { from: ['keyword'], split: true },
'language' => { from: ['language'], split: true },
'license' => { from: ['license'], split: /\s*[|]\s*/ },
'modified_date' => { from: ['modified_date'], split: true },
'parents' => { from: ['parents'], related_parents_field_mapping: true },
'children' => { from: ['children'], related_children_field_mapping: true }
'publisher' => { from: ['publisher'], split: true },
'related_url' => { from: ['related_url'], split: /\s* [|]\s*/ },
'remote_files' => { from: ['remote_files'], split: /\s*[|]\s*/},
'resource_type' => { from: ['resource_type'], split: true },
'rights_notes' => { from: ['rights_notes'], split: true },
'source' => { from: ['source'], split: true },
'subject' => { from: ['subject'], split: true },
'title' => { from: ['title'], split: /\s*[|]\s*/ }
}

config.field_mappings["Bulkrax::BagitParser"] = default_field_mapping.merge({
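Several entries in the mapping above pass a regular expression as `split:` rather than `true`. Assuming that pattern is applied to the raw cell value to produce multiple values, the sketch below shows what the two pattern variants in the diff do with pipe-delimited input. `split_cell`, the constant names and the sample strings are hypothetical. Entries with `split: true` presumably defer to Bulkrax's configured default delimiter, which is not shown in this diff.

```ruby
# The pattern used for 'title', 'license', 'file' and others in the mapping.
PIPE_SPLIT = /\s*[|]\s*/

# The 'related_url' pattern as written in the diff: note the literal space
# before the character class, which must be present in the input to match.
PIPE_SPLIT_WITH_SPACE = /\s* [|]\s*/

# Hypothetical helper standing in for whatever applies the pattern.
def split_cell(cell, pattern)
  cell.split(pattern)
end

p split_cell('First title | Second title|Third title', PIPE_SPLIT)
# => ["First title", "Second title", "Third title"]

p split_cell('http://a.example | http://b.example', PIPE_SPLIT_WITH_SPACE)
# => ["http://a.example", "http://b.example"]

p split_cell('http://a.example|http://b.example', PIPE_SPLIT_WITH_SPACE)
# => ["http://a.example|http://b.example"]
```

The last call shows the practical difference: with the `related_url` pattern, a pipe that has no space before it is not treated as a delimiter, so the value stays in one piece.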
18 changes: 13 additions & 5 deletions docker-compose.yml
@@ -1,12 +1,13 @@
version: '3.8'

x-app: &app
build:
context: .
target: hyku-web
args:
BUILDKIT_INLINE_CACHE: 1
cache_from:
- ghcr.io/samvera/hyku/base:${TAG:-latest}
- ghcr.io/samvera/hyku:${TAG:-latest}
- ghcr.io/samvera/hyku/base:latest
- ghcr.io/samvera/hyku:latest
image: ghcr.io/samvera/hyku:${TAG:-latest}
env_file:
- .env
@@ -126,6 +127,10 @@ services:
build:
context: .
target: hyku-base
cache_from:
- ghcr.io/samvera/hyku/base:latest
args:
BUILDKIT_INLINE_CACHE: 1

web:
<<: *app
@@ -164,9 +169,12 @@ services:
build:
context: .
target: hyku-worker
args:
BUILDKIT_INLINE_CACHE: 1
cache_from:
- ghcr.io/samvera/hyku:${TAG:-latest}
- ghcr.io/samvera/hyku/worker:${TAG:-latest}
- ghcr.io/samvera/hyku/base:latest
- ghcr.io/samvera/hyku:latest
- ghcr.io/samvera/hyku/worker:latest
command: bundle exec sidekiq
depends_on:
check_volumes: