Merge pull request #37 from hudmol/handles
Handles
marktriggs authored Jun 29, 2017
2 parents afd9fcc + 2534ab5 commit e118acb
Showing 13 changed files with 221 additions and 9 deletions.
40 changes: 40 additions & 0 deletions README.md
@@ -61,6 +61,22 @@ And shut it down like this:
$ cd /path/to/archivesspace_export_service/exporter_app
$ bin/shutdown.sh

The exporter application now uses gems. If running from source
you will need to download the required gems like this:

$ cd /path/to/archivesspace_export_service/exporter_app
$ bin/bootstrap.sh

This step is not required if running from a distributed release.

UPGRADE NOTE: If upgrading from v1.0, you will need to remove
the ead export work queue database before starting the application,
like this:

$ cd /path/to/archivesspace_export_service/exporter_app
$ rm workspace/ead/db/ead_export.sqlite


See below for configuration options.


@@ -87,6 +103,24 @@ frontend web UI. If the Exporter Application is deployed on a
different machine from ArchivesSpace you may need to configure your
firewall to open the backend port.

If you intend to use the Handle creation feature, you will also
need to set some handle-related configuration options, like this:

{
handle_wsdl_url: 'http://link.its.yale.edu/ypls-ws/PersistentLinking?wsdl',
handle_user: '10079.1/FA',
handle_credential: '[YOUR CREDENTIAL]',
handle_prefix: '10079.1/fa',
handle_group: '10079.1/FA',
handle_base: 'http://archives.yale.edu',
}

And if using Handles, the configured ArchivesSpace user (`a_user` above)
will need permissions to update resources on any ArchivesSpace repository
from which resources will be exported, in addition to the permission
discussed above. This is because generated Handles are written back to
the resource (in the ead_location field).
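
As a rough illustration of the handle options listed above, a start-up check along the following lines could catch a missing or placeholder value before any export runs. This is only a sketch: `missing_handle_options` is a hypothetical helper that is not part of the exporter, and the sample configuration values are invented.

```ruby
# Hypothetical helper (not part of the exporter): list the handle options
# that are absent, empty, or still set to the README placeholder.
def missing_handle_options(config)
  required = [:handle_wsdl_url, :handle_user, :handle_credential,
              :handle_prefix, :handle_group, :handle_base]

  required.select do |key|
    value = config[key].to_s.strip
    value.empty? || value == '[YOUR CREDENTIAL]'
  end
end

# Invented sample configuration with the credential left unset.
sample_config = {
  handle_wsdl_url: 'http://example.org/handles?wsdl',
  handle_user: 'example/USER',
  handle_credential: '[YOUR CREDENTIAL]',
  handle_prefix: 'example/prefix',
  handle_group: 'example/GROUP',
  handle_base: 'http://example.org',
}

puts "Unset handle options: #{missing_handle_options(sample_config).inspect}"
```

Checking up front keeps a misconfigured credential from surfacing only as a failed SOAP call in the middle of an export run.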


## How it works

@@ -235,6 +269,12 @@ of `:task_parameters`. These are as follows:

* `:numbered_cs` - Use numbered c tags in ead (default: false)

* `:generate_handles` - If set to `true` then a Handle will be created
immediately before export for any resources that have a value in
`ead_id` but do not have a value in `ead_location`. The created
Handle will be written back to the resource in the `ead_location`
field.
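
The rule for `:generate_handles` can be sketched as a small predicate. The names `blank_value?` and `needs_handle?` are hypothetical and do not appear in the exporter. The sketch also treats empty strings as "no value", which is an assumption: in Ruby an empty string is truthy, so a plain `!item[:ead_location]` check would count `''` as an existing location.

```ruby
# Hypothetical helpers illustrating when a Handle would be created.
# Assumption: nil and empty/whitespace-only strings both mean "no value".
def blank_value?(value)
  value.nil? || value.to_s.strip.empty?
end

# A Handle is needed when the resource has an ead_id but no ead_location.
def needs_handle?(item)
  !blank_value?(item[:ead_id]) && blank_value?(item[:ead_location])
end

# Invented sample items.
items = [
  { ead_id: 'example.0001', ead_location: nil },
  { ead_id: 'example.0002', ead_location: 'http://hdl.handle.net/example/0002' },
  { ead_id: nil, ead_location: nil },
]

items.each do |item|
  puts "#{item[:ead_id].inspect}: needs handle? #{needs_handle?(item)}"
end
```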

The sample `jobs.rb` file shows a fully configured ExportEADTask which
makes use of `:after_hooks` (described below) to additionally produce
PDF versions of finding aids and a table of contents. Note that the
13 changes: 7 additions & 6 deletions backend/model/resource_update_monitor.rb
@@ -3,42 +3,42 @@ class ResourceUpdateMonitor
CHANGED_RECORD_QUERIES = {

:updated_resources =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed' +
' from resource r' +
' where system_mtime >= ?'),

:updated_archival_objects =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed ' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed ' +
' from resource r' +
' inner join archival_object ao on ao.root_record_id = r.id' +
' where ao.system_mtime >= ?'),

:updated_digital_object_via_resource =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed' +
' from digital_object do' +
' inner join instance_do_link_rlshp rlshp on rlshp.digital_object_id = do.id' +
' inner join instance i on i.id = rlshp.instance_id' +
' inner join resource r on r.id = i.resource_id' +
' where do.system_mtime >= ?'),

:updated_digital_object_via_ao =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed from digital_object do' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed from digital_object do' +
' inner join instance_do_link_rlshp rlshp on rlshp.digital_object_id = do.id' +
' inner join instance i on i.id = rlshp.instance_id' +
' inner join archival_object ao on ao.id = i.archival_object_id' +
' inner join resource r on r.id = ao.root_record_id' +
' where do.system_mtime >= ?'),

:updated_digital_object_component_via_resource =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed from digital_object_component doc' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed from digital_object_component doc' +
' inner join digital_object do on doc.root_record_id = do.id' +
' inner join instance_do_link_rlshp rlshp on rlshp.digital_object_id = do.id' +
' inner join instance i on i.id = rlshp.instance_id' +
' inner join resource r on r.id = i.resource_id' +
' where doc.system_mtime >= ?'),

:updated_digital_object_component_via_ao =>
-('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.repo_id, r.publish, r.suppressed from digital_object_component doc' +
+('select DISTINCT r.id, r.title, r.identifier, r.ead_id, r.ead_location, r.repo_id, r.publish, r.suppressed from digital_object_component doc' +
' inner join digital_object do on doc.root_record_id = do.id' +
' inner join instance_do_link_rlshp rlshp on rlshp.digital_object_id = do.id' +
' inner join instance i on i.id = rlshp.instance_id' +
@@ -133,6 +133,7 @@ def updates_since(timestamp)
'id' => res[:id],
'title' => res[:title],
'ead_id' => res[:ead_id],
'ead_location' => res[:ead_location],
'identifier' => JSON.parse(res[:identifier]),
'repo_id' => res[:repo_id],
'uri' => JSONModel(:resource).uri_for(res[:id], :repo_id => res[:repo_id]),
3 changes: 3 additions & 0 deletions exporter_app/Gemfile
@@ -0,0 +1,3 @@
source 'http://rubygems.org'

gem 'savon', '~> 2.11', '>= 2.11.1'
40 changes: 40 additions & 0 deletions exporter_app/Gemfile.lock
@@ -0,0 +1,40 @@
GEM
remote: http://rubygems.org/
specs:
akami (1.3.1)
gyoku (>= 0.4.0)
nokogiri
builder (3.2.3)
gyoku (1.3.1)
builder (>= 2.1.2)
httpi (2.4.2)
rack
socksify
mini_portile2 (2.1.0)
nokogiri (1.6.8.1)
mini_portile2 (~> 2.1.0)
nokogiri (1.6.8.1-java)
nori (2.6.0)
rack (1.6.8)
savon (2.11.1)
akami (~> 1.2)
builder (>= 2.1.2)
gyoku (~> 1.2)
httpi (~> 2.3)
nokogiri (>= 1.4.0)
nori (~> 2.4)
wasabi (~> 3.4)
socksify (1.7.1)
wasabi (3.5.0)
httpi (~> 2.0)
nokogiri (>= 1.4.2)

PLATFORMS
java
ruby

DEPENDENCIES
savon (~> 2.11, >= 2.11.1)

BUNDLED WITH
1.15.1
12 changes: 12 additions & 0 deletions exporter_app/bin/bootstrap.sh
@@ -0,0 +1,12 @@
#!/bin/bash

BASEDIR=$(dirname "$0")/../

export GEM_HOME="$BASEDIR/gems"
export GEM_PATH="$BASEDIR/gems"

cd "$BASEDIR"

java -cp bin/jruby-complete-9.1.0.0.jar org.jruby.Main -S gem install bundler

java -cp bin/jruby-complete-9.1.0.0.jar org.jruby.Main gems/bin/bundle install
3 changes: 3 additions & 0 deletions exporter_app/bin/startup.sh
@@ -2,6 +2,9 @@

BASEDIR=$(dirname "$0")/../

export GEM_HOME="$BASEDIR/gems"
export GEM_PATH="$BASEDIR/gems"

cd "$BASEDIR"
mkdir -p logs
exec java $JAVA_OPTS -Darchivesspace-exporter=yes -Dfile.encoding=UTF-8 -cp "bin/*:java_lib/*:$CLASSPATH" org.jruby.Main -- exporter_app.rb 2>logs/exporter_app.err
8 changes: 8 additions & 0 deletions exporter_app/config/config.rb
@@ -2,5 +2,13 @@
aspace_username: 'admin',
aspace_password: 'admin',
aspace_backend_url: 'http://localhost:4567/',

handle_wsdl_url: 'http://link.its.yale.edu/ypls-ws/PersistentLinking?wsdl',
handle_user: '10079.1/FA',
handle_credential: '[YOUR CREDENTIAL]',
handle_prefix: '10079.1/fa',
handle_group: '10079.1/FA',
handle_base: 'http://archives.yale.edu',

log_level: 'debug'
}
2 changes: 2 additions & 0 deletions exporter_app/config/jobs.rb
@@ -23,6 +23,8 @@
:numbered_cs => false
},

:generate_handles => true,

:xslt_transforms => ['config/transform.xslt'],
:validation_schema => ['config/ead.xsd'],
:schematron_checks => ['config/schematron.sch'],
3 changes: 3 additions & 0 deletions exporter_app/exporter_app.rb
@@ -1,3 +1,6 @@
require 'bundler/setup'
Bundler.require

class ExporterApp

POLL_INTERVAL = 60
25 changes: 25 additions & 0 deletions exporter_app/tasks/export_ead_task.rb
@@ -7,6 +7,7 @@
require_relative 'lib/xslt_processor'
require_relative 'lib/sqlite_work_queue'
require_relative 'lib/archivesspace_client'
require_relative 'lib/handle_client'
require_relative 'lib/validation_failed_exception'

class ExportEADTask < TaskInterface
@@ -31,6 +32,15 @@ def initialize(task_params, job_identifier, workspace_base)
config = ExporterApp.config
@as_client = ArchivesSpaceClient.new(config[:aspace_backend_url], config[:aspace_username], config[:aspace_password])

if (@generate_handles = task_params.fetch(:generate_handles, false))
@handle_client = HandleClient.new(config[:handle_wsdl_url],
config[:handle_user],
config[:handle_credential],
config[:handle_prefix],
config[:handle_group],
config[:handle_base])
end

@commit_every_n_records = task_params.fetch(:commit_every_n_records, nil)
@records_added = 0

@@ -66,6 +76,7 @@ def call(process)
while (still_running = process.running?) && !max_records_hit? && (item = @work_queue.next)
if item[:action] == 'add'
begin
ensure_handle(item) if @generate_handles
download_ead(item)
create_manifest_json(item)
rescue SkipRecordException
@@ -152,6 +163,20 @@ def path_for_export_file(basename, extension = 'xml')
File.join(output_directory, "#{basename}.#{extension}")
end

def ensure_handle(item)
@log.info("Ensuring there is a handle for #{item[:uri]}")
@log.debug("ead_id: '#{item[:ead_id]}', ead_location: '#{item[:ead_location]}'")

if !item[:ead_location] && item[:ead_id]
handle = @handle_client.create_handle(item[:ead_id], item[:uri])
@log.debug("Created handle: #{handle}")
response = @as_client.update_record(item[:uri], 'ead_location' => handle)
@log.debug("Updated resource: #{response}")
else
@log.debug("No need to create handle")
end
end

def download_ead(item)
@log.info("Downloading EAD for #{item[:uri]}")
id = item.fetch(:resource_id)
17 changes: 14 additions & 3 deletions exporter_app/tasks/lib/archivesspace_client.rb
@@ -21,17 +21,28 @@ def export(id, repo_id, opts = {})
get("/repositories/#{repo_id}/resource_descriptions/#{id}.xml", opts)
end

def update_record(uri, hash)
json_post(uri, json_get(uri).merge(hash), true)
end

private

def login
json_post("/users/#{@username}/login", :password => @password, :expiring => false)['session']
end

-def json_post(path, params)
+def json_post(path, params, body = false)
uri = URI.join(@aspace_backend_url, @aspace_backend_path, path.gsub(/^\//,""))

request = Net::HTTP::Post.new(uri)
-request.form_data = params
request['X-ArchivesSpace-Session'] = @session if @session

if body
request['Content-Type'] = 'text/json'
request.body = JSON.generate(params)
else
request.form_data = params
end

http = Net::HTTP.new(uri.host, uri.port)

@@ -75,7 +86,7 @@ def get(path, params)
response.body
end

-def json_get(uri, params)
+def json_get(uri, params = {})
JSON(get(uri, params))
end
end
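
The `body` flag added to `json_post` switches between a form-encoded POST (used for login) and a JSON request body (used by `update_record`). The standalone sketch below builds both kinds of request without sending them. `build_post`, the example URL, and the sample values are made up for illustration; only the `Net::HTTP` and `JSON` calls come from the Ruby standard library.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical stand-in for the request-building part of json_post.
# Nothing is sent over the network; the request object is just returned.
def build_post(uri, params, body = false)
  request = Net::HTTP::Post.new(uri)

  if body
    # JSON payload, as used when writing a record back
    request['Content-Type'] = 'text/json'
    request.body = JSON.generate(params)
  else
    # Classic form submission, as used for the login call
    request.form_data = params
  end

  request
end

uri = URI('http://localhost:4567/repositories/2/resources/1')

form_request = build_post(uri, { 'password' => 'admin', 'expiring' => 'false' })
json_request = build_post(uri, { 'ead_location' => 'http://hdl.handle.net/example/0001' }, true)

puts form_request['Content-Type']
puts json_request['Content-Type']
puts json_request.body
```

Because `update_record` posts `json_get(uri).merge(hash)`, the whole record is read, the given keys are overwritten, and the complete document is sent back as the JSON body.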
63 changes: 63 additions & 0 deletions exporter_app/tasks/lib/handle_client.rb
@@ -0,0 +1,63 @@
require 'savon'

class HandleClient

HANDLE_HOST = 'http://hdl.handle.net'

def initialize(wsdl_url, user, credential, prefix, group, handle_base)
@wsdl_url = wsdl_url
@user = user
@credential = credential
@prefix = prefix
@group = group
@handle_base = handle_base

# looks like the namespace stuff from the example code isn't required
# leaving it commented for now in case it is needed later
# @namespace = @prefix.sub(/.*?\//, '') + ':'

@client = Savon.client(wsdl: @wsdl_url)
end

def create_handle(id, uri)
# unless id.include?(@namespace)
# raise "Handle prefix namespace '#{@namespace}' doesn't match namespace of id '#{id}'"
# end
# handle = [@prefix, id.sub(@namespace, '')].join('/')

handle = [@prefix, id].join('/')

response = @client.call(:create_batch_semantic, xml: soap_envelope(handle, uri))

unless response.success?
raise "Failed to create handle for id '#{id}' with uri '#{uri}': #{response.to_xml.to_s}"
end

[HANDLE_HOST, handle].join('/')
end

private

def soap_envelope(handle, uri)
<<-EOT
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Body>
<tns:createBatchSemantic xmlns:tns="http://ws.ypls.its.yale.edu/">
<handlesToValues>
<map>
<entry>
<key>#{handle.encode(:xml => :text)}</key>
<value>#{@handle_base.encode(:xml => :text)}#{uri.encode(:xml => :text)}</value>
</entry>
</map>
</handlesToValues>
<group>#{@group.encode(:xml => :text)}</group>
<user>#{@user.encode(:xml => :text)}</user>
<credential>#{@credential.encode(:xml => :text)}</credential>
</tns:createBatchSemantic>
</env:Body>
</env:Envelope>
EOT
end

end
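
To make the string handling in `HandleClient` easier to follow, here is a minimal sketch of the same composition with the SOAP call left out. `handle_url` and `handle_target` are hypothetical names, and the `ead_id` and resource URI are invented sample values; the prefix and base URL are the ones from the sample configuration.

```ruby
# Join prefix and id into a handle, then prefix the resolver host,
# mirroring the two joins in create_handle (no SOAP request is made).
def handle_url(prefix, ead_id, host = 'http://hdl.handle.net')
  handle = [prefix, ead_id].join('/')
  [host, handle].join('/')
end

# The value a handle points at: base URL plus record URI, each escaped
# for use as XML text with String#encode.
def handle_target(handle_base, uri)
  "#{handle_base.encode(xml: :text)}#{uri.encode(xml: :text)}"
end

puts handle_url('10079.1/fa', 'example.0001')
puts handle_target('http://archives.yale.edu', '/repositories/2/resources/123')

# Characters that are special in XML are replaced by entities.
puts 'a&b <c>'.encode(xml: :text)
```

Escaping each interpolated value matters because the envelope is built as a plain string, so an unescaped `&` or `<` in any value would produce malformed XML.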
1 change: 1 addition & 0 deletions exporter_app/tasks/lib/sqlite_work_queue.rb
@@ -7,6 +7,7 @@ class SQLiteWorkQueue
{:name => 'title', :sqltype => 'text', :jdbctype => 'string'},
{:name => 'uri', :sqltype => 'text', :jdbctype => 'string'},
{:name => 'ead_id', :sqltype => 'text', :jdbctype => 'string'},
{:name => 'ead_location', :sqltype => 'text', :jdbctype => 'string'},
]

def initialize(db_file)
