Bulk reindexing
The easiest way to bulk reindex is to use ActiveMQ to batch all the druids that you'd like to reindex. There is a pids_to_reindex folder in SULMQ's folders (see /opt/app/karaf/current). You need only copy a file that contains a list of druids (fully qualified, like druid:aa111bb2222) into the folder. The message broker will pick up that file and remove it after it enqueues all of the messages for it. Then, the broker will do all the work of making the calls to the dor_indexing_app service to do the reindexing.
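As a rough illustration of preparing such a file, here is a minimal Python sketch. It is not part of SULMQ or Argo. The function name `write_druid_batch`, the one-druid-per-line layout, and the druid pattern (inferred only from the example druid:aa111bb2222) are all assumptions, and the location of the pids_to_reindex folder must be supplied by the caller. The file is written outside the folder and then moved in, on the assumption that the broker should never see a partially written file.

```python
import os
import re
import tempfile
from pathlib import Path

# Assumed shape, inferred only from the example druid:aa111bb2222.
DRUID_PATTERN = re.compile(r"^druid:[a-z]{2}\d{3}[a-z]{2}\d{4}$")


def write_druid_batch(druids, drop_dir, name="batch.txt"):
    """Validate druids, then place a one-per-line file into drop_dir.

    drop_dir is the caller-supplied path of the pids_to_reindex folder.
    The one-per-line layout is an assumption, not a documented format.
    """
    bad = [d for d in druids if not DRUID_PATTERN.match(d)]
    if bad:
        raise ValueError(f"not fully qualified druids: {bad}")

    drop_dir = Path(drop_dir)
    # Write beside the drop folder first, then rename into it, so a
    # folder watcher never picks up a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=drop_dir.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(druids) + "\n")

    target = drop_dir / name
    os.replace(tmp_path, target)
    return target
```

Calling `write_druid_batch(["druid:aa111bb2222"], "/path/to/pids_to_reindex")` would leave a single file in the folder for the broker to consume. The rename is only atomic when the temporary file and the drop folder are on the same filesystem, which is why the temporary file is created in the parent directory.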
Our ActiveMQ broker is configured to reindex objects in the background whenever the reindexing pipeline is idle. It picks the N oldest objects in the index and reindexes them. It takes about 3 days to go through the entire index using this background method.
To do a completely clean reindex, you would need to extract all the pids from Fedora's database, order them by object type (so that APOs are first, for example), and then feed the pids to the message broker as described above. See the Argo::PidGatherer class for an example.
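The ordering step can be sketched in a few lines of Python. This is only an illustration and does not reflect how Argo::PidGatherer is implemented. The type names and their priorities in `TYPE_PRIORITY` are hypothetical; the only ordering the text states is that APOs come first. Extracting the (pid, object type) pairs from Fedora is left out entirely.

```python
# Hypothetical type names and priorities; only "APOs first" comes from
# the text above.
TYPE_PRIORITY = {"adminPolicy": 0, "collection": 1, "item": 2}


def order_pids_by_type(records):
    """Return pids sorted so higher-priority object types come first.

    records: iterable of (pid, object_type) pairs.
    Unknown types sort last; ties are broken by pid for a stable result.
    """
    fallback = len(TYPE_PRIORITY)
    ranked = sorted(
        records,
        key=lambda rec: (TYPE_PRIORITY.get(rec[1], fallback), rec[0]),
    )
    return [pid for pid, _ in ranked]
```

The resulting list could then be written to a file and dropped into the pids_to_reindex folder, so that objects other records depend on are indexed before the objects that reference them.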