BagIt Importer

Julie Allinson edited this page Jan 15, 2020 · 9 revisions

Bulkrax can import valid BagIt bags, either individually or as multiple bags in a single folder. The bag, or folder of bags, may be supplied in a zip file.

The Bag(s)

Bulkrax assumes that each bag will contain one or more works, with each work having its own metadata file and zero or more data files.

A Single Bag

This single bag, containing two images and one metadata file, would be imported as a single work with two files attached. The metadata file (my_metadata.csv) can be at the top level, as it is here, or in the data folder.

my_bag
  data
    my_image.tif
    my_other_image.jpg
  my_metadata.csv
  (bagit files)

Multiple Bags

Each bag in this folder would be imported as a separate work, giving three works in total.

folder
  my_bag
    data
      my_image.tif
    my_metadata.csv
    (bagit files)
  my_second_bag
    (structured as per my_bag)
  my_third_bag
    (ditto)

Multiple Works

This bag would be unpacked to create three works, one per metadata file.

my_bag
  data
    work1
      my_image1.tif
      my_metadata.csv
    work2
      my_image2.tif
      my_metadata.csv
    work3
      my_image3.tif
      my_metadata.csv
  (bagit files)
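
The layouts above can be checked before import with a short script. The following is a minimal sketch, not Bulkrax's own code: `works_in_bag` is a hypothetical helper, and it assumes that a metadata file at the top level of the bag describes everything under data, while a metadata file anywhere else describes the files in its own folder.

```ruby
require 'find'

# Hypothetical helper, not part of Bulkrax: group the files in a bag into
# works, one work per metadata file.
def works_in_bag(bag_path, metadata_filename = 'my_metadata.csv')
  bag_path = File.expand_path(bag_path)
  data_dir = File.join(bag_path, 'data')

  # Collect every file in the bag that has the agreed metadata filename.
  metadata_files = []
  Find.find(bag_path) do |path|
    metadata_files << path if File.file?(path) && File.basename(path) == metadata_filename
  end

  metadata_files.sort.map do |metadata|
    dir = File.dirname(metadata)
    # Assumption: top-level metadata covers everything under data/;
    # otherwise the data files live alongside the metadata file.
    search_dir = dir == bag_path ? data_dir : dir
    files = Dir.glob('**/*', base: search_dir)
               .map { |relative| File.join(search_dir, relative) }
               .select { |f| File.file?(f) && File.basename(f) != metadata_filename }
               .sort
    { metadata: metadata, files: files }
  end
end
```

Under these assumptions, calling `works_in_bag` on the multiple-works bag shown above would return three entries, each with one image, and calling it on the single bag would return one entry with two images.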

Metadata

  • There MUST be one metadata file per work. If a CSV is supplied, only the first line (after the header) will be processed.
  • If there are multiple bags, or multiple works, each metadata file MUST have the same filename and MUST be co-located with the data files (as per the example above).
  • Metadata can be supplied as RDF or CSV.
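
To illustrate the first rule, here is a minimal sketch of reading only the first data row of a metadata CSV with Ruby's standard csv library. It is an illustration rather than Bulkrax's parser: `first_metadata_row` is a hypothetical helper, and the `title` and `creator` columns are made-up examples, not Bulkrax field mappings.

```ruby
require 'csv'

# Hypothetical helper: return the first row after the header as a Hash,
# ignoring any further rows. Returns nil if the CSV has no data rows.
def first_metadata_row(csv_text)
  row = CSV.parse(csv_text, headers: true).first
  row&.to_h
end

# Example metadata with two data rows; the column names are illustrative only.
csv_text = <<~METADATA
  title,creator
  First work,A. Author
  Second work,B. Author
METADATA

# Prints the "First work" row; the "Second work" row is never read.
p first_metadata_row(csv_text)
```

A file like this would yield a single work; a second work needs its own metadata file.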

Creating Bags for Import

There are various tools for creating BagIt bags. For example, using the Ruby 'bagit' gem in an irb console:

gem install bagit
irb
> require 'bagit'
> # make a new bag from existing files (path_to_files is the bag's folder)
> bag = BagIt::Bag.new path_to_files
> # generate the manifest and tag manifest files
> bag.manifest!