Implement Detector, Category, DetectorCategory
** Why are these changes being introduced:

Our chosen architecture calls for a set of models that will comprise a
sort of "knowledge graph", which TACOS will consult during the
categorization process. This includes classes for Category, Detector,
and a linking DetectorCategory class. The Detector and DetectorCategory
classes will each define a confidence value.

** Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/tco-82

** How does this address that need:

This commit defines those classes. The migration also creates the
initial records for each class: three categories, six detectors, and
five records that link them.

No detectors currently map to two of the categories (Informational and
Navigational), although we have discussed the need for them.
Additionally, one detector, SuggestedResource, is unique in that its
individual records will count toward different categories, so a link
record that uniformly connects it to a single category isn't appropriate.
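
To make the shape of this knowledge graph concrete, here is a minimal plain-Ruby sketch built from the values the migration seeds. It does not use ActiveRecord, and the scoring rule (multiplying a detector's confidence by its link's confidence) is purely an assumption for illustration: this commit only stores the two values and does not say how TACOS will combine them. The names `DetectorRow`, `Link`, and `score_for` are hypothetical.

```ruby
# Plain-Ruby stand-ins for the seeded rows (hypothetical names, not the app's models).
DetectorRow = Struct.new(:name, :confidence)
Link = Struct.new(:detector, :category, :confidence)

# The six detectors and their confidence values, as seeded by the migration.
DETECTORS = {
  'DOI' => DetectorRow.new('DOI', 0.95),
  'ISBN' => DetectorRow.new('ISBN', 0.8),
  'ISSN' => DetectorRow.new('ISSN', 0.6),
  'PMID' => DetectorRow.new('PMID', 0.95),
  'Journal' => DetectorRow.new('Journal', 0.2),
  'SuggestedResource' => DetectorRow.new('SuggestedResource', 0.95)
}.freeze

# The five seeded links; all of them point at the Transactional category.
LINKS = [
  Link.new(DETECTORS['DOI'], 'Transactional', 0.95),
  Link.new(DETECTORS['ISBN'], 'Transactional', 0.95),
  Link.new(DETECTORS['ISSN'], 'Transactional', 0.95),
  Link.new(DETECTORS['PMID'], 'Transactional', 0.95),
  Link.new(DETECTORS['Journal'], 'Transactional', 0.5)
].freeze

# ASSUMPTION: combine the two confidence values by multiplication.
# Returns a hash of category name => combined score for one detector.
def score_for(detector_name)
  LINKS.select { |link| link.detector.name == detector_name }
       .to_h { |link| [link.category, (link.detector.confidence * link.confidence).round(4)] }
end

p score_for('DOI')
p score_for('SuggestedResource')
```

Under this assumed rule, a detector with no link record, such as SuggestedResource, contributes to no category at all, which matches the gap described above.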

** Document any side effects to this change:

We previously had a Detector model file, but it defined a module that
served only as a namespace for its subclasses. It is now a class, which
affects the dashboard and test files. Also, the method that defines the
table name prefix has moved from the Detector file to the subclass
files.
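
The dashboard and test files had to change because Ruby does not allow a constant defined as a class to be reopened with the `module` keyword. The sketch below uses empty stand-in classes rather than the real models. It shows the `TypeError` such a reopen raises, and that a class can still act as a namespace for nested classes that each define their own `table_name_prefix`.

```ruby
# Stand-in for the new top-level model: Detector is now a class, not a module.
class Detector
  # A nested class can still live in the Detector namespace. Here it carries
  # its own table_name_prefix, as the subclass files do after this commit.
  class Journal
    def self.table_name_prefix
      'detector_'
    end
  end
end

# Reopening the same constant with `module` now fails, which is why every
# file that said `module Detector` had to be updated to `class Detector`.
reopen_error =
  begin
    Object.class_eval('module Detector; end')
    nil
  rescue TypeError => e
    e
  end

puts reopen_error.class                  # TypeError
puts Detector::Journal.table_name_prefix # detector_
```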
matt-bernhardt committed Sep 10, 2024
1 parent 02fc057 commit cabdf53
Showing 12 changed files with 136 additions and 14 deletions.
2 changes: 1 addition & 1 deletion app/dashboards/detector/suggested_resource_dashboard.rb
@@ -2,7 +2,7 @@

 require 'administrate/base_dashboard'

-module Detector
+class Detector
   class SuggestedResourceDashboard < Administrate::BaseDashboard
     # ATTRIBUTE_TYPES
     # a hash that describes the type of each of the model's fields.
16 changes: 16 additions & 0 deletions app/models/category.rb
@@ -0,0 +1,16 @@
+# frozen_string_literal: true
+
+# == Schema Information
+#
+# Table name: categories
+#
+#  id          :integer          not null, primary key
+#  name        :string
+#  description :text
+#  created_at  :datetime         not null
+#  updated_at  :datetime         not null
+#
+class Category < ApplicationRecord
+  has_many :detector_categories
+  has_many :detectors, through: :detector_categories
+end
17 changes: 13 additions & 4 deletions app/models/detector.rb
@@ -1,9 +1,18 @@
 # frozen_string_literal: true

+# == Schema Information
+#
+# Table name: detectors
+#
+#  id         :integer          not null, primary key
+#  name       :string
+#  confidence :float
+#  created_at :datetime         not null
+#  updated_at :datetime         not null
+#
 # Detectors are classes that implement various algorithms that allow us to identify patterns
 # within search terms.
-module Detector
-  def self.table_name_prefix
-    'detector_'
-  end
+class Detector < ApplicationRecord
+  has_many :detector_categories
+  has_many :categories, through: :detector_categories
 end
6 changes: 5 additions & 1 deletion app/models/detector/journal.rb
@@ -10,12 +10,16 @@
 # created_at :datetime not null
 # updated_at :datetime not null
 #
-module Detector
+class Detector
   # Detector::Journal stores information about academic journals loaded from external sources to allow us to check our
   # incoming Terms against these information
   class Journal < ApplicationRecord
     before_save :downcase_fields!

+    def self.table_name_prefix
+      'detector_'
+    end
+
     # Identify journals in which the incoming phrase matches a Journal.name exactly
     #
     # @note We always store the Journal.name downcased, so we should also always downcase the phrase
6 changes: 5 additions & 1 deletion app/models/detector/standard_identifiers.rb
@@ -1,11 +1,15 @@
 # frozen_string_literal: true

-module Detector
+class Detector
   # Detector::StandardIdentifiers detects the identifiers DOI, ISBN, ISSN, PMID.
   # See /docs/reference/pattern_detection_and_enhancement.md for details.
   class StandardIdentifiers
     attr_reader :identifiers

+    def self.table_name_prefix
+      'detector_'
+    end
+
     def initialize(term)
       @identifiers = {}
       term_pattern_checker(term)
6 changes: 5 additions & 1 deletion app/models/detector/suggested_resource.rb
@@ -15,13 +15,17 @@

 require 'stringex/core_ext'

-module Detector
+class Detector
   # Detector::SuggestedResource stores custom hints that we want to send to the
   # user in response to specific strings. For example, a search for "web of
   # science" should be met with our custom login link to Web of Science via MIT.
   class SuggestedResource < ApplicationRecord
     before_save :update_fingerprint

+    def self.table_name_prefix
+      'detector_'
+    end
+
     # This exists for the before_save lifecycle hook to call the calculate_fingerprint method, to ensure that these
     # records always have a correctly-calculated fingerprint. It has no arguments and returns nothing.
     def update_fingerprint
17 changes: 17 additions & 0 deletions app/models/detector_category.rb
@@ -0,0 +1,17 @@
+# frozen_string_literal: true
+
+# == Schema Information
+#
+# Table name: detector_categories
+#
+#  id          :integer          not null, primary key
+#  detector_id :integer          not null
+#  category_id :integer          not null
+#  confidence  :float
+#  created_at  :datetime         not null
+#  updated_at  :datetime         not null
+#
+class DetectorCategory < ApplicationRecord
+  belongs_to :category
+  belongs_to :detector
+end
42 changes: 42 additions & 0 deletions db/migrate/20240909183413_create_categories.rb
@@ -0,0 +1,42 @@
+class CreateCategories < ActiveRecord::Migration[7.1]
+  def change
+    create_table :detectors do |t|
+      t.string :name
+      t.float :confidence
+
+      t.timestamps
+    end
+
+    create_table :categories do |t|
+      t.string :name
+      t.text :description
+
+      t.timestamps
+    end
+
+    create_table :detector_categories do |t|
+      t.belongs_to :detector, null: false, foreign_key: true
+      t.belongs_to :category, null: false, foreign_key: true
+      t.float :confidence
+
+      t.timestamps
+    end
+
+    Detector.create(name: 'DOI', confidence: 0.95)
+    Detector.create(name: 'ISBN', confidence: 0.8)
+    Detector.create(name: 'ISSN', confidence: 0.6)
+    Detector.create(name: 'PMID', confidence: 0.95)
+    Detector.create(name: 'Journal', confidence: 0.2)
+    Detector.create(name: 'SuggestedResource', confidence: 0.95)
+
+    Category.create(name: 'Informational', description: 'A type of search where the user is looking for broad information, rather than an individual item. Also known as "open-ended" or "topical".')
+    Category.create(name: 'Navigational', description: 'A type of search where the user has a location in mind, and wants to go there. In library discovery, this should mean a URL that will not be in the searched index.')
+    Category.create(name: 'Transactional', description: 'A type of search where the user has an item in mind, and wants to get that item. Also known as "known-item".')
+
+    DetectorCategory.create(detector: Detector.find_by(name: 'DOI'), category: Category.find_by(name: 'Transactional'), confidence: 0.95)
+    DetectorCategory.create(detector: Detector.find_by(name: 'ISBN'), category: Category.find_by(name: 'Transactional'), confidence: 0.95)
+    DetectorCategory.create(detector: Detector.find_by(name: 'ISSN'), category: Category.find_by(name: 'Transactional'), confidence: 0.95)
+    DetectorCategory.create(detector: Detector.find_by(name: 'PMID'), category: Category.find_by(name: 'Transactional'), confidence: 0.95)
+    DetectorCategory.create(detector: Detector.find_by(name: 'Journal'), category: Category.find_by(name: 'Transactional'), confidence: 0.5)
+  end
+end
28 changes: 27 additions & 1 deletion db/schema.rb

Generated file; diff not rendered by default.

4 changes: 2 additions & 2 deletions test/models/detector/journal_test.rb
@@ -12,7 +12,7 @@
 #
 require 'test_helper'

-module Detector
+class Detector
   class JournalTest < ActiveSupport::TestCase
     test 'exact term match on journal name' do
       expected = detector_journals('the_new_england_journal_of_medicine')
@@ -57,4 +57,4 @@ class JournalTest < ActiveSupport::TestCase
       assert_equal(mixed_case.downcase, actual.name)
     end
   end
-end
+end
4 changes: 2 additions & 2 deletions test/models/detector/standard_identifiers_test.rb
@@ -2,7 +2,7 @@

 require 'test_helper'

-module Detector
+class Detector
   class StandardIdentifiersTest < ActiveSupport::TestCase
     test 'ISBN detected in a string' do
       actual = Detector::StandardIdentifiers.new('test 978-3-16-148410-0 test').identifiers
@@ -191,4 +191,4 @@ class StandardIdentifiersTest < ActiveSupport::TestCase
       end
     end
   end
-end
+end
2 changes: 1 addition & 1 deletion test/models/detector/suggested_resource_test.rb
@@ -14,7 +14,7 @@
 #
 require 'test_helper'

-module Detector
+class Detector
   class SuggestedResourceTest < ActiveSupport::TestCase
     test 'fingerprints are generated automatically' do
       resource = {
