
Similarity

Brian Tsau edited this page Jun 11, 2019 · 1 revision

Similarity (package edu.columbia.cs.psl.ioclones.sim)

AbstractSim

Implements SimAnalyzer. An abstract class that serves as a template for FastAnalyzer and NoOrderAnalyzer, providing methods common to all analyzers in the method I/O comparison step of HitoshiIO.

FastAnalyzer

Extends AbstractSim. Computes the Jaccard similarity between the deep-hashed sets of two collections of objects. Additional similarity scores are implemented, but using them requires changing hardcoded constants in FastAnalyzer.
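As a rough illustration of the idea (not FastAnalyzer's actual code), the sketch below hashes every object in two collections and computes the Jaccard similarity of the resulting hash sets, i.e. the size of the intersection divided by the size of the union. The class name JaccardSketch, the deepHash helper, and the convention that two empty collections score 1.0 are all assumptions made for this example; HitoshiIO's own deep hashing is likely more involved.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class JaccardSketch {

    // Hypothetical stand-in for HitoshiIO's deep hashing: hashes array
    // contents rather than array identity, and falls back to hashCode().
    static int deepHash(Object o) {
        if (o instanceof Object[]) {
            return Arrays.deepHashCode((Object[]) o);
        }
        if (o instanceof int[]) {
            return Arrays.hashCode((int[]) o);
        }
        return Objects.hashCode(o);
    }

    // Collapse a collection of objects into the set of their deep hashes.
    static Set<Integer> hashAll(Collection<?> objects) {
        Set<Integer> hashes = new HashSet<>();
        for (Object o : objects) {
            hashes.add(deepHash(o));
        }
        return hashes;
    }

    // Jaccard similarity: |A intersect B| / |A union B|, always in [0,1].
    static double jaccard(Collection<?> first, Collection<?> second) {
        Set<Integer> a = hashAll(first);
        Set<Integer> b = hashAll(second);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumed convention for two empty inputs
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        List<Object> first = Arrays.asList(1, "a", new int[] {1, 2});
        List<Object> second = Arrays.asList("a", new int[] {1, 2}, 3.0);
        // Two of the four distinct hashes are shared between the inputs.
        System.out.println(jaccard(first, second));
    }
}
```

Because the two int arrays are hashed by content, they count as the same element even though they are distinct objects, which is the point of deep hashing before the set comparison.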

NoOrderAnalyzer

Extends AbstractSim. Partially deprecated: it is now used only to clean I/Os and store them in serialized form. Contains a number of hardcoded file paths.

SimAnalyzer

An interface for analyzers that declares the basic methods every analyzer needs:

  • compareObject - returns a similarity score (in the range [0,1]) for two Objects, accounting for the different common Java data structures they may be
  • similarity - returns a similarity score (in the range [0,1]) for two sets of Objects
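The two methods above might look roughly like the following. This is a hypothetical reconstruction: the parameter and return types are guesses, not the signatures in edu.columbia.cs.psl.ioclones.sim, and ExactAnalyzer is an invented toy implementation that scores objects as either identical (1.0) or different (0.0) and averages each element's best match across both collections.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Objects;

public class SimAnalyzerSketch {

    // Hypothetical shape of the interface; the real signatures may differ.
    public interface SimAnalyzer {
        // Similarity in [0,1] between two arbitrary objects.
        double compareObject(Object a, Object b);

        // Similarity in [0,1] between two collections of objects.
        double similarity(Collection<?> a, Collection<?> b);
    }

    // Invented toy analyzer used only to exercise the interface.
    public static class ExactAnalyzer implements SimAnalyzer {

        @Override
        public double compareObject(Object a, Object b) {
            // Objects.deepEquals handles nulls and compares arrays
            // (including primitive arrays) by content.
            return Objects.deepEquals(a, b) ? 1.0 : 0.0;
        }

        @Override
        public double similarity(Collection<?> a, Collection<?> b) {
            if (a.isEmpty() && b.isEmpty()) {
                return 1.0; // assumed convention for two empty inputs
            }
            if (a.isEmpty() || b.isEmpty()) {
                return 0.0;
            }
            // Average, over both directions, of each element's best match.
            return (bestMatchSum(a, b) + bestMatchSum(b, a)) / (a.size() + b.size());
        }

        private double bestMatchSum(Collection<?> from, Collection<?> to) {
            double sum = 0.0;
            for (Object x : from) {
                double best = 0.0;
                for (Object y : to) {
                    best = Math.max(best, compareObject(x, y));
                }
                sum += best;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        SimAnalyzer analyzer = new ExactAnalyzer();
        System.out.println(analyzer.compareObject(new int[] {1, 2}, new int[] {1, 2}));
        System.out.println(analyzer.similarity(Arrays.asList(1, 2), Arrays.asList(2, 3)));
    }
}
```

A real analyzer would presumably return graded scores from compareObject for partially matching structures; the binary score here only keeps the sketch short.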

Documentation from the development of HitoshiIO can be found here. However, some files are still poorly documented, and some are unused in the current version of the system and exist only for further development or research.

Below is the organization of HitoshiIO's source code:

  1. Root
  2. Analysis
  3. Config
  4. Driver
  5. Instrument
  6. Pojo
  7. Premain
  8. Similarity
  9. Utilities
  10. Xml Converter