HOWTO build and run the Protocol Verifier

Enough talk! Skip to the steps.

If you want to run the latest version of the Protocol Verifier without building, see here.

Introduction

This HOWTO is targeted primarily to new developers. It gives background on setting up a build environment, building the application and running it.

Prerequisites

Before you get started, take a few moments to be sure you have the following pieces in place.

Docker

If you plan to build and run the models, Docker is required. Docker is a containerisation tool which can be used to produce an application image that can be executed consistently on any platform that supports the Docker engine.

Two Docker terms you should be familiar with before you get started are "image" and "container". An image is the blueprint for a container. It represents the initial state of the filesystem when a container is created and carries metadata about the image and the execution of the containerised application. You can think of a container as an instance of an image. A container is created from an image when the user requests it to run. Many containers can be created from the same image simultaneously and are isolated from one another. Even after the process inside the container exits, the container still exists and its state can be interrogated until it is deleted by the user.
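
If you are new to Docker, the following minimal sketch illustrates this lifecycle on the command line (the hello-world image and the container name demo are purely illustrative):

docker run --name demo hello-world   # create and start a container from the hello-world image
docker ps -a --filter name=demo      # the container still exists after its process has exited
docker rm demo                       # delete the container; the image remains available for reuse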

Follow the instructions here to install Docker Desktop.

BridgePoint

Install the latest BridgePoint release. If you already have BridgePoint installed, confirm that your version is more recent than 7.6.0 (2023-12-21) by checking the build ID.

Refer to this guide for help installing BridgePoint and checking the build ID.

Git

BridgePoint comes packaged with everything you need to interact with the source git repositories. That being said, some prefer to use a different GUI client application or use the command line version of the tool.

Refer to the Git guide for a quick start on using git to clone repositories and import projects.

PlantUML

PlantUML is a tool for manipulating UML diagrams specified in a textual format, which for the Munin project is constrained to the 'PLUS' domain-specific language. PLUS is based on PlantUML Activity Diagrams. PlantUML can be downloaded from various sources. The recommended approach is to load it as an Eclipse plugin so that it works seamlessly with BridgePoint. The Eclipse plugin can be downloaded here and the update site is here.

Getting PlantUML Working within BridgePoint

To see PlantUML in BridgePoint, select the Window menu item, then select Show View and then Other. This will open a dialog showing a PlantUML folder. Open the folder and select PlantUML. Again using Window and Show View, open a Project Explorer view. This will appear as a tab alongside the Model Explorer and provides a view of the file structure. Sample PlantUML files with the filename extension .puml can be found here.

When first opening a .puml file, right-click on the file and select Open With… then Text Editor. Position the text editor pane and the PlantUML graphic pane side by side. When the PlantUML text is edited, the activity diagram is updated automatically.

Whilst this is a generic PlantUML editor, it is used specifically for editing the PLUS domain specific language. For details on PLUS see here. For a tutorial on PLUS and explanation of supported Protocol Verifier topologies, see here.

PlantUML and the PLUS language represent the "front end" of the Protocol Verifier. Together with the plus2json utility, this is how the Protocol Verifier is configured and managed.

Source Projects

It is necessary to have the source code/models downloaded onto your machine. The source code can be browsed on GitHub. Follow the steps from the guide linked in the previous section to clone this repository from GitHub. Follow the steps from the guide linked in the BridgePoint section to import the following projects into a fresh workspace:

  • Protocol Verifier Domains

    • JobManagement (JM)

    • AEOrdering (AEO)

    • FileReception (FR) - only required if input event streams are file based

    • SequenceVerificationDataCentric (SVDC)

    • InvariantStore (IStore)

    • Verification Gateway (VG)

  • Deployments

    • AEO_SVDC (AEO + SVDC + VG)

    • IStore (a single domain deployment)

    • JM (a single domain deployment)

  • Utilities

    • utils

    • plus2json

    • kafka

    • zookeeper

Terminology

Project

The term "project" typically refers to the model source and supporting build/configuration files for a single unit of functionality. In our context, this also maps one-to-one with an Eclipse/BridgePoint project.

Domain

A "domain" represents an encapsulated set of data objects and behaviours. The term comes from the original Shlaer-Mellor Method, and in MASL a domain is the basic unit of compilation. Each domain exists in its own project.

Deployment

A "deployment" represents a combination of one or more domains including the modelled connections between them. Deployments map to "project" in textual MASL form, but the term "deployment" is preferred to avoid confusion with the more general term "project".

Protocol Verifier

"Protocol Verifier" refers to the overall application as well as the BridgePoint project where all of the domains are integrated together using a set of deployments. The Protocol Verifier is built to use kafka as an inter-process messaging service. However, if kafka is not present, then the Protocol Verifier will revert to file-based transfers. This includes getting event streams into the PV. The kafka configuration is described below.

In our context the Protocol Verifier is the collection of the following deployments:

  • JM

  • IStore

  • AEO_SVDC (includes Verification Gateway)

PLUS

"PLUS" is the name of the domain specific language for specifying valid job sequences for the protocol verifier. PLUS is defined as a subset of constructs from the PlantUML activity diagram syntax. A full description and guide for PLUS can be found here.

plus2json

"plus2json" refers to the application which processes PLUS files (textual .puml activity diagrams) and produces JSON job definition and runtime configuration files consumable by the Protocol Verifier. plus2json can also be used to generate runtime test data based on the job definitions. It is capable of generating valid runtime event streams and also of injecting errors into runtime event streams.

plus2json source code, documentation and examples/tests can be found in a git repository here.

Overview

Toolchain

The Munin project team has been using the Shlaer-Mellor Method to model the problem and our solution. The following sections describe the set of tools we are using to actualise our designs.

xtUML/BridgePoint

The source models are represented graphically in xtUML using the BridgePoint editor. Action semantics are encoded using MASL. The MASL specification also defines all necessary "structural" constructs (e.g. classes, relationships, state machines), but does not provide a specification for capturing graphical layout. In addition, there is no existing graphical tool which supports direct edit of MASL models. BridgePoint provides the graphical editing experience required for structural Shlaer-Mellor modelling.

MASL

At build time, BridgePoint is used to export the xtUML models to textual MASL format. As mentioned in the previous section, MASL is capable of representing the complete semantics of the S-M method including structural elements. In addition, MASL is required to be compatible with our selected backend code generator (see next section).

MASL C++ code generator and software architecture (via Docker)

The code generator and target architecture chosen for Munin is the MASL C++ software architecture published as Open Source Software in 2016. The code generator consists of a Java-based MASL parser/generator which produces C++ source code for an application model. The code generator is designed to be modular, with a core translator and a set of peripheral translators that provide additional capabilities such as SQLite persistence, build file generation, runtime model debugging, etc. The companion software architecture is a set of runtime libraries written in C++ which provide mechanisms to implement the rules of Shlaer-Mellor in a single-threaded process. The code generator produces code compatible with the runtime architecture.

Testing and GitHub Actions (Continuous Integration)

We are using features of the MASL code generator along with custom domains to define and run tests for each domain and the whole system. Like all other actions, the tests are defined in MASL. A domain service is created for each test. Tests are specially marked to be excluded from the production application and are added to a test schedule. When the project is built in the testing configuration, the tests are generated and executed and results are logged to the console and output as a set of JSON files in the test results directory.

We are using GitHub Actions to automatically build and run tests for each domain and the system deployment any time a release is drafted. The output from each test schedule is consolidated and formatted into an HTML report.

Domain Overview

Protocol Verifier Domains

Job Management (JM)

Job Management makes use of the worker pattern, allowing PV instances to be started without the use of fixed lanes. Jobs shall be assigned by the Job Management domain to PV workers using a round-robin technique.

Each PV worker must maintain a heartbeat with Job Management; loss of the heartbeat will cause the worker to be retired. Any in-progress jobs assigned to a retired PV worker shall become unassigned and be added to the queue of jobs that need processing.

File Reception (FR)

The File Reception Domain (FR) takes JSON-format event streams as its input. Some configurations of the PV take JSON event streams as direct input, say from a messaging fabric. Other configurations of the PV receive event streams in a file-based format, where each file contains a number of JSON-formatted events. This domain deals with the file-based reception of audit event files. It is not required in configurations where the JSON events are provided directly over a messaging fabric.

Ordering (AEO)

The role of the Audit Event Ordering domain is two-fold. It validates audit event fields for reasonable and legal values, and it constructs the audit event sequence into the correct order as determined by the previous event ID in each Audit Event. Once the audit events have been correctly ordered they are delivered to the Sequence Verification domain. Events from unexpected sources are rejected and errors are notified. Audit Event Ordering waits for out of sequence events to arrive for a defined period of time. Gaps in event sequences not resolved within the defined time period are denoted as a failure of the Job and the error condition is notified.

Another role of Audit Event Ordering is to read a configuration file at initialisation, to use that data to set up its own definition classes, and to forward that configuration information to Sequence Verification to set up its definition classes. This approach ensures that the definition classes of Audit Event Ordering and Sequence Verification are aligned.

Sequence Verification (SVDC)

The role of the Sequence Verification domain is to verify that the audit events received are in a correct, expected order taking account of support for repeated audit event types and forks, parallel branches and merges in the event sequences. A Job is only deemed complete when all sequences within the Job have completed. The Sequence Verification domain is built to detect and report a number of error conditions in the received Audit Event data. These error conditions include unexpected audit event types, unexpected sequences of audit event types, sequences starting with the wrong audit event types and repetition of audit event types in unexpected places.

Invariant Store (IStore)

The role of the Invariant Store domain is to provide persistent storage of extra job invariants which will typically live longer than any single job. The Invariant Store serialises access to persisted invariants across multiple concurrent instances of Protocol Verifier processes.

For a visual overview of the domains and the key interactions between them click here.

Verification Gateway (VG)

The role of the Verification Gateway is to generate audit events that represent the protocol of job verification being performed by the Protocol Verifier. These audit events can then be injected into another (or the same) instance of the PV and used to verify that the PV itself is behaving as expected. This means that the PV is both observing and observable. This provides assurance that the PV is behaving correctly and also serves to demonstrate both the generation and consumption of audit events. The PV can be said to be "eating its own dog food".

Deployment Overview

The Protocol Verifier is partitioned into a set of deployments that can be instantiated and deployed in parallel to provide for performance scaling. The deployments are made up of one or more of the Protocol Verifier domains and some supporting infrastructure. Deployments can use file-based messaging between domains but the preferred option is based upon a messaging infrastructure. The following shows the components for a deployment using Kafka for messaging:

  • JobManagement (JM) - typically a single instance

  • IStore - single instance

  • Ordering, Sequence Verification, Verification Gateway (AEO_SVDC) - multiple instances

  • zookeeper - single instance

  • kafka - single instance

For a description of how this architecture achieves scaling see here.

Supporting Application Overview

plus2json

plus2json is an application that converts the PLUS language into JSON files that the Protocol Verifier can consume. Since the Protocol Verifier is data driven, the use of plus2json is essential to set up a new instance of the Protocol Verifier to monitor and check a new protocol. Once configured and supplied with a set of job definitions, the Protocol Verifier can be run without reconfiguration for as long as the input set of job definitions need to be monitored.

plus2json is also capable of providing test audit event streams. The regression testing of the PV uses plus2json to generate valid and invalid event streams.

See the plus2json git repository for details of plus2json commands.

Configuring the Protocol Verifier

This splits into the following main parts:

  • Specifying Job and Event Data Definitions

  • Configuring Runtime Parameters

  • Configuring the messaging fabric

For a visual overview of the configuration of the Protocol Verifier click here.

Specifying Job and Event Data Definitions

Each protocol monitored by the Protocol Verifier needs a Job Definition which specifies the behaviour (protocol) of a particular monitored task. Job definitions are defined in the PLUS language and edited/visualised with PlantUML.

The Protocol Verifier is data driven. It has no built-in knowledge of any particular protocol. Prior to running the Protocol Verifier for the first time, a runtime configuration file and job definition files need to be in place for the protocols being monitored. After this has been done initially, it needs to be repeated only to add or change the definitions of monitored jobs. The configuration for the protocol verifier is found in the deploy directory. This is loaded at start up and checked on a periodic basis for updates.

The structure of the deploy directory is shown here.

PLUS Job Definitions

Use PLUS to define jobs and produce runtime configuration.

Refer to here for details on the use of PLUS.

plus2json: PLUS Conversion and configuring the Protocol Verifier

Use plus2json to convert PLUS into job definitions and runtime configuration files.

The plus2json application takes as input PLUS (.puml) text files and produces a number of possible outputs. The primary output is the JSON-formatted job definition for a particular protocol. plus2json can also produce runtime configuration files in a format that the Protocol Verifier can consume.

For details on using plus2json and its options refer to plus2json.

The commands described below produce a single config.json file which contains a number of configuration parameters and a list of the job specification names together with individual job configuration parameters. In addition they generate a file for each job definition and the event definitions relevant to that job definition. Further, if the job definition requires any supplementary audit event data then an additional configuration file defining the audit event data definitions for that job definition can be created.

Currently, PLUS files should each contain a single job definition.

Assume a PlantUML file containing a single job definition called Tutorial_1.puml has been created. The typical sequence of plus2json commands is as follows:

Optionally backup the existing configuration file by moving the contents of this directory to a backup location of your choice.

The following commands convert the PLUS files into JSON files and load them into the appropriate configuration directories (a batch-conversion sketch for handling multiple PLUS files follows the list):

  1. python plus2json.pyz Tutorial_1.puml

    • this checks the syntax of the puml file

  2. python plus2json.pyz Tutorial_1.puml --job -p

    • this produces a human readable representation of the job definition including previous events and audit event data

  3. python plus2json.pyz Tutorial_1.puml --job | python -m json.tool > munin/deploy/config/job_definitions/tutorial_1.json

    • this generates the job definition file and loads it into the appropriate directory. The job definition file must have the same name as the job definition
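
Where a repository contains several PLUS files, the conversion can be scripted. The following is a minimal sketch only, assuming each job definition name matches its lower-cased file name (as with Tutorial_1 above) and that the commands are run from the same directory as the examples above:

for f in *.puml; do
  # derive the output file name from the PLUS file name (lower-cased)
  name=$(basename "$f" .puml | tr '[:upper:]' '[:lower:]')
  python plus2json.pyz "$f" --job | python -m json.tool > "munin/deploy/config/job_definitions/${name}.json"
done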

Configuring Runtime Parameters

Note: This is regarded as an advanced feature - the behavioural parameters in the config.json file can be adjusted with appropriate caution.

Click to see more details on the deploy/config/pv-config.json config file format

This configuration file contains some items that may be adjusted to manage the digital twin. The configuration items that can be adjusted are described below:

SpecUpdateRate - A time period that determines how often the application reloads the configuration files.

IncomingDirectory - The directory where the application expects to find JSON files containing events.

ProcessedDirectory - The directory where the application moves JSON files after all the contained events are processed.

ReceptionDeletionTime - When a file has been through reception the details of the reception processing shall be stored until this time expires. Only applicable to legacy file-based incoming data.

ConcurrentReceptionLimit - A number that indicates the limit of concurrent reception jobs that can be executing, e.g. 1 = one active reception job

SchemaValidate - A boolean which if set true ensures that json schema validation of received files or messages is performed.

SchemaValidateFrequency - TODO CHECK How often schema validation is run in seconds

FileControlWaitTime - TODO

MaxOutOfSequenceEvents - This is the maximum number of consecutive out-of-sequence events that can be received for a job before an error is declared.

MaximumJobTime - This is the maximum time it should take for a job to be finished. When this time has been reached after the job was started it shall be archived if there are no blocked events or failed if there are blocked events.

JobCompletePeriod - When a Job has completed it shall be either archived or failed and once the job complete period has expired it shall be deleted from the domain with all associated events.

JobDefinitionDirectory - The file location relative to the deploy directory where the job definition json files are stored.

DefaultJobExpiryDuration - TODO CHECK - This defines the default period of validity for a job definition

DefaultStaleAuditEventDuration - This is the period of time for which a received event is deemed to be valid after its stated auditEventTime. If the event is received outside of this valid period then the audit event is considered to have failed.

DefaultBlockedAuditEventDuration - This is the period of time for which a job will wait for an expected event (i.e. one referenced as the previous event ID of an already-received event) to arrive. If the expected event fails to arrive within this period then the job will be failed.

JobStoreLocation - This contains the relative path to the directory where the Job Store Id file will be created and maintained.

JobStoreAgeLimit - This defines how long the job ids will be retained in the job store.

InvariantStoreLoadRate - This defines how frequently the invariant store is checked for changes. The detection of changes will prompt the upload of the new invariants to each running instance of the AEO_SVDC process.

MaxIntraSequenceEventTimeoutPeriod - Under some circumstances it is not possible to be sure that a job has finished. The conditions for job completion may be present but further events might still be possible. This timer determines how long the job should wait for any late arrival events before declaring itself complete.

WaitPeriodForAllJobsCompletedCheck - TODO

WaitPeriodForInvariantDeletion - TODO

TimeoutPeriodForRetreivingStoredInvariants - When loading invariants, a warning will be issued if this time period expires. Jobs depending upon the loaded invariants will fail.

TimeoutPeriodForHangingJob - This determines how long the PV will wait for events in the circumstances where the PV knows the job is incomplete and further events are expected. If this timer times out, then the job is deemed to have failed.

Example:

{
  "SpecUpdateRate": "PT2M",
  "IncomingDirectory": "./incoming",
  "ProcessedDirectory": "./processed",
  "ReceptionDeletionTime": "PT0S",
  "ConcurrentReceptionLimit": "1",
  "SchemaValidate": "true",
  "SchemaValidateFrequency": "1",
  "FileControlWaitTime": "PT1S",
  "MaxOutOfSequenceEvents": "1000",
  "MaximumJobTime": "PT10M",
  "JobCompletePeriod": "PT24H",
  "JobDefinitionDirectory": "config/job_definitions",
  "DefaultJobExpiryDuration": "P99W",
  "DefaultStaleAuditEventDuration": "PT10M",
  "DefaultBlockedAuditEventDuration": "PT55S",
  "JobStoreLocation": "./JobIdStore",
  "JobStoreAgeLimit": "PT1H",
  "InvariantStoreLoadRate": "PT2M",
  "MaxIntraSequenceEventTimeoutPeriod": "PT5S",
  "WaitPeriodForAllJobsCompletedCheck": "P1D",
  "WaitPeriodForJobDeletion": "PT0S",
  "WaitPeriodForInvariantDeletion": "P1D",
  "TimeoutPeriodForRetreivingStoredInvariants": "PT10S",
  "TimeoutPeriodForHangingJob": "PT30S"
}

The selection of timer values can have a significant impact on the performance of the PV. For example, the MaxIntraSequenceEventTimeoutPeriod defines how long jobs will wait for any additional events. If this timer were set to 10 minutes and the PV is running at 500 events/sec then 300000 events will be retained in the PV after the potential completion of their associated jobs. This will utilise memory and decrease performance.

The values shown in the example reflect values that have worked effectively in performance tests.
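
After hand-editing the configuration, a quick syntax check can catch malformed JSON before start up. A minimal sketch, assuming the file lives at deploy/config/pv-config.json as referenced above:

python -m json.tool deploy/config/pv-config.json   # prints the file if valid, or a parse error if not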

Configuring the messaging fabric

There are a variety of possible applications that could provide the messaging fabric. The one chosen for this release is Kafka. However, it is important to note that none of the PV domains have any explicit dependency on Kafka. Changing to a different messaging fabric would have some impact, but it would not be major.

The configuration of Kafka is specified in deploy/docker-compose.kafka.yml

This determines the number of instances of each of the deployments to instantiate at start up. Note: This version of the Protocol Verifier supports static scaling. This yml script also launches the logging service and the kafka and associated zookeeper processes.
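
To review which services will be launched, and with what settings, without actually starting anything, the resolved Compose configuration can be printed. These are standard Docker Compose commands:

docker compose -f docker-compose.kafka.yml config              # print the fully resolved configuration
docker compose -f docker-compose.kafka.yml config --services   # list just the service names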

Building and Running the Protocol Verifier

Build Overview

As mentioned in the section discussing the toolchain, there are three major steps to building and running the projects:

  1. Export MASL

  2. Build with Docker

  3. Launch with Docker

Before getting into the actual build, it is often an instructive process to go through the project structure file by file and explore the purpose of each file in the context of the build. We will use the JobManagement domain for this. Each of the other domains follows a similar pattern. Not every file/directory seen here will exist in each domain project.

For a visual overview of the top-level of the project directory structure click here.

Note
Some files are marked by git as "ignored"; these tend to be generated byproducts of the build that should not be committed to the repository (e.g. build logs, test results). Not every one of these files will be covered in the section below, but it is good to be aware of them.
▾ JobManagement/
  ▸ build/
  ▸ config/
  ▾ gen/
    ▸ code_generation/
      application.mark
      features.mark
      README.adoc
  ▸ masl/
  ▸ models/
  ▸ schedule/
  ▸ schema/
  ▸ test_results/
  ▸ testing/
    .dependencies
    .project
    conanfile.py
    test.env

For a visual overview of the directory structure for the JobManagement domain click here. TODO Update the diagram

build

The build directory contains artefacts produced in the code generation process.

config

The config directory contains plaintext files used by the application itself to configure the domain. The application is passed a config file as a command line argument, which it parses and uses to set up the initial instance population. Not all projects have config folders.

gen/

The gen directory contains files used during the process of code generation and build. features.mark and application.mark contain model compiler "marks". These metadata are associated with particular application model elements and act as directives to the compiler. For example, domain services used exclusively for testing are marked as test_only, and the architecture will exclude them from generation during a production build.

masl/

The masl directory is the output location for exported MASL text. When the project is clean, this directory is empty. The files in this directory are generated and should not be hand edited.

models/

The models directory is where BridgePoint stores xtUML source model files. The files in this directory are managed by BridgePoint and should not be hand edited.

schedule/

The schedule directory contains plaintext files used by the architecture for startup and testing. The MASL C++ platform provides a mechanism to run domain services externally using a schedule file. This mechanism is particularly useful for setting up execution of a particular set of tests; however, it can also be leveraged to determine which services will run at different stages of initialisation.

TODO Should we mention SKIP and PAUSE commands here?

schema/

The schema directory holds the JSON schemas relevant to the domain. Not all domains will have a schema directory.

test_results/

The test_results directory is created by the execution of the unit tests and contains JSON files with the results and details of executed tests. The files should not be hand edited, and this directory may not exist before a test run.

testing/

The testing directory contains test files used in the unit tests.

.dependencies

TODO

.project

TODO This file contains some BridgePoint configuration parameters for the domain.

conanfile.py

TODO

test.env

TODO

Populate the Conan Server

The build process uses Conan and the Conan server local cache needs to be populated with the required artefacts before the build can commence. This step is only required once before building.

TODO When does it need to be re-done. When you change branch?

From the root of the repository, execute the following command:

./bin/populate.sh

This may take some time as it downloads ~500MB of data.
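
To confirm that the cache has been populated, the locally cached packages can be listed. Which command applies depends on the Conan major version installed on your machine:

conan search "*"   # Conan 1.x: list recipes in the local cache
conan list "*"     # Conan 2.x equivalent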

Export MASL

There is a script to help with this that works on macOS. However, it doesn't work on Windows, so use the manual approach described below.

Generate MASL using script

To generate MASL, import the project(s) you desire to build into a BridgePoint workspace and export MASL by clicking the hammer icon or selecting "Build Project" from the context menu.

You can generate MASL for all projects by navigating to the models/ directory and executing the following command:

./gen_all.sh

If <BP install>/tools/mc/bin is in your PATH, the xtuml2masl script will be executed. If not, a docker container will be used to generate the MASL.
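
A quick way to check which path gen_all.sh will take is to see whether xtuml2masl resolves on your PATH. This is a sketch only; <BP install> is the placeholder used above for your BridgePoint installation directory:

command -v xtuml2masl || echo "xtuml2masl not on PATH - a docker container will be used"
export PATH="<BP install>/tools/mc/bin:$PATH"   # optionally add the BridgePoint MASL tools to PATH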

Generate MASL manually

  1. Open up BridgePoint. Assure that you have all of the source projects imported into your workspace.

    See the list of projects.

  2. To export MASL, select each project and click the hammer icon found in the tool ribbon at the top of the screen.

  3. Alternatively you can right click each project and select "Build Project" from the context menu.

  4. If you wish to export MASL for all projects at once, you can click "Build All" from the "Project" menu in the application bar at the top of the application or use the Ctrl-B/Cmd-B keyboard shortcut.

Note
The utils project simply contains common MASL interfaces and need not be built. In fact, it will not even show up in the xtUML Modelling perspective.

Build and Test scripts

A number of shell scripts have been written to build and execute the PV for use in various circumstances. Some of them are illustrated here:

  • build_all.sh

  • regression.sh

  • run_benchmark.sh

build_all.sh

To build all projects, including the deployment, navigate to the models/ directory and execute:

./build_all.sh

All build artefacts including C++ source code can be found in the build/ directory in each project.

regression.sh

This is the principal regression test script. There are several other variants of it which test error conditions, but they all follow the same principle. It can be found in:

~/git/munin/tests   # linux/mac
C:\git\munin\tests  # windows

The script assumes that all of the domains have been exported from BridgePoint. It initially cleans up the deploy directory. It then generates the set of job definition files required for the regression test using plus2json --job to convert puml files into json files that can be consumed by the PV.

It then launches the PV application and uses the same regression job definition files to generate test event streams using plus2json --play.

Finally, it checks the log files to ensure that all jobs have passed.

Variants of regression.sh follow the same basic steps but typically vary the --play options to inject errors. The test of the log files ensures that the test jobs fail as expected.
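
For orientation, the basic sequence described above can be sketched using only commands documented elsewhere in this HOWTO. This is illustrative only: the real scripts differ, and the file names (job.puml, job.json), relative paths and grep pattern are assumptions:

python plus2json.pyz job.puml --job | python -m json.tool > ../deploy/config/job_definitions/job.json
(cd ../deploy && docker compose -f docker-compose.kafka.yml up -d)   # launch the PV in the background
python plus2json.pyz job.puml --play                                 # generate and inject a test event stream
grep -ri "fail" ../deploy/logs                                       # inspect the verifier logs for failures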

run_benchmark.sh

This is the principal performance benchmarking script. It can be found in:

~/git/munin/metrics   # linux/mac
C:\git\munin\metrics  # windows

The script assumes that all of the domains have been exported from BridgePoint. It initially cleans up the deploy directory. It then generates the set of job definition files required for performance benchmarking using plus2json --job to convert puml files into json files that can be consumed by the PV.

It then launches the PV application and uses the same set of job definition files to generate test event streams using plus2json --play. Options on --play allow thousands of jobs to be injected into the PV in a single test. The rate of event generation and whether the events are shuffled or not can be specified by options.

Finally, it analyses the log files, characterises the performance and summarises the passes and failures.

Building individual domains

To build a single project, navigate to the project directory and execute the command:

../../bin/build.sh

If the project had been built previously, this command will perform an incremental build and should not have to recompile all the sources. It is recommended that a full build be performed first and then incremental builds during development. Note that the build command does not cause MASL to be re-exported. That must be done manually as a separate step.

Note
The deployment must be rebuilt if any domain is changed in order for changes in the domain to manifest in the application. It is recommended to always run ./build_all.sh after making changes in a domain but before running the integrated application.

Cleaning

Sometimes artefacts that are no longer required are left in the build and need to be cleaned out. If there are build issues then a clean is recommended. Artefacts generated by the build can be removed by navigating to the project directory and executing the command:

../../bin/clean.sh

All projects can be cleaned by navigating to the models/ directory and executing the command:

./clean_all.sh

Launching The Application

Once the entire system has successfully built, the application can be launched by navigating to the deploy/ directory. To execute a file-based version of the PV execute the following command:

docker compose up

To execute a kafka-based version of the PV on Windows or Linux execute the following command:

docker compose -f docker-compose.kafka.yml up

For MacOS execute the following:

docker compose -f docker-compose.kafka.yml.macos up
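
Once the containers are up, their status and combined output can be inspected from the same directory using standard Docker Compose commands (pass the same -f option that was used with up):

docker compose ps        # list the containers and their current state
docker compose logs -f   # follow the combined log output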

TODO CHECK A graphical rendition of the deployment build process is depicted here.

The munin/deploy Folder

It is useful but not essential to have a good understanding of the structure of the deploy folder. Click to see more details on the deploy folder.

This section describes the purpose of the folders and files within the munin/deploy folder.

TODO Fix the following section

▾ deploy/
  ▾ config/
    ▸ job_definitions/
  ▸ InvariantStore/
  ▸ JobIdStore/
  ▸ logs/
  ▸ reception-processed/
    docker-compose.yml
    docker-compose.kafka.yml
    docker-compose.kafka.yml.macos
    p2JInvariantStore
    .env

config/job_definitions/

This contains files that capture the job specifications that are to be used by the PV. Typically these will have been generated using plus2json.

reception-processed/

This is the directory where AER shall place Audit Event files that it has processed. It is unused if a configuration with a messaging fabric is used.

InvariantStore

Extra Job Invariants have to be persisted beyond the lifetime of the executing processes. This directory contains the repository for persisted invariant values.

logs

This directory contains log files generated by the reception and verifier parts of the PV.

docker-compose.yml

This is the docker compose file that is used to build the version of the PV that achieves inter-process messaging using files.

docker-compose.kafka.yml

This is the docker compose file that is used to build the version of the PV that achieves inter-process messaging using kafka. This default version of the docker compose file supports Linux and Windows builds.

p2jInvariantStore

Just as the PV needs to retain extra job invariants over extended periods of time so does plus2json. A plus2json job which uses an extra job invariant hours or days after it was created needs to be able to locate and reuse the appropriate value. This file is the store of invariants that plus2json uses.

.env

This file defines the version of MASL to be used.

Stopping the Application

  1. Kill the process by pressing Ctrl-C. Clean up the containers by running the following command:

    docker compose down

or

docker compose -f docker-compose.kafka.yml down

Troubleshooting Docker

Docker is a great tool for standardising builds and deployments, however it presents some pitfalls when being used as a local build/development tool.

Docker Compose requires the "down" command to be issued even after all the processes launched by the "up" command have terminated. This is because, though the process inside each container has exited, the container itself still exists and can be restarted. As long as the container exists (whether running or stopped), it will hold onto resources such as shared volumes and internet ports. The "down" command tells Docker Compose to remove all the containers associated with the launch.

If you see the message "port is already allocated", it is likely that you forgot to run the docker compose down command somewhere along the way. When you run this command, make sure it matches the "up" command (e.g. if you run docker compose -f docker-compose.test.yml up to start the application, you should run docker compose -f docker-compose.test.yml down in the same directory to tear it down.)
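
To identify which container is holding a port, Docker can filter running containers by published port. For example, for Kafka's conventional broker port (9092 is an assumption; substitute the port named in the error message):

docker ps --filter "publish=9092"   # show the container publishing port 9092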

If there are only two commands to remember from this section, they are:

docker system prune
docker volume prune

The first command causes Docker to remove all stopped containers, networks, dangling images and build cache. This usually works to give a "fresh" start if you get stuck. The second command removes data volumes (disk storage) created while the system is running. The Kafka message broker creates large storage volumes that need to be cleaned up periodically.

If you are making changes but not observing different behaviour check the following:

  1. Assure you have re-exported MASL (build projects from within BridgePoint)

  2. Run the build again with caching disabled: docker compose build --no-cache

  3. Run the "up" command with the --force-recreate flag: docker compose up --force-recreate (this flag forces existing containers to be replaced with new ones created from the latest image).

Running the latest published version of the protocol verifier

As mentioned above, the application is built and published automatically (to the GitHub Container Registry) each time a new release is drafted. It is possible to use docker to run the latest version of the application without any build at all.

  1. Execute the application by running the following commands from the munin/deploy directory:

docker compose -f docker-compose.prod.yml up

  2. Kill the process by pressing Ctrl-C.

Understand that job definitions need to be supplied, and audit event stream JSON is expected on the Kafka topic: Protocol_Verifier_Reception.
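
One way to publish an event stream onto that topic is the standard Kafka console producer. This is a sketch only: the service name kafka, the broker address localhost:9092, the events.json file and the exact name/location of the console producer tool inside the container all depend on your deployment and are assumptions here:

docker compose -f docker-compose.prod.yml exec -T kafka \
  kafka-console-producer --bootstrap-server localhost:9092 --topic Protocol_Verifier_Reception < events.json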

Building a domain with test

To build a project with tests enabled, navigate to the project directory and execute the command:

../../bin/build.sh test

Once the build is complete, run the tests by executing the command:

../../bin/test.sh

A success or failure message is printed at the end of the test run.

Building with inspector

TODO Do we want to say anything about this even if just for our own benefit?