Refact: Separation of anomaly detection from drift detection #1133
detlefarend committed Feb 13, 2025
1 parent 151a9bd commit e1730f8
Showing 82 changed files with 1,114 additions and 702 deletions.
9 changes: 6 additions & 3 deletions doc/rtd/content/03_machine_learning/mlpro_oa/main.rst
Original file line number Diff line number Diff line change
@@ -2,9 +2,12 @@
MLPro-OA - Online Adaptivity
============================

This framework addresses topics of online machine learning and is already implemented and ready to be used by early adopters. The
documentation is still in preparation but the API description is already done. Just browse through the menu structure and follow
the links provided...
This framework addresses the challenge of continuously adapting to changing conditions by processing new information in real time. Unlike traditional offline approaches,
which rely on predefined models trained on historical data, online-adaptive methods dynamically refine their behavior as new data becomes available. This enables
continuous learning, rapid adaptation to non-stationary environments, and increased robustness in uncertain or evolving scenarios. Such methods are essential in domains
requiring real-time decision-making and continuous model updates, including autonomous systems, predictive analytics, and self-optimizing processes.
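
As a minimal sketch of the contrast drawn above, the following Python snippet refines a running mean and variance one instance at a time (Welford's algorithm) instead of recomputing them from stored historical data. The class name `RunningStats` and the sample values are hypothetical and are not part of the MLPro API.

```python
import math


class RunningStats:
    """Incrementally maintained mean and variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0          # number of instances seen so far
        self.mean = 0.0     # running mean
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Adapt the statistics to one new instance without storing history."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 until at least one instance has been seen
        return self._m2 / self.n if self.n else 0.0

    @property
    def std(self) -> float:
        return math.sqrt(self.variance)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)  # each new instance refines the model immediately
```

Each call to `update` costs constant time and memory, which is the property that makes this style of computation suitable for unbounded data streams.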

The following subfields are available in MLPro-OA:

.. toctree::
:maxdepth: 2
@@ -0,0 +1,32 @@
.. _target_oa_streams:
Online-adaptive data stream processing (OADSP)
==============================================

This sub-framework of MLPro-OA is directly related to the research topic of online machine learning (OML). It deals with
online-adaptive stream tasks embedded in a stream workflow as part of an extended process model. This process model extends
the non-adaptive DSP process model of sub-framework MLPro-BF-Streams by advanced adaptation mechanisms like

- Event-oriented adaptation
- Cascaded adaptation
- Reverse adaptation

which are explained in more detail below.

The description is still under construction, but parts of the sub-menu, first howtos, and API specifications are already available.
Browse the menu and see Section 'Cross reference' for further details.


**Learn more**

.. toctree::
:maxdepth: 3
:glob:

streams/*


**Cross reference**
- :ref:`Howtos MLPro-OA <target_appendix1_OA>`
- :ref:`API reference: MLPro-OA-Streams <target_api_oa_streams>`
- :ref:`API reference: MLPro-OA-Streams - Pool of objects <target_api_pool_oa_streams>`
- :ref:`Basics of data stream processing in MLPro <target_bf_streams>`

This file was deleted.

This file was deleted.

This file was deleted.

@@ -1,11 +1,11 @@
.. _target_oa_systems:
Online Adaptive Systems
=======================
Online-adaptive state-based systems
===================================

Further descriptions coming soon...

.. toctree::
:maxdepth: 2
:glob:

layer1_oa_systems/*
systems/*
@@ -1,5 +1,5 @@
.. _target_oa_stream_tasks:
Online Adaptive Stream Tasks
Online-adaptive stream tasks
============================

...
@@ -1,5 +1,5 @@
.. _target_oa_stream_workflows:
Online Adaptive Stream Workflows
Online-adaptive stream workflows
================================


@@ -1,5 +1,5 @@
.. _target_oa_boundary_detector:
Boundary Detection
Boundary detection
==================

Further descriptions coming soon...
@@ -1,5 +1,5 @@
.. _target_oa_cluster_analyzer:
Cluster Analysis
Cluster analysis
================

Further descriptions coming soon...
@@ -1,19 +1,19 @@
.. _target_oa_anomaly_detection:
Anomaly Detection
Anomaly detection
=================

Anomaly detection involves identifying instances that are structurally or dimensionally similar to non-anomalous data but deviate significantly from the normal data distribution or pattern. In real-world problems, anomaly detection helps uncover unusual activities in banking and finance, abnormalities in medical test results, uncommon behavior or sensor readings of machines, defective products in manufacturing lines, or malicious activities in network traffic monitoring. Detecting and analyzing these instances or behaviors is crucial for taking immediate action, preventing future occurrences of undesirable events, and ensuring data quality. Moreover, anomaly detection plays a pivotal role in making unbiased and accurate decisions across various domains.

Anomaly detection techniques can be broadly classified into two categories based on their underlying principles and methodologies: statistical anomaly detectors and machine learning anomaly detectors.

**Types of Anomalies**
**Types of anomalies**
There are three main types of anomalies: point anomalies, contextual anomalies, and collective anomalies.

(a) Point Anomalies: Type I anomalies, or point anomalies, are individual data instances that differ significantly from the rest of the dataset. Also known as global anomalies, they do not fit the normal distribution or pattern of the dataset.
(b) Contextual Anomalies: Type II anomalies, or contextual anomalies, are data instances that are anomalous only in a particular context or subset of the dataset. Also known as conditional anomalies, they are not necessarily anomalous with respect to the whole dataset but are anomalous within a specific context or condition.
(c) Group Anomalies: Type III anomalies, also called group or collective anomalies, are data instances that are anomalous when taken together as a group or subset of the dataset, even though they may not be anomalous when considered individually. They occur when the behaviour of, or the relationships within, a group of data instances deviate unexpectedly from the normal distribution of the data.
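
The difference between point and contextual anomalies can be sketched with a simple z-score test applied once to the whole dataset and once per context. This is an illustrative sketch only: the helper `zscore_outliers`, the threshold of 2.0, and the temperature readings are assumptions made for this example and do not come from MLPro. Collective anomalies are not covered here.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing can deviate
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical temperature readings in degrees Celsius, grouped by context
readings = {
    "winter": [1, 2, 0, 3, 2, 1, 26, 2],
    "summer": [25, 27, 26, 28, 24, 26, 27, 80],
}

# Point (global) anomalies: each instance is judged against the whole dataset
all_values = [v for values in readings.values() for v in values]
point_anomalies = zscore_outliers(all_values)

# Contextual anomalies: each instance is judged only against its own context
contextual_anomalies = {
    context: zscore_outliers(values) for context, values in readings.items()
}
```

In this data, a reading of 26 is unremarkable across the whole dataset but is flagged within the winter context, whereas the reading of 80 is flagged in both views.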

**Classification of Anomaly Detectors**
**Classification of anomaly detectors**
Anomaly detection techniques can be broadly classified into two categories based on their underlying principles and methodologies: statistical anomaly detectors and machine learning anomaly detectors.

(a) Statistical Anomaly Detectors: Statistical anomaly detectors use statistical methods to find data points that deviate from the normal distribution. Common algorithms in this category are the Z-score, kernel density estimation, and Gaussian Mixture Models (GMM).
@@ -29,7 +29,7 @@ Anomaly detection techniques can be broadly classified into two categories based
20_anomaly_detection/*


**Cross Reference**
**Cross reference**

- Selected open access papers
- Howtos
@@ -1,5 +1,5 @@
.. _target_oa_cbad:
Cluster-based Anomaly Detection
Cluster-based anomaly detection
===============================

Cluster-based anomaly detection uses clustering algorithms to group the data and flags anomalous behaviours of the resulting clusters as anomalies.
@@ -18,7 +18,8 @@ Can uncover contextual anomalies (data points that are anomalous in one cluster
Can identify global anomalies (points far from all clusters) and local anomalies (outliers within a cluster).
Scalability with various clustering algorithms, such as k-means for simpler scenarios or DBSCAN for non-spherical and dense data distributions.

**New Types of Anomalies**

**New types of anomalies**

Cluster-based methods introduce nuanced anomaly categorizations:

@@ -28,9 +29,10 @@ Scalability with various clustering algorithms, such as k-means for simpler scen

(c) Cluster Structural Anomalies: Unusual clusters themselves, such as unexpected densities, shapes, or sizes, signaling broader irregularities.

**Special Dependencies on Cluster Algorithms**

Cluster-based anomaly detection heavily depends on the choice of clustering algorithm, as it directly impacts the detection process:
**Special dependencies on cluster algorithms**

Cluster-based anomaly detection heavily depends on the choice of clustering algorithm, as it directly impacts the detection process:

k-means: Effective for spherical clusters but may miss anomalies in datasets with non-convex shapes or varying densities.
DBSCAN: Ideal for discovering density-based anomalies but sensitive to hyperparameter tuning (e.g., minPts, ε).
@@ -39,7 +41,8 @@ Gaussian Mixture Models (GMM): Handles soft clustering and detects probabilistic
Spectral Clustering: Good for identifying anomalies in non-linear manifolds but computationally intensive for large datasets.
Effective anomaly detection requires understanding the clustering algorithm’s limitations and ensuring it aligns with the data characteristics and problem context.
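
To make this dependency concrete, the sketch below flags global anomalies as points that lie far from every cluster centroid. It is a simplified illustration, not MLPro's implementation: the function names, the fixed `max_distance` threshold, and the sample centroids and points are hypothetical, and the centroids are assumed to come from a prior clustering run such as k-means.

```python
import math


def nearest_centroid(point, centroids):
    """Return (index, distance) of the centroid closest to the point."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]


def global_anomalies(points, centroids, max_distance):
    """Flag points that lie farther than max_distance from every centroid."""
    return [p for p in points if nearest_centroid(p, centroids)[1] > max_distance]


# Hypothetical centroids, assumed to be the result of a clustering algorithm
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(0.5, -0.2), (9.6, 10.3), (5.0, 5.0), (0.1, 0.4), (20.0, -3.0)]

flagged = global_anomalies(points, centroids, max_distance=2.0)
```

A single Euclidean distance threshold implicitly assumes compact, roughly spherical clusters, which matches the k-means case described above. For non-convex or density-based clusters the same rule could misclassify points, so the anomaly criterion has to be chosen together with the clustering algorithm.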

**Cross Reference**

**Cross reference**

- Howtos
- API
@@ -1,5 +1,5 @@
.. _target_oa_anomaly_prediction:
Anomaly Prediction
Anomaly prediction
==================

Further descriptions coming soon...
@@ -11,10 +11,10 @@ Further descriptions coming soon...
:maxdepth: 2
:glob:

20_anomaly_prediction/*
21_anomaly_prediction/*


**Cross Reference**
**Cross reference**

- Selected open access papers
- Howtos
@@ -0,0 +1,21 @@
.. _target_oa_drift_detection:
Drift detection
===============

Further descriptions coming soon...


**Learn more**

.. toctree::
:maxdepth: 2
:glob:

30_drift_detection/*


**Cross reference**

- Selected open access papers
- Howtos
- API
@@ -0,0 +1,21 @@
.. _target_oa_cbdd:
Cluster-based drift detection
=============================

Further descriptions coming soon...


**New types of drift**

...


**Special dependencies on cluster algorithms**

...


**Cross reference**

- Howtos
- API
Empty file.
@@ -1,8 +1,8 @@
Online Adaptive Stream Processing
=================================
Online-adaptive data stream processing
======================================

.. toctree::
:maxdepth: 1
:glob:

oa_dsp/*
streams/*

This file was deleted.

@@ -0,0 +1,26 @@
.. _Howto_OA_PP_121:
Howto OA-PP-121: Complex preprocessing with parallel tasks
==========================================================


**Executable code**

.. literalinclude:: ../../../../../../../../test/howtos/oa/streams/hybrid/howto_oa_streams_pp_121_complex_preprocessing.py
:language: python



**Results**

After starting the Howto, the Workflow and Tasks windows appear. These can now be arranged on
the screen before the actual processing is started with a keystroke...

.. image::
images/howto_oa_pp_121.gif
:width: 700px



**Cross reference**

- :ref:`API Reference - Online-adaptive data stream processing <target_api_oa_streams>`
@@ -1,6 +1,6 @@
.. _target_api_oa_streams:
OA Stream Processing
====================
Online-adaptive data stream processing
======================================

.. image:: streams/images/MLPro-OA-Stream-Processing_class_diagram.drawio.png
:scale: 50%
@@ -1,6 +1,6 @@
.. _target_api_pool_oa_streams:
OA Stream Processing
====================
Online-adaptive data stream processing
======================================

.. toctree::
:maxdepth: 2
@@ -1,6 +1,6 @@
.. _target_api_oa_streams_tasks:
OA Stream Tasks
===============
Online-adaptive stream tasks
============================

.. toctree::
:maxdepth: 2
@@ -1,12 +1,12 @@
.. _target_api_oa_stream_tasks_prepro:
OA Data Preprocessing
=====================
Preprocessing
=============

.. image:: images/MLPro-OA-Preprocessing-Tasks_class_diagram.drawio.png
:scale: 50%


Boundary Detector
Boundary detector
-----------------

.. automodule:: mlpro.oa.streams.tasks.boundarydetectors
@@ -1,12 +1,12 @@
.. _target_api_oa_stream_tasks_clu:
Cluster Analysis
Cluster analyzer
================

.. image:: images/MLPro-OA-Cluster_Analyzers_class_diagram.drawio.png
:scale: 50%


Template for Cluster Algorithms
Template for cluster algorithms
-------------------------------

.. automodule:: mlpro.oa.streams.tasks.clusteranalyzers.basics
@@ -1,5 +1,5 @@
.. _target_api_oa_stream_tasks_ad:
Anomaly Detectors
Anomaly detectors
=================

.. image:: anomaly_detectors/images/MLPro-OA-Anomaly-Detectors_class_diagram.drawio.png