[DOC] Anomaly detection module: Add informational tags back to estimator docs #2490

inclinedadarsh · 2025-01-12T13:32:33Z

Context

The anomaly detection estimators documented their input data format, output data format, and learning type in a manually maintained capabilities table in the estimator docs.
In #2468, we discussed removing the manually maintained capabilities table in the anomaly detection module in favor of the capabilities table that is generated automatically from the estimators' tags (#2468 (comment)). As a result, the information about the output data format and learning type is now missing from the docs. We should add it back.

Please check out #1430 and #2468 for more information.

Prerequisites

#1430 & #2468

Details

The following information is now missing in the docs:

CBLOF
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
COPOD

For COPOD, the documentation itself has no table. The codebase does seem to contain one, but it does not appear on the website for some reason, so the information below was taken directly from the codebase.
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
DWT_MLEAD
- Output data format: anomaly scores
- Learning type: unsupervised
IsolationForest
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
KMeansAD
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
LOF
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
LeftSTAMPi
- Output data format: anomaly scores
- Learning type: unsupervised
MERLIN
- Output data format: binary classification
- Learning type: unsupervised
OneClassSVM
- Output data format: anomaly scores
- Learning type: semi-supervised
PyODAdapter
- Output data format: anomaly scores
- Learning type: unsupervised or semi-supervised
STOMP
- Output data format: anomaly scores
- Learning type: unsupervised
STRAY
- Output data format: binary classification
- Learning type: unsupervised
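
A minimal sketch of how the "Learning type" wording in the list above could be translated into boolean tags, assuming the tag names proposed later in this thread. The helper `learning_type_to_tags` and the constant `LEARNING_TYPES` are hypothetical names used for illustration only; they are not part of the library.

```python
# Hypothetical helper, not part of the library: maps the "Learning type"
# wording used in the list above to the boolean tags proposed in this thread.
LEARNING_TYPES = ("unsupervised", "semi-supervised", "supervised")


def learning_type_to_tags(learning_type: str) -> dict[str, bool]:
    """Convert e.g. 'unsupervised or semi-supervised' into capability tags."""
    named = {part.strip() for part in learning_type.split(" or ")}
    unknown = named - set(LEARNING_TYPES)
    if unknown:
        raise ValueError(f"Unknown learning type(s): {sorted(unknown)}")
    # One boolean tag per known learning type.
    return {f"capability:{name}": name in named for name in LEARNING_TYPES}


# Values copied from the CBLOF entry in the list above.
cblof_tags = learning_type_to_tags("unsupervised or semi-supervised")
cblof_tags["output_format"] = "anomaly scores"
```

Splitting the learning type into one boolean per type keeps each tag independently queryable, at the cost of three keys instead of one.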

inclinedadarsh · 2025-01-12T13:35:19Z

I'd love to work on this too. Please let me know how I can move forward with it.

SebastianSchmidl · 2025-01-12T15:11:52Z

We have not decided on a good way to do this yet. I see the following options:

Add the information manually to the docs. ❌ I think that's a bad idea.
Remove it. ❌ Also a non-option for me!
Add the information to the estimator tags. ✅ I think that's the better approach because it re-uses the new documentation utilities and would also easily transfer to other modules. However, we do not have a standard way to do this currently.

We need to agree on a good way to represent the information in the current tags-infrastructure. How do other modules handle this?

I already described my proposal in #2468 (comment):

_tags = {
    [...]
    "capability:unsupervised": True,
    "capability:semi-supervised": False,
    "capability:supervised": False,
    "output_format": "anomaly scores",
}

@baraline, @MatthewMiddlehurst what do you think? Does this conflict with other modules? Should we add namespaces (e.g., "ad:output_format") to module-specific tags?
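
To make the namespacing question concrete, here is a rough sketch under stated assumptions. `ExampleDetector`, `namespaced`, and `tags_to_rows` are hypothetical names invented for illustration, and the sketch does not use the library's actual tag or documentation utilities. It only shows what prefixing module-specific keys with "ad:" would look like and how the resulting tags could be flattened into table rows.

```python
class ExampleDetector:
    """Hypothetical stand-in for an anomaly detection estimator."""

    _tags = {
        "capability:unsupervised": True,
        "capability:semi-supervised": False,
        "capability:supervised": False,
        "output_format": "anomaly scores",
    }


def namespaced(tags: dict, prefix: str = "ad") -> dict:
    """Prefix module-specific keys, leaving 'capability:*' keys untouched."""
    return {
        key if key.startswith("capability:") else f"{prefix}:{key}": value
        for key, value in tags.items()
    }


def tags_to_rows(tags: dict) -> list[tuple[str, str]]:
    """Turn tags into (name, value) rows that a docs table could display."""
    return [(key, str(value)) for key, value in sorted(tags.items())]


rows = tags_to_rows(namespaced(ExampleDetector._tags))
```

With a prefix, a tag such as "ad:output_format" cannot collide with an identically named tag in another module, while shared "capability:*" tags stay comparable across modules.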

MatthewMiddlehurst · 2025-01-12T20:57:26Z

As mentioned in the PR, I think tags are a fine solution to this. The discussion on my end is more about how we should name the tags, but I do not think that needs to be settled for these to be implemented. The content of the tags above seems fine to me, though it may be worth discussing whether "Learning type" should also be turned into three distinct tags.

inclinedadarsh · 2025-01-19T17:05:45Z

@SebastianSchmidl if this is accepted, then should I continue adding this information back?

SebastianSchmidl · 2025-01-19T20:43:29Z

I will take this to our next dev meeting to discuss a naming scheme with the other module owners. Once we have agreed on one, I'll come back to you.

inclinedadarsh · 2025-01-20T03:13:48Z

Sure, thanks for the response!

inclinedadarsh mentioned this issue Jan 12, 2025

[ENH] Add sphinx event to add capability table to estimators' docs individually #2468

Merged

2 tasks

SebastianSchmidl changed the title ~~[DOC] Add output format and learning type information in classes in the anomaly detection module~~ [DOC] Anomaly detection module: Add informational tags back to estimator docs Jan 12, 2025

SebastianSchmidl added the labels documentation (Improvements or additions to documentation), maintenance (Continuous integration, unit testing & package distribution), and anomaly detection (Anomaly detection package) Jan 12, 2025
