[Proposal] Automate reference documentation as YAML files #1610

Open
theletterf opened this issue Jul 26, 2023 · 7 comments
Comments

@theletterf
Member

I originally raised this proposal in open-telemetry/opentelemetry-collector-contrib#24189, but after discussing this with @svrnm I think it'd apply to the wider OTel community:

Reference documentation for each component is hard to come by, update, and produce. Settings and metrics are, by far, the user-facing elements that change most often between releases. This imposes significant overhead on anyone trying to keep documentation up to date, both upstream and downstream.

An example of overhead is open-telemetry/opentelemetry-java-instrumentation#4981, where I added settings docs manually after scavenging for settings using grep. Clearly not ideal if we want to keep documentation up to date.

A solution to this update overhead could be to automate the generation of reference docs into structured data files, so that those files can be parsed and rendered into documentation, both upstream and downstream, effectively becoming a reliable single source of truth for OTel settings, metrics, etc. By reference docs I mean the following:

  • Metrics, for example Collector receivers' OOTB metrics
  • Configuration settings
  • Component metadata (maturity, distros, etc.)

I believe the Collector repo is already using mdatagen for some things. I've also seen a spike in YAML usage for docs generation in other repos, like the Registry's.
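
To make the idea concrete, here is a minimal sketch of how a structured reference file could be rendered into docs. Everything in it is an assumption for illustration: the component name, the field names (`stability`, `settings`, `default`, and so on) and the `render_settings_table` helper are hypothetical and do not reflect the mdatagen schema or any existing OTel format. The data is written as a Python dict standing in for a parsed YAML file, so the sketch has no dependencies.

```python
# Hypothetical parsed contents of a component's reference file. In practice this
# would be loaded from YAML; a plain dict keeps the sketch dependency-free.
component = {
    "name": "examplereceiver",
    "stability": "beta",
    "settings": [
        {"name": "endpoint", "type": "string", "default": "localhost:4317",
         "description": "Address to listen on."},
        {"name": "timeout", "type": "duration", "default": "5s",
         "description": "Maximum time to wait for a response."},
    ],
}


def render_settings_table(component):
    """Render a component's settings as a Markdown heading plus table."""
    lines = [
        f"## {component['name']} ({component['stability']})",
        "",
        "| Setting | Type | Default | Description |",
        "| --- | --- | --- | --- |",
    ]
    # One table row per setting, in the order given in the data file.
    for s in component["settings"]:
        lines.append(
            f"| `{s['name']}` | {s['type']} | `{s['default']}` | {s['description']} |"
        )
    return "\n".join(lines)


print(render_settings_table(component))
```

The point is that the renderer only depends on the data shape, so upstream and downstream docs could each use their own template against the same file.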

@svrnm
Member

svrnm commented Jul 26, 2023

@open-telemetry/docs-approvers please take a look as well.

I think what @theletterf is looking for is similar to what we want to accomplish with the registry eventually. We had multiple discussions in this direction in the past, here are a few of the most recent from @punya and @TylerHelmuth:

We also had a general discussion on filling the registry automatically a few months back (open-telemetry/opentelemetry.io#1855). As @austinlparker stated "this has been a recurring idea for almost five years now (it was suggested for the OpenTracing registry, which was the precursor to the OpenTelemetry one)." What we have as of today is a semi-automatic mechanism (https://github.com/open-telemetry/opentelemetry.io/tree/main/scripts/registry-scanner) that we run from time to time to capture new components, but it still requires a lot of manual intervention and it only provides basic details on a component (name, title, some tags, etc.)

I am a big fan of having "a reliable single source of truth for OTel settings, metrics, etc.", which can be sourced by opentelemetry.io or any other consumer (vendor docs, etc.), but it is a really hard task to tackle: there is the technological problem (read this for details) and a people problem, as this would require a few individuals to invest time & brain power into making this happen and then keep up the maintenance. So as a first step I think there should be a few individuals who want to rally around this problem and look into it more deeply.

@theletterf
Member Author

@svrnm I'd definitely love to help, if only from a requirements/testing perspective. Others at Splunk might also want to contribute (@pmcollins @atoulme).

@cartermp
Contributor

The one point of clarification I'd like to add to this is that we shouldn't try to create a system for components that already have good automated reference documentation, namely language SDKs that emit this information as a part of a build/release process.

@theletterf
Member Author

@cartermp Could I see an example of that? The idea is that the automated reference documentation is provided as YAML, not as Markdown. Markdown docs are hard to collect and embed, while YAML can be easily processed for a variety of channels, including downstream.
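
As a rough illustration of "a variety of channels", the sketch below takes one structured record and emits it two ways: as a Markdown list item for a docs site and as JSON for a downstream tool. The metric name, its fields, and the `to_markdown` / `to_json` helpers are hypothetical; the dict stands in for a single entry parsed from a YAML file.

```python
import json

# Hypothetical metric record, standing in for one entry parsed from a YAML file.
metric = {
    "name": "example.requests",
    "unit": "{requests}",
    "type": "sum",
    "description": "Number of requests handled.",
}


def to_markdown(m):
    """Format the record as a Markdown list item for a docs page."""
    return f"- `{m['name']}` ({m['type']}, unit `{m['unit']}`): {m['description']}"


def to_json(m):
    """Serialize the same record as JSON for downstream consumers."""
    return json.dumps(m, sort_keys=True)


print(to_markdown(metric))
print(to_json(metric))
```

Doing the same from hand-written Markdown would mean parsing prose back into fields first, which is the step structured data avoids.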

@cartermp
Contributor

This is one example: https://pkg.go.dev/go.opentelemetry.io/otel/sdk/trace

We intentionally deferred to language SDKs to emit their reference documentation using language-native toolsets, since those often have the best support for things. I wouldn't support trying to force a YAML-based system on them, especially since it would likely imply them having to come up with some manual or pseudo-manual process to keep them up to date rather than relying on the well-tested systems built for their language already.

@theletterf
Member Author

@cartermp That's fine. The scope of what I refer to is more restricted and isn't concerned with developer / manual instrumentation reference docs. Examples of things I'm referring to:

We're currently generating those from the code. We could go ahead and continue producing these downstream, I guess, but I'd much rather have the whole community benefit from data that could be easily turned into docs anywhere.

@trask
Member

trask commented Sep 17, 2024

@theletterf @svrnm do you want to transfer this to the docs repo?

4 participants