diff --git a/doc/monitoring/api_reference.rst b/doc/monitoring/api_reference.rst index 2f1e57b4..42a65c10 100644 --- a/doc/monitoring/api_reference.rst +++ b/doc/monitoring/api_reference.rst @@ -24,6 +24,15 @@ A collector represents one or more observations that change over time. counter ~~~~~~~ +A counter is a cumulative metric that denotes a single monotonically increasing counter. Its value can only +increase or be reset to zero on restart. For example, you can use a counter to represent the number of requests +served, tasks completed, or errors. + +Don't use a counter to expose a value that can decrease. For example, don't use a counter to track the number of +currently running processes. Use the :ref:`gauge ` type instead. + +The design is based on the `Prometheus counter `__. + .. function:: counter(name [, help, metainfo]) Register a new counter. @@ -86,6 +95,13 @@ counter gauge ~~~~~ +A gauge is a metric that denotes a single numerical value that can arbitrarily increase and decrease. + +The gauge type is typically used for measured values like temperature or current memory usage. It can +also be used for counts that go up and down, for example, the number of concurrent requests. + +The design is based on the `Prometheus gauge `__. + .. function:: gauge(name [, help, metainfo]) Register a new gauge. @@ -129,6 +145,99 @@ gauge histogram ~~~~~~~~~ +A histogram metric is used to collect and analyze +statistical data about the distribution of values within the application. +Unlike metrics that only track the average value or the number of events, a histogram lets you +see the distribution of values in detail and uncover hidden dependencies. + +Suppose there are many individual measurements that you can't or don't want to store, and the +aggregated information (the distribution of values across ranges) is enough to reveal the pattern. +In this case, use a histogram. 
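As a rough illustration of this kind of aggregation, the following plain-Lua sketch reduces a series of measurements to a total count, a sum, and cumulative bucket counters. It does not use the ``metrics`` module: the ``aggregate`` function, the sample values, and the bucket bounds are assumptions chosen for illustration only.

```lua
-- Illustrative sample values and bucket upper bounds (assumptions)
local observations = {8, 7, 6, 8, 1, 7, 4, 8}
local upper_bounds = {2, 4, 6, math.huge}

-- Hypothetical helper: aggregate raw values the way a cumulative histogram does
local function aggregate(values, bounds)
    local result = {count = 0, sum = 0, buckets = {}}
    for i = 1, #bounds do
        result.buckets[i] = 0
    end
    for _, value in ipairs(values) do
        result.count = result.count + 1
        result.sum = result.sum + value
        for i, bound in ipairs(bounds) do
            -- Buckets are cumulative: a value is counted in every bucket
            -- whose upper bound is greater than or equal to the value
            if value <= bound then
                result.buckets[i] = result.buckets[i] + 1
            end
        end
    end
    return result
end

local histogram = aggregate(observations, upper_bounds)
for i, bound in ipairs(upper_bounds) do
    print(('le=%s -> %d'):format(tostring(bound), histogram.buckets[i]))
end
print('count = ' .. histogram.count, 'sum = ' .. histogram.sum)
```

Only the aggregated counters are kept after each observation, so memory use does not grow with the number of measurements.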
+ +Each histogram provides several measurements: + +- total count (``_count``) +- sum of measured values (``_sum``) +- distribution across buckets (``_bucket``) + +Consider the following problem: you want to know how often the observed value is in the specific range (bucket). + +.. image:: images/histogram-buckets.png + :align: center + +For example, the observed values are 8, 7, 6, 8, 1, 7, 4, 8. +Then, in the ranges: + +* In the interval [0, 2], there is 1 measurement. +* In the interval [0, 4], there are 2 measurements. +* In the interval [0, 6], there are 3 measurements. +* In the interval [0, infinity], there are 8 measurements (equal to the ``histogram_demo_count`` value). + +.. code-block:: json + + { + "label_pairs": { + "le": 2, + "alias": "my-tnt-app" + }, + "timestamp": 1680174378390303, + "metric_name": "histogram_demo_bucket", + "value": 1 + }, + { + "label_pairs": { + "le": 4, + "alias": "my-tnt-app" + }, + "timestamp": 1680174378390303, + "metric_name": "histogram_demo_bucket", + "value": 2 + }, + { + "label_pairs": { + "le": 6, + "alias": "my-tnt-app" + }, + "timestamp": 1680174378390303, + "metric_name": "histogram_demo_bucket", + "value": 3 + }, + { + "label_pairs": { + "le": "inf", + "alias": "my-tnt-app" + }, + "timestamp": 1680174378390303, + "metric_name": "histogram_demo_bucket", + "value": 8 + }, + +.. image:: images/histogram.png + :align: center + +The metric also displays the count of measurements and their sum: + +.. code-block:: json + + { + "label_pairs": { + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "histogram_demo_count", + "value": 8 + }, + { + "label_pairs": { + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "histogram_demo_sum", + "value": 49 + }, + +The design is based on the `Prometheus histogram `__. + .. function:: histogram(name [, help, buckets, metainfo]) Register a new histogram. 
@@ -187,6 +296,119 @@ histogram summary ~~~~~~~ +A summary metric is used to collect statistical data +about the distribution of values within the application. + +Each summary provides several measurements: + +* total count of measurements +* sum of measured values +* values at specific quantiles + +Like a histogram, a summary also works with value ranges. However, unlike histograms, +it uses quantiles (defined by a number between 0 and 1) for this purpose. In this case, +you don't need to define fixed boundaries. For the summary type, the ranges depend +on the measured values and the number of measurements. + +Suppose the example series of measurements is sorted in ascending order: +1, 4, 6, 7, 7, 8, 8, 8. + +Quantiles: + +* Quantile 0 is the value of the first, minimum element. In this example, it's 1. +* Quantile 1 is the value of the last, maximum element. In this example, it's 8. +* Quantile 0.5 is the value of the median element. In this example, it's 7. This means that the smaller + half of the measurements lies in the range from 1 to 7, and the larger half lies in the range from 7 to 8. + +Note that calculating quantiles consumes resources, so it makes sense to calculate no +more than one, for example, 0.95, which covers the majority of measurements. + +With a large number of measurements per second, a significant amount of memory is required to +store them all. The array is compressed to reduce memory consumption. The degree of compression is determined by +an acceptable error rate. In practice, error rates mostly range from 1% to 10%. For the example above, this means that a +quantile of 0.50 with a 10% error returns a value in the range of 6.65...7.35 instead of 7. + +Additionally, a summary metric doesn't store values for the application's whole lifetime. This metric +uses a sliding window divided into sections (buckets) where measurements are stored. + +.. 
image:: images/summary-buckets.png + :align: center + +Note that buckets in histograms and buckets in quantiles within summaries have different meanings. + +.. code-block:: lua + + local summary_demo = metrics.summary( + 'summary_demo', -- metric name + 'Summary demo', -- description + { + [0.5] = 0.01, -- quantile 0.50 with 1% error + [0.95] = 0.01, -- quantile 0.95 with 1% error + [0.99] = 0.01, -- quantile 0.99 with 1% error + }, + { + max_age_time = 60, -- duration of each bucket in seconds + age_buckets_count = 5 -- total number of buckets in the sliding window + -- window duration = max_age_time * age_buckets_count seconds, or in + -- this case = 5 minutes + } + ) + +The metric like in the example above returns the following measurements for the specified quantiles: + +.. code-block:: json + + { + "label_pairs": { + "quantile": 0.5, + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "summary_demo", + "value": 7 + }, + { + "label_pairs": { + "quantile": 0.95, + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "summary_demo", + "value": 8 + }, + { + "label_pairs": { + "quantile": 0.99, + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "summary_demo", + "value": 8 + }, + +Also, the metric exposes the count of measurements and the sum of observations: + +.. code-block:: json + + { + "label_pairs": { + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "summary_demo_count", + "value": 8 + }, + { + "label_pairs": { + "alias": "my-tnt-app" + }, + "timestamp": 1680180929162484, + "metric_name": "summary_demo_sum", + "value": 49 + }, + +The design is based on the `Prometheus summary `__. + .. function:: summary(name [, help, objectives, params, metainfo]) Register a new summary. 
Quantile computation is based on the diff --git a/doc/monitoring/getting_started.rst b/doc/monitoring/getting_started.rst index 96be8f4a..029c8e31 100644 --- a/doc/monitoring/getting_started.rst +++ b/doc/monitoring/getting_started.rst @@ -1,282 +1,307 @@ .. _monitoring-getting_started: -Monitoring: getting started -=========================== +Getting started with monitoring +=============================== -.. _monitoring-getting_started-tarantool: +If you use Tarantool version below `2.11.1 `__, +it is necessary to install the latest version of ``metrics`` first. For details, +see :ref:`Installing the metrics module `. -Tarantool ---------- +.. _monitoring-getting_started-usage: -First, install the ``metrics`` package: +Using the metrics module +------------------------ -.. code-block:: console +.. note:: - $ cd ${PROJECT_ROOT} - $ tarantoolctl rocks install metrics + The module is also used in applications based on the Cartridge framework. For details, + see the :ref:`Getting started with Cartridge application ` section. -Next, require it in your code: +#. First, set the instance name and start to collect the standard set of metrics: -.. code-block:: lua + .. code-block:: lua - local metrics = require('metrics') + metrics.cfg{labels = {alias = 'my-instance'}} -Enable default Tarantool metrics such as network, memory, operations, etc. -You may also set a global label for your metrics: + When using a metrics module version below **0.17.0**, use the following snippet instead of ``metrics.cfg(...)``: -.. code-block:: lua + .. code-block:: lua + + metrics.set_global_labels({alias = 'my-instance'}) + metrics.enable_default_metrics() - metrics.cfg{alias = 'alias'} +#. Then, add a handler to expose metric values. -Initialize the Prometheus exporter or export metrics in another format: + For JSON format: -.. code-block:: lua + .. 
code-block:: lua + + local json_exporter = require('metrics.plugins.json') + local function http_metrics_handler(request) + return request:render({ text = json_exporter.export() }) + end + + For Prometheus format: + + .. code-block:: lua - local httpd = require('http.server') - local http_handler = require('metrics.plugins.prometheus').collect_http + local prometheus_exporter = require('metrics.plugins.prometheus').collect_http + To learn how to extend metrics with custom data, check the :ref:`API reference `. - httpd.new('0.0.0.0', 8088) - :route({path = '/metrics'}, function(...) - return http_handler(...) - end) - :start() +#. Start the HTTP server and expose metrics: + .. code-block:: lua + + local http_server = require('http.server') + local server = http_server.new('0.0.0.0', 8081) + server:route({path = '/metrics'}, http_metrics_handler) + server:start() + +The metric values are now available via the ``http://localhost:8081/metrics`` URL: + +.. code-block:: json + + [ + { + "label_pairs": { + "alias": "my-instance" + }, + "timestamp": 1679663602823779, + "metric_name": "tnt_vinyl_disk_index_size", + "value": 0 + }, + { + "label_pairs": { + "alias": "my-instance" + }, + "timestamp": 1679663602823779, + "metric_name": "tnt_info_memory_data", + "value": 39272 + }, + { + "label_pairs": { + "alias": "my-instance" + }, + "timestamp": 1679663602823779, + "metric_name": "tnt_election_vote", + "value": 0 + } + ] + +The data can be visualized in +`Grafana dashboard `__. + +The full source example is listed below: + +.. 
code-block:: lua + + -- Import modules + local metrics = require('metrics') + local http_server = require('http.server') + local json_exporter = require('metrics.plugins.json') + + -- Define helper functions + local function http_metrics_handler(request) + return request:render({ text = json_exporter.export() }) + end + + -- Start the database box.cfg{ - listen = 3302 + listen = 3301, } -Now you can use the HTTP API endpoint ``/metrics`` to collect your metrics -in the Prometheus format. To learn how to obtain custom metrics, check the -:ref:`API reference `. + -- Configure the metrics module + metrics.cfg{labels = {alias = 'my-tnt-app'}} + + -- Run the web server + local server = http_server.new('0.0.0.0', 8081) + server:route({path = '/metrics'}, http_metrics_handler) + server:start() .. _monitoring-getting_started-http_metrics: -Collect HTTP metrics --------------------- +Collecting HTTP metrics +----------------------- -To enable the collection of HTTP metrics, you need to create a collector first. +To enable the collection of HTTP metrics, wrap a handler with a ``metrics.http_middleware.v1`` function: .. code-block:: lua + local metrics = require('metrics') local httpd = require('http.server').new(ip, port) -- Create a summary collector for latency - local collector = metrics.http_middleware.build_default_collector('summary') + metrics.http_middleware.configure_default_collector('summary') -- Set a route handler for latency summary collection - httpd:route({ path = '/path-1', method = 'POST' }, metrics.http_middleware.v1(handler_1, collector)) - httpd:route({ path = '/path-2', method = 'GET' }, metrics.http_middleware.v1(handler_2, collector)) + httpd:route({ path = '/path-1', method = 'POST' }, metrics.http_middleware.v1(handler_1)) + httpd:route({ path = '/path-2', method = 'GET' }, metrics.http_middleware.v1(handler_2)) -- Start HTTP routing httpd:start() +.. 
note:: + + By default, the ``http_middleware`` uses the :ref:`histogram ` collector + for backward compatibility reasons. + To collect HTTP metrics, use the :ref:`summary ` type instead. + You can collect all HTTP metrics with a single collector. -If you're using the default +If you use the default :ref:`Grafana dashboard `, don't change the default collector name. Otherwise, your metrics won't appear on the charts. +.. _monitoring-getting_started-custom_metric: -.. _monitoring-getting_started-instance_health_check: +Creating custom metric +---------------------- -Instance health check ---------------------- +You can create your own metric in two ways, depending on when you need to take measurements: -In production environments, Tarantool Cartridge usually has a large number of so-called -routers -- Tarantool instances that handle input load. -Various load balancers help distribute that load evenly. -However, any load balancer has to know -which routers are ready to accept the load at the moment. -The Tarantool metrics library has a special plugin that creates an HTTP handler, -which the load balancer can use to check the current state of any Tarantool instance. -If the instance is ready to accept the load, it will return a response with a 200 status code, -and if not, with a 500 status code. +* at any arbitrary moment of time +* when the data collected by metrics is requested -.. _monitoring-getting_started-cartridge_role: +To create custom metrics at any arbitrary moment of time, do the following: -Cartridge role --------------- +#. Create the collector: -``cartridge.roles.metrics`` is a -`Tarantool Cartridge `__ role. -It allows using default metrics in a Cartridge application and managing them -via Cartridge configuration. + .. code-block:: lua -**Usage** + local response_counter = metrics.counter('response_counter', 'Response counter') -#. Add ``cartridge-metrics-role`` package to the dependencies in the ``.rockspec`` file. +#. 
Take a measurement at the appropriate place, for example, in an API request handler: - .. code-block:: lua + .. code-block:: lua - dependencies = { - ... - 'cartridge-metrics-role >= 0.1.0-1', - ... - } + local function check_handler(request) + local label_pairs = { + path = request.path, + method = request.method, + } + response_counter:inc(1, label_pairs) + -- ... + end - If you're using older version of metrics package, you need to add ``metrics`` package - instead of ``cartridge-metrics-role``. +To create custom metrics when the data collected by metrics is requested, do the following: - .. code-block:: lua +#. Create the collector: - dependencies = { - ... - 'metrics == 0.17.0-1', - ... - } + .. code-block:: lua - Cartridge role is present in package versions from **0.3.0** to **0.17.0**. + local other_custom_metric = metrics.gauge('other_custom_metric', 'Other custom metric') -#. Make sure that ``cartridge.roles.metrics`` is included - in the roles list in ``cartridge.cfg`` - in your entry point file (for example, ``init.lua``): +#. Take a measurement at the time of requesting the data collected by metrics: - .. code-block:: lua + .. code-block:: lua - local ok, err = cartridge.cfg({ - ... - roles = { - ... - 'cartridge.roles.metrics', - ... - }, - }) + metrics.register_callback(function() + -- ... + local label_pairs = { + category = category, + } + other_custom_metric:set(current_value, label_pairs) + end) -#. To get metrics via API endpoints, use ``set_export``. +The full example is listed below: - .. note:: +.. code-block:: lua - ``set_export`` has lower priority than clusterwide configuration - and may be overridden by the metrics configuration. + -- Import modules + local metrics = require('metrics') + local http_server = require('http.server') + local json_exporter = require('metrics.plugins.json') - .. 
code-block:: lua + local response_counter = metrics.counter('response_counter', 'Response counter') - local metrics = require('cartridge.roles.metrics') - metrics.set_export({ - { - path = '/path_for_json_metrics', - format = 'json' - }, - { - path = '/path_for_prometheus_metrics', - format = 'prometheus' - }, - { - path = '/health', - format = 'health' - } - }) + -- Define helper functions + local function http_metrics_handler(request) + return request:render({ text = json_exporter.export() }) + end - You can add several endpoints of the same format with different paths. - For example: + local function check_handler(request) + local label_pairs = { + path = request.path, + method = request.method, + } + response_counter:inc(1, label_pairs) + return request:render({ text = 'ok' }) + end - .. code-block:: lua + -- Start the database + box.cfg{ + listen = 3301, + } - metrics.set_export({ - { - path = '/path_for_json_metrics', - format = 'json' - }, - { - path = '/another_path_for_json_metrics', - format = 'json' - }, - }) + -- Configure the metrics module + metrics.set_global_labels{alias = 'my-tnt-app'} - The metrics will be available on the path specified in ``path``, in the format - specified in ``format``. + -- Run the web server + local server = http_server.new('0.0.0.0', 8081) + server:route({path = '/metrics'}, http_metrics_handler) + server:route({path = '/check'}, check_handler) + server:start() -#. Since version **0.6.0**, the metrics role is permanent and enabled on instances by default. - If you use old version of metrics, you should enable the role in the interface: +The result looks in the following way: - .. image:: images/role-enable.png - :align: center +.. code-block:: json -#. After the role has been initialized, the default metrics will be enabled - and the global label ``alias`` will be set. - **Note** that the ``alias`` label value is set by the ``alias`` or ``instance_name`` - instance :ref:`configuration option ` (since **0.6.1**). 
+ [ + { + "label_pairs": { + "path": "/check", + "method": "GET", + "alias": "my-tnt-app" + }, + "timestamp": 1688385933874080, + "metric_name": "response_counter", + "value": 1 + } + ] - You can use the functionality of any - metrics package by getting it as a Cartridge service - and calling it with ``require`` like a regular package: +.. _monitoring-getting_started-warning: - .. code-block:: lua +Possible limitations +~~~~~~~~~~~~~~~~~~~~ - local cartridge = require('cartridge') - local metrics = cartridge.service_get('metrics') +The module allows to add your own metrics, but there are some subtleties when working with specific tools. -#. Since Tarantool Cartridge ``2.4.0``, you can set a zone for each - instance in the cluster. When a zone is set, all the metrics on the instance - receive the ``zone`` label. +When adding your custom metric, it's important to ensure that the number of label value combinations is +kept to a minimum. Otherwise, combinatorial explosion may happen in the timeseries database with metrics values +stored. Examples of data labels: -#. To change the HTTP path for a metric in **runtime**, - you can use the configuration below. - `Learn more about Cartridge configuration `_). - It is not recommended to set up the metrics role in this way. Use ``set_export`` instead. +* `Labels `__ in Prometheus +* `Tags `__ in InfluxDB - .. code-block:: yaml +For example, if your company uses InfluxDB for metric collection, you could potentially disrupt the entire +monitoring setup, both for your application and for all other systems within the company. As a result, +monitoring data is likely to be lost. - metrics: - export: - - path: '/path_for_json_metrics' - format: 'json' - - path: '/path_for_prometheus_metrics' - format: 'prometheus' - - path: '/health' - format: 'health' +Example: - .. image:: images/role-config.png - :align: center +.. code-block:: lua -#. 
You can set custom global labels with the following configuration: + local some_metric = metrics.counter('some', 'Some metric') - .. code-block:: yaml + -- THIS IS POSSIBLE + local function on_value_update(instance_alias) + some_metric:inc(1, { alias = instance_alias }) + end - metrics: - export: - - path: '/metrics' - format: 'json' - global-labels: - my-custom-label: label-value + -- THIS IS NOT ALLOWED + local function on_value_update(customer_id) + some_metric:inc(1, { customer_id = customer_id }) + end - Another option is to invoke the ``set_default_labels`` function in ``init.lua``: +In the example, there are two versions of the function ``on_value_update``. The top version labels +the data with the cluster instance's alias. Since there's a relatively small number of nodes, using +them as labels is feasible. In the second case, an identifier of a record is used. If there are many +records, it's recommended to avoid such situations. - .. code-block:: lua +The same principle applies to URLs. Using the entire URL with parameters is not recommended. +Use a URL template or the name of the command instead. - local metrics = require('cartridge.roles.metrics') - metrics.set_default_labels({ ['my-custom-label'] = 'label-value' }) - -#. You can use the configuration below to choose the default metrics to be exported. - If you add the include section, only the metrics from this section will be exported: - - .. code-block:: yaml - - metrics: - export: - - path: '/metrics' - format: 'json' - # export only vinyl, luajit and memory metrics: - include: - - vinyl - - luajit - - memory - - If you add the exclude section, - the metrics from this section will be removed from the default metrics list: - - .. code-block:: yaml - - metrics: - export: - - path: '/metrics' - format: 'json' - # export all metrics except vinyl, luajit and memory: - exclude: - - vinyl - - luajit - - memory - - For the full list of default metrics, check the - :ref:`API reference `. 
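One way to follow the URL advice above is to reduce each request path to a template before using it as a label value. The sketch below is plain Lua and does not depend on the ``metrics`` module. The ``path_template`` function and the ``:id`` placeholder are hypothetical names chosen for illustration, and the rule "purely numeric segments are identifiers" is an assumption that may not fit every API.

```lua
-- Hypothetical helper: collapse the variable parts of a URL into a template
-- so that the number of distinct label values stays small
local function path_template(url)
    -- Drop the query string and the fragment, if any
    local path = url:match('^[^?#]*')
    local segments = {}
    for segment in path:gmatch('[^/]+') do
        -- Assumption: a purely numeric segment is a record identifier
        if segment:match('^%d+$') then
            segment = ':id'
        end
        table.insert(segments, segment)
    end
    return '/' .. table.concat(segments, '/')
end

print(path_template('/orders/12345?expand=items')) -- prints /orders/:id
```

The result could then be passed as the ``path`` label, for example in the ``check_handler`` function shown earlier, instead of the raw ``request.path`` value.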
+In essence, when designing custom metrics and selecting labels or tags, it's crucial to opt for a minimal +set of values that can uniquely identify the data without introducing unnecessary complexity or potential +conflicts with existing metrics and systems. diff --git a/doc/monitoring/getting_started_cartridge.rst b/doc/monitoring/getting_started_cartridge.rst new file mode 100644 index 00000000..557709b5 --- /dev/null +++ b/doc/monitoring/getting_started_cartridge.rst @@ -0,0 +1,272 @@ +.. _metrics-getting_started_cartridge: + +Getting started with Cartridge application +========================================== + +The ``metrics`` module is tightly integrated with +the `Cartridge `__ framework. +To enable this integration, a permanent role called ``metrics`` has been introduced. +To enable this role, follow the steps below. + +.. _getting_started_cartridge-setup: + +Module setup +------------ + +First, install the latest version of ``metrics``. For details, +:ref:`check the installation guide `. + +Also, you need to install the separate ``cartridge-metrics-role`` rock. To do this: + +#. Add the ``cartridge-metrics-role`` package to the dependencies in the ``.rockspec`` file: + + .. code-block:: lua + + dependencies = { + ... + 'cartridge-metrics-role', + ... + } + +#. Next, install the missing dependencies: + + .. code-block:: shell + + tt rocks make + # OR # + cartridge build + +After the ``cartridge-metrics-role`` installation, enable this package in the list of roles in ``cartridge.cfg``: + +.. code-block:: lua + + local ok, err = cartridge.cfg({ + roles = { + ... + 'cartridge.roles.metrics', + ... + }, + }) + +Then, configure the ``metrics`` module in either of two ways: + +* add the ``metrics`` configuration section to your cluster configuration; +* specify the configuration in the separate ``metrics.yml`` file. + +In the configuration, specify the response format and the addresses at which the commands are available: + +.. 
code-block:: yaml + + metrics: + export: + - path: '/path_for_json_metrics' + format: 'json' + - path: '/path_for_prometheus_metrics' + format: 'prometheus' + - path: '/health' + format: 'health' + +Learn more about `Cartridge configuration `__. + +.. note:: + + Instead of configuring the cluster configuration, you can also use the + `set_export `__ + command. + +Now the commands' data is accessible at the following addresses: + +.. code-block:: shell + + http://url:port/path_for_json_metrics + http://url:port/path_for_prometheus_metrics + http://url:port/health + +where ``url:port`` -- the address and Cartridge HTTP port of a specific instance of the application. + +You can visualize the data in +`Grafana dashboard `__. + +After the role has been initialized, the default metrics are enabled +and the global ``alias`` label is set. + +.. note:: + + Since **0.6.1**, the ``alias`` label value is set by the ``alias`` or ``instance_name`` + instance :ref:`configuration option `. + +You can use the functionality of any metrics package. +To do this, get the package as a Cartridge service and call it with the ``require()`` like a regular package: + +.. code-block:: lua + + local cartridge = require('cartridge') + local metrics = cartridge.service_get('metrics') + +.. _getting_started_cartridge-if_we_use_old_version: + +Additional steps for older versions of the metrics module +--------------------------------------------------------- + +Since version **0.6.0**, the ``metrics`` role is permanent and enabled on instances by default. +If you use an old version of ``metrics``, you need to enable the role in the interface first: + +.. image:: images/role-enable.png + :align: center + +.. _getting_started_cartridge-add_metrics_to_http_api_command: + +Adding metrics to HTTP API commands of the application +------------------------------------------------------ + +You can connect the standard ``http_server_request_latency`` metric to your application's HTTP API +commands. 
This metric records the number of invocations and the total execution time (latency) of +each individual command. To connect it, wrap each API handler with +the ``metrics.http_middleware.v1(...)`` function. + +Example: + +.. code-block:: lua + + local cartridge = require('cartridge') + local server = cartridge.service_get('httpd') -- get the Cartridge HTTP server + local metrics = cartridge.service_get('metrics') -- get the metrics module + + local function http_app_api_handler(request) -- add test command + return request:render({ text = 'Hello world!!!' }) + end + + -- Register the route on the HTTP server obtained from Cartridge above + server:route({path = '/hello'}, metrics.http_middleware.v1(http_app_api_handler)) + -- No server:start() call here, assuming Cartridge has already started this server + +When you call ``cartridge.service_get('metrics')`` from an application role (usually a router), +add ``cartridge.roles.metrics`` to the dependencies of that role: + +.. code-block:: lua + + return { + ... + dependencies = { + ... + 'cartridge.roles.metrics', + } + } + +Now, after the ``hello`` HTTP API command is called, new data on these calls is available +at ``http://url:port/path_for_json_metrics``: + +.. code-block:: json + + { + "label_pairs": { + "path": "/hello", + "method": "ANY", + "status": 200, + "alias": "my-tnt-app" + }, + "timestamp": 1679668258972227, + "metric_name": "http_server_request_latency_count", + "value": 9 + }, + { + "label_pairs": { + "path": "/hello", + "method": "ANY", + "status": 200, + "alias": "my-tnt-app" + }, + "timestamp": 1679668258972227, + "metric_name": "http_server_request_latency_sum", + "value": 0.00008015199273359 + }, + +The default type for this metric is ``histogram``. However, +it's :ref:`recommended ` to use the ``summary`` type instead. + +.. _getting_started_cartridge-advanced_settings: + +Additional settings +------------------- + +* Since Tarantool Cartridge ``2.4.0``, you can set a zone for each + instance in the cluster. When a zone is set, all the metrics on the instance + receive the ``zone`` label. 
+ +* You can set custom global labels with the following configuration: + + .. code-block:: yaml + + metrics: + export: + - path: '/metrics' + format: 'json' + global-labels: + my-custom-label: label-value + + Another option is to invoke the ``set_default_labels`` function in ``init.lua``: + + .. code-block:: lua + + local metrics = require('cartridge.roles.metrics') + metrics.set_default_labels({ ['my-custom-label'] = 'label-value' }) + +* You can use the configuration below to choose the default metrics to be exported. + If you add the ``include`` section, only the metrics from this section will be exported: + + .. code-block:: yaml + + metrics: + export: + - path: '/metrics' + format: 'json' + # export only vinyl, luajit and memory metrics: + include: + - vinyl + - luajit + - memory + + If you add the ``exclude`` section, + the metrics from this section will be removed from the default metrics list: + + .. code-block:: yaml + + metrics: + export: + - path: '/metrics' + format: 'json' + # export all metrics except vinyl, luajit and memory: + exclude: + - vinyl + - luajit + - memory + + For the full list of default metrics, check the + :ref:`API reference `. + +.. _getting_started_cartridge-custom_health_handle: + +Creating a custom health check format +------------------------------------- + +By default, the response of the health command contains a status code of + +* ``200`` -- if everything is okay, +* ``500`` -- if the instance is unhealthy. + +You can set your own response format in the following way: + +.. 
code-block:: lua + + local health = require('cartridge.health') + local metrics = require('cartridge').service_get('metrics') + + metrics.set_health_handler(function(req) + local resp = req:render{ + json = { + my_healthcheck_format = health.is_healthy() + } + } + resp.status = 200 + return resp + end) diff --git a/doc/monitoring/images/histogram-buckets.png b/doc/monitoring/images/histogram-buckets.png new file mode 100644 index 00000000..04592633 Binary files /dev/null and b/doc/monitoring/images/histogram-buckets.png differ diff --git a/doc/monitoring/images/histogram.png b/doc/monitoring/images/histogram.png new file mode 100644 index 00000000..c26ada12 Binary files /dev/null and b/doc/monitoring/images/histogram.png differ diff --git a/doc/monitoring/images/summary-buckets.png b/doc/monitoring/images/summary-buckets.png new file mode 100644 index 00000000..8918c73f Binary files /dev/null and b/doc/monitoring/images/summary-buckets.png differ diff --git a/doc/monitoring/index.rst b/doc/monitoring/index.rst index df876441..2faea34e 100644 --- a/doc/monitoring/index.rst +++ b/doc/monitoring/index.rst @@ -14,6 +14,8 @@ This chapter includes the following sections: :numbered: 0 getting_started + install + getting_started_cartridge metrics_reference api_reference plugins diff --git a/doc/monitoring/install.rst b/doc/monitoring/install.rst new file mode 100644 index 00000000..e3fe0880 --- /dev/null +++ b/doc/monitoring/install.rst @@ -0,0 +1,57 @@ +.. _install: + +Installing the metrics module +============================= + +.. note:: + + Since Tarantool version `2.11.1 `__, + the installation is not required. + +.. _install-rockspec: + +Installing metrics using the .rockspec file +------------------------------------------- + +Usually, all dependencies are listed in the ``*.rockspec`` file of the application +and installed from this file. To do this: + +#. Add the ``metrics`` module to the dependencies in the ``.rockspec`` file: + + .. 
code-block:: lua + + dependencies = { + ... + 'metrics == 1.0.0', + ... + } + +#. Install the missing dependencies: + + .. code-block:: shell + + tt rocks make + # OR # + cartridge build + +.. _install-metrics_only: + +Installing the metrics module only +---------------------------------- + +To install only the ``metrics`` module, execute the following commands: + +#. Go to the project folder: + + .. code-block:: shell + + $ cd ${PROJECT_ROOT} + +#. Install the module: + + .. code-block:: shell + + $ tt rocks install metrics <version> + + where ``version`` is the required version number. If omitted, the version from the + ``master`` branch is installed.