[Metricbeat][Aerospike] Add support for basic auth #41233

Open
wants to merge 22 commits into
base: main

herrBez
Contributor

@herrBez herrBez commented Oct 14, 2024

Proposed commit message

[Metricbeat][Aerospike] Add support for Basic Auth and update aerospike-client-go dependency

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

No user impact: the old configuration still works.

Author's Checklist

Context

As of October 14, 2024, the vendor supports only Aerospike DB versions 6.1 and above. More details can be found here: Aerospike Platform Support.

The supported versions of the Aerospike client libraries are listed here: Aerospike Client Library Matrix.

Currently, Beats integrates version 1.27.1 of the aerospike-client-go library, which was released in 2017 and is no longer supported by the vendor.

In this pull request (PR), we upgrade the dependency to version 7 and add support for Basic Authentication for the Enterprise Edition (EE) of Aerospike.

Please note that Aerospike version 7 introduced several changes to the metrics (some metrics that the metricset is using have been renamed, and others removed). Details can be found here: Aerospike 7.0 Metrics Changes. To keep the scope of this PR focused, I have opted to implement this change first and will submit a separate PR to address the metrics changes (I have already implemented the code for the change).

As a final note, the Docker images also distinguish between CE (Community Edition) and EE (Enterprise Edition).

How to test this PR locally

  1. Enable the Aerospike module and set username: admin and password: admin:
# Module: aerospike
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/main/metricbeat-module-aerospike.html

- module: aerospike
  #metricsets:
  #  - namespace
  period: 10s
  hosts: ["localhost:3000"]

  # Username of hosts. Empty by default.
  username: admin

  # Password of hosts. Empty by default.
  password: admin
  1. Create a file named aerospike-basic-auth.conf with the following configuration:
Aerospike 7 config-file
service {
	feature-key-file /etc/aerospike/features.conf
}
logging {
	# Send log messages to stdout
	console {
		context any info
	}
}
network {
	service {
		address any
		port 3000
	}
	heartbeat {
		mode mesh
		address local
		port 3002
		interval 150
		timeout 10
	}
	fabric {
		address local
		port 3001
	}
}

namespace test {
	default-ttl 30d # use 0 to never expire/evict.
	memory-size 1G
	nsup-period 120
	replication-factor 1
	storage-engine device {
		data-in-memory false # if true, in-memory, persisted to the filesystem
		file /opt/aerospike/data/test.dat
		filesize 4G
		read-page-cache true
	}
}

# This activates security
# Default credentials: `admin:admin`
security {
}
Aerospike 6 config-file
service {
	cluster-name aerospike
	feature-key-file /etc/aerospike/features.conf
}
logging {
	# Send log messages to stdout
	console {
		context any info
	}
}
network {
	service {
		address any
		port 3000
	}
	heartbeat {
		mode mesh
		address local
		port 3002
		interval 150
		timeout 10
	}
	fabric {
		address local
		port 3001
	}
}

namespace test {
	default-ttl 30d # use 0 to never expire/evict.
	memory-size 1G
	nsup-period 120
	replication-factor 1
	storage-engine device {
		data-in-memory false # if true, in-memory, persisted to the filesystem
		file /opt/aerospike/data/test.dat
		filesize 4G
		read-page-cache true
	}
}

security {
}
  1. To test with version 6 of Aerospike, export the following variable: export AEROSPIKE_VERSION=ee-6.4.0.7_2

  2. Use the following docker-compose.yaml

services:
  aerospike:
    # image: docker.elastic.co/integrations-ci/beats-aerospike:${AEROSPIKE_VERSION:-ee-7.2.0.1_1}
    image: aerospike:${AEROSPIKE_VERSION:-ee-7.2.0.1_1}
    build:
      args:
        AEROSPIKE_VERSION: ${AEROSPIKE_VERSION:-ee-7.2.0.1_1}
    volumes:
      - ./aerospike-basic-auth.conf:/opt/aerospike/etc/aerospike.conf:ro
    command:
    - "--config-file"
    - "/opt/aerospike/etc/aerospike.conf"
    ports:
      - 3000:3000
  1. Run ./metricbeat test modules aerospike

Please note that because some metrics have been renamed in Aerospike 7, the output for this version will contain some empty metrics.

Related issues

Use cases

Monitor an Aerospike Cluster protected by Basic Auth

Screenshots

N/A

Logs

Not relevant

herrBez added the enhancement and backport-8.x labels Oct 14, 2024
herrBez requested review from a team as code owners October 14, 2024 19:58
botelastic bot added the needs_team label Oct 14, 2024
pierrehilbert added the Team:Elastic-Agent-Data-Plane label Oct 15, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

botelastic bot removed the needs_team label Oct 15, 2024
pierrehilbert added the needs_team and Team:Obs-InfraObs labels Oct 15, 2024
botelastic bot removed the needs_team label Oct 15, 2024
Contributor

mergify bot commented Oct 18, 2024

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b aerospike-basic-auth upstream/aerospike-basic-auth
git merge upstream/main
git push upstream aerospike-basic-auth

herrBez added the backport-8.16 label Oct 22, 2024
@herrBez
Contributor Author

herrBez commented Oct 22, 2024

BTW may I ask what's exactly failing in the check? I don't have access to buildkite output.

@ishleenk17
Contributor

ishleenk17 commented Oct 23, 2024

> BTW may I ask what's exactly failing in the check? I don't have access to buildkite output.

The integration tests for the namespace metricset are failing. Can you confirm that the new API method used is correct? It looks like either there is a connection issue with Aerospike or it is not able to fetch the namespace correctly for this metricset.

Screenshot 2024-10-23 at 12 47 33 PM

@herrBez
Contributor Author

herrBez commented Oct 23, 2024

Hi thanks for sending the screenshot. The method is correct.

When I test it locally, I receive this error:

Failed to connect to hosts: [192.168.0.2:3000]\n ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: : Node BB90200A8C04202 (192.168.0.2:3000) is not yet fully initialized'

This means that Aerospike is not yet initialized when the test runs.

When I run the test multiple times, it sometimes passes and sometimes fails without a clear reason. The Dockerfile simply checks whether the port is open, but I am afraid that is not sufficient. We probably need to improve the HEALTHCHECK to make sure that Aerospike is really up and running, or roll back to the good old version 3.9.0, which worked before without particular issues.

(As a test, I rolled back to version 3.9.0 in the commit below.)

herrBez and others added 5 commits October 23, 2024 19:24
@herrBez
Contributor Author

herrBez commented Oct 30, 2024

Hi there,
I changed the healthcheck of the container to actually verify that Aerospike is up and running. Apparently, Aerospike opens the port before it is fully initialized.

Now the command MODULE=aerospike mage -v pythonIntegTest succeeds consistently (at least in my environment). Can you help me understand whether this check or another one is still failing? (Sadly, I don't have access to buildkite.)

@leehinman
Contributor

@herrBez looks like the Aerospike python integration test is failing

module/aerospike/test_aerospike.py F                                                                                                                     [ 11%]
tests/system/test_base.py s.......                                                                                                                       [100%]
=========================================================================== FAILURES ===========================================================================
_____________________________________________________________________ Test.test_namespace ______________________________________________________________________
self = <test_aerospike.Test testMethod=test_namespace>
    @unittest.skipUnless(metricbeat.INTEGRATION_TESTS, "integration test")
    def test_namespace(self):
        """
        aerospike namespace metricset test
        """
>       self.check_metricset("aerospike", "namespace", self.get_hosts(), self.FIELDS)
module/aerospike/test_aerospike.py:17:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/system/metricbeat.py:135: in check_metricset
    self.assertCountEqual(self.de_dot(fields), evt.keys())
E   AssertionError: Element counts were not equal:
E   First has 1, Second has 0:  'aerospike'
E   First has 0, Second has 1:  'error'
-------------------------------------------------------------------- Captured stdout setup ---------------------------------------------------------------------
Step 1/3 : ARG AEROSPIKE_VERSION
Step 2/3 : FROM aerospike/aerospike-server-enterprise:${AEROSPIKE_VERSION}
7.2.0.1: Pulling from aerospike/aerospike-server-enterprise
Digest: sha256:d228a501081281c1f716f2514343859e7e55e083fa160ecbfcf9acaee485d02e
Status: Downloaded newer image for aerospike/aerospike-server-enterprise:7.2.0.1
 ---> 5e3c0543a90c
Step 3/3 : HEALTHCHECK --interval=1s --retries=90 CMD asinfo -v namespaces | grep -i -q "test" || exit 1
 ---> Running in c2d71be1f750
Removing intermediate container c2d71be1f750
 ---> 1e9e239c77e0
Successfully built 1e9e239c77e0
Successfully tagged docker.elastic.co/integrations-ci/beats-aerospike:7.2.0.1-1
-------------------------------------------------------------------- Captured stderr setup ---------------------------------------------------------------------
Pulling aerospike ...
Pulling aerospike ... done
Creating aerospike_7918a246f6f8_aerospike_1 ...
Creating aerospike_7918a246f6f8_aerospike_1 ... done
---------------------------------------------------------------------- Captured log setup ----------------------------------------------------------------------
WARNING  compose.project:project.py:811 Some service image(s) must be built from source by running:
    docker-compose build aerospike
WARNING  compose.service:service.py:365 Image for service aerospike was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
--------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------
[{'@timestamp': '2024-10-29T21:06:41.711Z', 'ecs': {'version': '8.0.0'}, 'host': {'name': 'bk-agent-prod-gcp-1730235630334228795'}, 'agent': {'name': 'bk-agent-prod-gcp-1730235630334228795', 'type': 'metricbeat', 'version': '9.0.0', 'ephemeral_id': '75cf287d-4f9e-4184-845e-60e7ebe41093', 'id': 'd0295744-6d8a-4ef1-9f64-9c9169d86416'}, 'metricset': {'name': 'namespace', 'period': 1000}, 'event': {'module': 'aerospike', 'duration': 2077130, 'dataset': 'aerospike.namespace'}, 'service': {'type': 'aerospike'}, 'error': {'message': 'error connecting to Aerospike: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [172.19.0.2:3000]\n  ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Node BB9020013AC4202 (172.19.0.2:3000) is not yet fully initialized'}}]
{'@timestamp': '2024-10-29T21:06:41.711Z', 'ecs': {'version': '8.0.0'}, 'host': {'name': 'bk-agent-prod-gcp-1730235630334228795'}, 'agent': {'name': 'bk-agent-prod-gcp-1730235630334228795', 'type': 'metricbeat', 'version': '9.0.0', 'ephemeral_id': '75cf287d-4f9e-4184-845e-60e7ebe41093', 'id': 'd0295744-6d8a-4ef1-9f64-9c9169d86416'}, 'metricset': {'name': 'namespace', 'period': 1000}, 'event': {'module': 'aerospike', 'duration': 2077130, 'dataset': 'aerospike.namespace'}, 'service': {'type': 'aerospike'}, 'error': {'message': 'error connecting to Aerospike: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [172.19.0.2:3000]\n  ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Node BB9020013AC4202 (172.19.0.2:3000) is not yet fully initialized'}}
['@timestamp', 'agent', 'metricset.name', 'metricset.host', 'metricset.module', 'metricset.rtt', 'host.name', 'service.name', 'event', 'ecs', 'aerospike']
------------------------------------------------------------------- Captured stderr teardown -------------------------------------------------------------------
Stopping aerospike_7918a246f6f8_aerospike_1 ...
Stopping aerospike_7918a246f6f8_aerospike_1 ... done
Removing aerospike_7918a246f6f8_aerospike_1 ...
Removing aerospike_7918a246f6f8_aerospike_1 ... done
======================================================================= warnings summary =======================================================================
../../../../../../opt/venv/lib/python3.11/site-packages/paramiko/pkey.py:100
  /opt/venv/lib/python3.11/site-packages/paramiko/pkey.py:100: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
    "cipher": algorithms.TripleDES,
../../../../../../opt/venv/lib/python3.11/site-packages/paramiko/transport.py:259
  /opt/venv/lib/python3.11/site-packages/paramiko/transport.py:259: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
    "class": algorithms.TripleDES,
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
------------------------------ generated xml file: /go/src/github.com/elastic/beats/metricbeat/build/TEST-python-integration.xml -------------------------------
===================================================================== slowest 20 durations =====================================================================
29.86s setup    metricbeat/tests/system/test_base.py::Test::test_dashboards
7.82s setup    metricbeat/module/aerospike/test_aerospike.py::Test::test_namespace
1.53s call     metricbeat/module/aerospike/test_aerospike.py::Test::test_namespace
0.98s call     metricbeat/tests/system/test_base.py::Test::test_index_management
0.70s call     metricbeat/tests/system/test_base.py::Test::test_export_index_pattern
0.69s call     metricbeat/tests/system/test_base.py::Test::test_export_index_pattern_migration
0.66s call     metricbeat/tests/system/test_base.py::Test::test_export_template
0.41s teardown metricbeat/module/aerospike/test_aerospike.py::Test::test_namespace
0.41s call     metricbeat/tests/system/test_base.py::Test::test_start_stop
0.35s call     metricbeat/tests/system/test_base.py::Test::test_export_ilm_policy
0.34s call     metricbeat/tests/system/test_base.py::Test::test_export_config
0.29s teardown metricbeat/tests/system/test_base.py::Test::test_start_stop
0.03s call     metricbeat/tests/system/test_base.py::Test::test_dashboards
(7 durations < 0.005s hidden.  Use -vv to show these durations.)
=================================================================== short test summary info ====================================================================
FAILED module/aerospike/test_aerospike.py::Test::test_namespace - AssertionError: Element counts were not equal:
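
To make the assertion failure above easier to read, here is a small Go sketch that compares the expected top-level keys with the keys of the captured event. It assumes that the test helper reduces dotted field names to their first segment before comparing, and it uses a set comparison as a simplification of the count-based comparison in the Python test; topLevel and diffKeys are hypothetical helpers, not part of Beats.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// topLevel reduces dotted field names ("metricset.name") to their first
// segment ("metricset") and de-duplicates them.
func topLevel(fields []string) map[string]bool {
	out := make(map[string]bool)
	for _, f := range fields {
		name, _, _ := strings.Cut(f, ".")
		out[name] = true
	}
	return out
}

// diffKeys returns the keys that were expected but are absent, and the keys
// that are present but were not expected. Both slices are sorted.
func diffKeys(expected, actual map[string]bool) (missing, unexpected []string) {
	for k := range expected {
		if !actual[k] {
			missing = append(missing, k)
		}
	}
	for k := range actual {
		if !expected[k] {
			unexpected = append(unexpected, k)
		}
	}
	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}

func main() {
	// Expected fields, copied from the list printed by the failing test.
	expected := topLevel([]string{
		"@timestamp", "agent", "metricset.name", "metricset.host",
		"metricset.module", "metricset.rtt", "host.name", "service.name",
		"event", "ecs", "aerospike",
	})
	// Top-level keys of the event captured in the log.
	actual := topLevel([]string{
		"@timestamp", "ecs", "host", "agent", "metricset", "event", "service", "error",
	})

	missing, unexpected := diffKeys(expected, actual)
	fmt.Println("missing:", missing)       // prints: missing: [aerospike]
	fmt.Println("unexpected:", unexpected) // prints: unexpected: [error]
}
```

Read this way, the event carries an error object (the connection failure) where the aerospike metrics were expected, which matches the INVALID_NODE_ERROR message in the captured output.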

@@ -47,6 +57,24 @@ func ParseClientPolicy(config Config) (*as.ClientPolicy, error) {
clientPolicy.TlsConfig = tlsconfig.ToConfig()
}

if len(config.User) > 0 && len(config.Password) > 0 {
Member

[NIT]

As you're checking for an empty string, it's more readable to do it explicitly:

Suggested change
- if len(config.User) > 0 && len(config.Password) > 0 {
+ if config.User != "" && config.Password != "" {
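
As a rough, self-contained illustration of the suggested form, the sketch below applies credentials to a policy only when both values are non-empty. Config and clientPolicy are simplified, hypothetical stand-ins defined locally so the example compiles on its own; they are not the types used in this PR, and applyCredentials is not a function from the module.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for the module configuration.
type Config struct {
	User     string
	Password string
}

// clientPolicy is a hypothetical stand-in for the client library's policy
// type, reduced to the two fields relevant here.
type clientPolicy struct {
	User     string
	Password string
}

// applyCredentials copies the credentials into the policy only when both the
// user and the password are non-empty. It reports whether they were applied.
func applyCredentials(cfg Config, policy *clientPolicy) bool {
	if cfg.User != "" && cfg.Password != "" {
		policy.User = cfg.User
		policy.Password = cfg.Password
		return true
	}
	return false
}

func main() {
	policy := &clientPolicy{}
	applied := applyCredentials(Config{User: "admin", Password: "admin"}, policy)
	fmt.Println(applied, policy.User) // prints: true admin

	untouched := &clientPolicy{}
	applied = applyCredentials(Config{Password: "admin"}, untouched)
	fmt.Println(applied, untouched.Password == "") // prints: false true
}
```

For strings, s != "" and len(s) > 0 are equivalent in Go, so the suggestion changes readability only, not behavior.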

Comment on lines +185 to +190
Name: "Password is set and user is not set",
Config: Config{
Password: samplePassword,
},
expectedClientPolicy: as.NewClientPolicy(),
expectedErr: nil,
Member

I don't know much about Aerospike, but shouldn't it be an error case? Just setting the password?

Labels
backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team
6 participants