Correct URL handling for Elasticsearch/OpenSearch #3577

webbnh · 2023-11-17T18:09:01Z

Now that we are deploying on an OpenSearch service which requires a username/password, remove assumptions around the contents of the ES URL configuration in the Server code.

Remove the check for a port from wait_for_uri(); add defaulting for the port number if it is omitted.
Rework _get_es_hosts() to return the whole URL instead of the decomposed dict; move the specification of the timeout value to its caller.
Add a ca_bundle configuration option to the Indexing configuration section and pointed the default value for it to /etc/pki/tls/certs/ca-bundle.crt.
Provide the ca_bundle value to the instantiations of the Elasticsearch client and to the Requests methods which target Elasticsearch/OpenSearch services.
Remove the unused logger parameter from get_es() and _get_es_hosts().
Clear some lint.

PBENCH-1302

dbutenhof

Makes me wonder if we should think about setting up an https ES instance in the CI. For now I'm tempted to just pretend nobody thought of that ... but I do worry that your uri breakdown isn't going to do what we want on the production instance, and right now we've got no way to test it short of deploying it...

lib/pbench/server/indexer.py

lib/pbench/common/__init__.py

lib/pbench/server/indexer.py

lib/pbench/test/unit/common/test_wait_for_uri.py

lib/pbench/server/indexer.py

dbutenhof

I find myself wondering about the implications if we install somewhere without a /etc/pki/tls/certs/ca-bundle.crt; but it seems that the CI ran with straight http Elasticsearch so at least that's covered, and I assume this will also work for the staging server.

I suppose we may find further problems when the Opensearch permissions are fixed (if that's really the problem), but for now this looks OK.

webbnh · 2023-12-04T15:03:18Z

I find myself wondering about the implications if we install somewhere without a /etc/pki/tls/certs/ca-bundle.crt

In that case (or in any other case where the "trusted CAs" are in some other file), the installation will need to override the default ca_bundle value using the pbench-server.cfg file. (I don't think that this is particularly weird -- yeah, it would be nicer if the CA bundle location were standard, but, since it's not, I think this defaulting serves us well without being overly restrictive.)

it seems that the CI ran with straight http Elasticsearch so at least that's covered, and I assume this will also work for the staging server.

I concur.

I suppose we may find further problems when the Opensearch permissions are fixed (if that's really the problem), but for now this looks OK.

I'm hoping that the remaining blockers are all DevOps issues and not code (etc.) issues, but I'm inclined to hold off on merging this until we have confirmation of that. However, I've rebased on the end of main to pick up the logging changes.

dbutenhof · 2023-12-04T15:39:56Z

lib/pbench/server/api/resources/__init__.py

@@ -155,7 +155,7 @@ def __str__(self) -> str:
 class SchemaError(APIAbort):
    """Generic base class for errors in processing a JSON schema."""

-    def __init__(self, http_status: int = HTTPStatus.BAD_REQUEST):
+    def __init__(self, http_status: int = HTTPStatus.BAD_REQUEST, **_kwargs):


What's the point of adding an unused kwargs??

It resolved a lint error, probably for ConversionError.__init__() (I don't remember which function specifically -- there are several with similar issues).

Ugh; this is ugly. OK, so ConversionError is designed with **kwargs to allow callers to pass an http_status through the ConversionError constructor to SchemaError ... but if anyone used other kwargs it wouldn't work, and the explicit declaration here is misleading because SchemaError really doesn't support kwargs. (Pass-through or otherwise.)

I'm curious where you're seeing a lint warning, because I'm not. 😦

FYI, the http_status is only specified through ConversionError in convert_username and test_exceptions.py ... I don't see any lint annotations. I'm curious exactly what you're seeing, and this workaround bugs me because it shouldn't be necessary.

Looking at the main branch, I'm getting Unexpected argument on the super().__init__() invocation in DatasetConversionError.__init__(), but, interestingly, I don't get the complaint for the one in ConversionError.__init__(). 🤷

I presume it's because the linter knows that (in the code in the main branch), Schema.__init__() accepts only one argument which, in the case of ConversionError.__init__(), could be in **kwargs, whereas in the case of DatasetConversionError.__init__() it can't/shouldn't be, since the http_status argument has already been specified explicitly, there.

Sure, I guess, because DatasetConversionError passes http_status explicitly and doesn't actually use kwargs. (In fact, it can't, aside from repeating the http_status, because it's not good for anything else.)

I think your linter is broken. But if you really want to quiet its ranting, a better "ugly workaround" would be to add an explicit http_status: Optional[int] = None to ConversionError, pass it on, and drop the kwargs entirely from both ConversionError and DatasetConversionError (which, so far as I can see, is never used anyway... maybe it was at one time).

But again, I don't think there's anything illegal or even unreasonable here, and I think this is just an argument against your linter. 😆

I can't really argue against my linter -- it is more or less built-in to my IDE; and arguing with a linter is, as we know, largely pointless. 🙃

At this point, I'm not inclined to disturb the SchemaError (and children) code any further. Given that the Production deployment seems happy, shall I go ahead and merge this?

I really dislike adding kwargs here, because it makes no sense and is potentially misleading. But, whatever: it'll never be perfect.

dbutenhof · 2023-12-04T20:09:08Z

lib/pbench/server/api/resources/__init__.py

@@ -155,7 +155,7 @@ def __str__(self) -> str:
 class SchemaError(APIAbort):
    """Generic base class for errors in processing a JSON schema."""

-    def __init__(self, http_status: int = HTTPStatus.BAD_REQUEST):
+    def __init__(self, http_status: int = HTTPStatus.BAD_REQUEST, **_kwargs):


I really dislike adding kwargs here, because it makes no sense and is potentially misleading. But, whatever: it'll never be perfect.

webbnh added bug Server labels Nov 17, 2023

webbnh requested a review from dbutenhof November 17, 2023 18:09

webbnh self-assigned this Nov 17, 2023

dbutenhof requested changes Nov 17, 2023

View reviewed changes

lib/pbench/server/indexer.py Outdated Show resolved Hide resolved

webbnh force-pushed the es_update branch 2 times, most recently from f262167 to 02f085b Compare November 20, 2023 18:51

dbutenhof previously approved these changes Nov 20, 2023

View reviewed changes

lib/pbench/common/__init__.py Outdated Show resolved Hide resolved

lib/pbench/common/__init__.py Show resolved Hide resolved

lib/pbench/server/indexer.py Outdated Show resolved Hide resolved

lib/pbench/test/unit/common/test_wait_for_uri.py Outdated Show resolved Hide resolved

webbnh dismissed dbutenhof’s stale review via 3413691 November 20, 2023 19:31

webbnh force-pushed the es_update branch 2 times, most recently from 4bc373e to c9c9781 Compare November 21, 2023 16:07

dbutenhof previously approved these changes Nov 27, 2023

View reviewed changes

lib/pbench/server/indexer.py Outdated Show resolved Hide resolved

webbnh dismissed dbutenhof’s stale review via 74bbf42 November 29, 2023 22:06

webbnh requested a review from dbutenhof November 29, 2023 22:24

webbnh force-pushed the es_update branch from 74bbf42 to 3fe1c51 Compare November 29, 2023 22:34

dbutenhof approved these changes Nov 30, 2023

View reviewed changes

webbnh added 7 commits December 4, 2023 09:58

Pick lint

fcd727f

Correct URL handling for Elasticsearch/OpenSearch

062d40d

Review feedback

2f3671d

Tweaks

d8de2b4

Address functional test failure

1a23098

Pick lint

47640d9

Specify a CA bundle for Elasticsearch client connections and requests

4c5e38b

webbnh force-pushed the es_update branch from 3fe1c51 to 4c5e38b Compare December 4, 2023 15:02

dbutenhof approved these changes Dec 4, 2023

View reviewed changes

webbnh merged commit 34c80f4 into distributed-system-analysis:main Dec 4, 2023
3 checks passed

webbnh deleted the es_update branch December 4, 2023 20:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Correct URL handling for Elasticsearch/OpenSearch #3577

Correct URL handling for Elasticsearch/OpenSearch #3577

webbnh commented Nov 17, 2023 •

edited

Loading

dbutenhof left a comment

dbutenhof left a comment

webbnh commented Dec 4, 2023

dbutenhof Dec 4, 2023

webbnh Dec 4, 2023

dbutenhof Dec 4, 2023

dbutenhof Dec 4, 2023

webbnh Dec 4, 2023

webbnh Dec 4, 2023 •

edited

Loading

dbutenhof Dec 4, 2023

webbnh Dec 4, 2023

dbutenhof Dec 4, 2023

dbutenhof Dec 4, 2023

Correct URL handling for Elasticsearch/OpenSearch #3577

Correct URL handling for Elasticsearch/OpenSearch #3577

Conversation

webbnh commented Nov 17, 2023 • edited Loading

dbutenhof left a comment

Choose a reason for hiding this comment

dbutenhof left a comment

Choose a reason for hiding this comment

webbnh commented Dec 4, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

webbnh Dec 4, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

webbnh commented Nov 17, 2023 •

edited

Loading

webbnh Dec 4, 2023 •

edited

Loading