Add internal traffic policy 'Cluster' role service and discovery CM #688

Draft · wants to merge 8 commits into main
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file.
- Run a `containerdebug` process in the background of each OPA container to collect debugging information ([#666]).
- Added support for OPA `1.0.x` ([#677]) and ([#687]).
- Aggregate emitted Kubernetes events on the CustomResources ([#675]).
- Added role-level Services and discovery ConfigMaps called `<cluster-name>-local` with `internalTrafficPolicy` set to `Local`
  and `<cluster-name>-cluster` with `internalTrafficPolicy` set to `Cluster` ([#688]).

### Removed

@@ -23,6 +25,7 @@ All notable changes to this project will be documented in this file.
[#675]: https://github.com/stackabletech/opa-operator/pull/675
[#677]: https://github.com/stackabletech/opa-operator/pull/677
[#687]: https://github.com/stackabletech/opa-operator/pull/687
[#688]: https://github.com/stackabletech/opa-operator/pull/688

## [24.11.1] - 2025-01-10

17 changes: 17 additions & 0 deletions docs/modules/opa/pages/implementation-notes.adoc
@@ -5,11 +5,28 @@ but should not be required reading for regular use.

== OPA replica per node

OPA is deployed as a DaemonSet and runs on each Node. The following entrypoint Services are defined:

=== Local Traffic Policy

OPA runs on each Node to avoid network round trips for services making policy queries (which are often chained serially and block other tasks in the products).

Local access is ensured via the https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/[`internalTrafficPolicy`] setting.
This means that https://kubernetes.io/docs/concepts/workloads/pods/[Pods] accessing OPA via the service discovery are routed to the OPA Pod on the same https://kubernetes.io/docs/concepts/architecture/nodes/[Node] to reduce request latency and network traffic.

This should be the default entrypoint. Its Service is named after the OPA cluster, suffixed with `-local`.

If the `metadata.name` is `opa`, this service is called `opa-local`.
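
For illustration, the rendered Service looks roughly like the following sketch (assuming an OPA cluster named `opa` and the HTTP port `8081`; labels and port names are abbreviated and may differ from what the operator actually renders):

[source,yaml]
----
apiVersion: v1
kind: Service
metadata:
  name: opa-local # <cluster-name>-local
spec:
  type: ClusterIP
  internalTrafficPolicy: Local # requests are answered by the OPA Pod on the caller's Node
  selector:
    app.kubernetes.io/name: opa
    app.kubernetes.io/instance: opa
  ports:
    - name: http
      port: 8081
      protocol: TCP
----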

=== Cluster Traffic Policy (round-robin)

This Service is named after the OPA cluster, suffixed with `-cluster`. This entrypoint can be used if request latency (i.e. avoiding network round trips) is less important.
Evaluating complicated Rego rules may take some time depending on the provided resources, and can be the limiting factor in e.g. bulk requests.
Using this Service, every OPA Pod in the cluster is utilized to evaluate policies (via round-robin). This allows for better parallelism when
evaluating policies, but results in network round trips.

If the `metadata.name` is `opa`, this service is called `opa-cluster`.
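
The corresponding round-robin Service differs only in its name and traffic policy (again a sketch under the same assumptions as above):

[source,yaml]
----
apiVersion: v1
kind: Service
metadata:
  name: opa-cluster # <cluster-name>-cluster
spec:
  type: ClusterIP
  internalTrafficPolicy: Cluster # requests are distributed over all OPA Pods
  # selector and ports are identical to the `-local` Service above
----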

== OPA Bundle Builder

Users can manage policy rules by creating, updating and deleting ConfigMap resources.
Expand Down
52 changes: 47 additions & 5 deletions docs/modules/opa/pages/reference/discovery.adoc
@@ -26,14 +26,15 @@ metadata:
spec:
[...]
----
<1> The name of the OPA cluster, which is used in the created discovery ConfigMaps.
<2> The namespace of the discovery ConfigMaps.

Currently, three discovery ConfigMaps are provided.

== Contents

=== (DEPRECATED) Internal Traffic Policy `Local`

The discovery ConfigMap `{namespace}/{clusterName}` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
This ConfigMap is deprecated and only kept for backwards compatibility. Users are advised to switch to `{namespace}/{clusterName}-local`, which is an identical replacement.

`OPA`::
====
@@ -49,3 +50,44 @@ In order to query policies you have to configure your product and its OPA URL as follows, given the bundle package name `{packageName}` and the policy name `{policyName}`:
[subs="attributes"]
http://{clusterName}.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
====

=== Internal Traffic Policy `Local`

The discovery ConfigMap `{namespace}/{clusterName}-local` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
Using this discovery Service, requests from one Node always reach the OPA Pod on the same Node. This allows for low-latency authorization queries.

`OPA`::
====
A connection string for cluster-internal OPA requests.
Given the cluster example above, the connection string is created as follows:

[subs="attributes"]
http://{clusterName}-local.{namespace}.svc.cluster.local:8081/

This connection string points to the base URL (and web UI) of the OPA cluster.
In order to query policies you have to configure your product and its OPA URL as follows, given the bundle package name `{packageName}` and the policy name `{policyName}`:

[subs="attributes"]
http://{clusterName}-local.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
====
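
For illustration, the rendered discovery ConfigMap looks roughly like the following sketch (assuming an OPA cluster named `opa` in the namespace `default`; labels and additional metadata are omitted):

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: opa-local
  namespace: default
data:
  OPA: http://opa-local.default.svc.cluster.local:8081/
----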

=== Internal Traffic Policy `Cluster`

The discovery ConfigMap `{namespace}/{clusterName}-cluster` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
Using this discovery Service, requests to OPA are distributed over all available OPA Pods, improving parallelism when evaluating policies but slightly increasing the latency of each single query
due to additional network requests.

`OPA`::
====
A connection string for cluster-internal OPA requests.
Given the cluster example above, the connection string is created as follows:

[subs="attributes"]
http://{clusterName}-cluster.{namespace}.svc.cluster.local:8081/

This connection string points to the base URL (and web UI) of the OPA cluster.
In order to query policies you have to configure your product and its OPA URL as follows, given the bundle package name `{packageName}` and the policy name `{policyName}`:

[subs="attributes"]
http://{clusterName}-cluster.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
====
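
As a usage sketch, a product cluster that supports OPA authorization references one of these discovery ConfigMaps by name; the exact field layout depends on the product operator, so the `TrinoCluster` fields below are illustrative assumptions rather than a verified example:

[source,yaml]
----
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: simple-trino
spec:
  clusterConfig:
    authorization:
      opa:
        configMapName: opa-cluster # use the round-robin entrypoint
        package: trino
  # [...] remaining cluster configuration omitted
----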
202 changes: 61 additions & 141 deletions rust/operator-binary/src/controller.rs
@@ -38,7 +38,7 @@ use stackable_operator::{
apps::v1::{DaemonSet, DaemonSetSpec},
core::v1::{
ConfigMap, EmptyDirVolumeSource, EnvVar, HTTPGetAction, Probe, SecretVolumeSource,
Service, ServiceAccount, ServicePort, ServiceSpec,
ServiceAccount,
},
},
apimachinery::pkg::{apis::meta::v1::LabelSelector, util::intstr::IntOrString},
@@ -49,7 +49,7 @@ use stackable_operator::{
runtime::{controller::Action, reflector::ObjectRef},
Resource as KubeResource, ResourceExt,
},
kvp::{Label, LabelError, Labels, ObjectLabels},
kvp::{LabelError, Labels, ObjectLabels},
logging::controller::ReconcilerError,
memory::{BinaryMultiple, MemoryQuantity},
product_config_utils::{transform_all_roles_to_config, validate_all_roles_and_groups_config},
@@ -74,11 +74,12 @@ use stackable_operator::{
use strum::{EnumDiscriminants, IntoStaticStr};

use crate::{
discovery::{self, build_discovery_configmaps},
discovery::{self, build_discovery_configmap},
operations::graceful_shutdown::add_graceful_shutdown_config,
product_logging::{
extend_role_group_config_map, resolve_vector_aggregator_address, BundleBuilderLogLevel,
},
service::{build_discoverable_services, build_rolegroup_service, ServiceConfig},
};

pub const OPA_CONTROLLER_NAME: &str = "opacluster";
@@ -157,9 +158,6 @@ pub enum Error {
source: error_boundary::InvalidObject,
},

#[snafu(display("object does not define meta name"))]
NoName,

#[snafu(display("internal operator failure"))]
InternalOperatorFailure {
source: stackable_opa_operator::crd::Error,
@@ -318,6 +316,9 @@ pub enum Error {
AddVolumeMount {
source: builder::pod::container::Error,
},

#[snafu(display("failed to build required services"))]
BuildRequiredServices { source: crate::service::Error },
}
type Result<T, E = Error> = std::result::Result<T, E>;

@@ -447,12 +448,58 @@ pub async fn reconcile_opa(
.await
.context(ResolveVectorAggregatorAddressSnafu)?;

let server_role_service = build_server_role_service(opa, &resolved_product_image)?;
// required for discovery config map later
let server_role_service = cluster_resources
.add(client, server_role_service)
.await
.context(ApplyRoleServiceSnafu)?;
let required_services = vec![
// The server-role service is the primary endpoint that should be used by clients that
// require local access - deprecated, kept for backwards compatibility
ServiceConfig {
name: opa
.server_role_service_name_itp_local_deprecated()
.context(RoleServiceNameNotFoundSnafu)?,
internal_traffic_policy: "Local".to_string(),
},
// The server-role service is the primary endpoint that should be used by clients that
// require local access
ServiceConfig {
name: opa
.server_role_service_name_itp_local()
.context(RoleServiceNameNotFoundSnafu)?,
internal_traffic_policy: "Local".to_string(),
},
// The server-role service is the primary endpoint that should be used by clients that
// want requests to be distributed round-robin across all OPA Pods
ServiceConfig {
name: opa
.server_role_service_name_itp_cluster()
.context(RoleServiceNameNotFoundSnafu)?,
internal_traffic_policy: "Cluster".to_string(),
},
];

let services = build_discoverable_services(opa, &resolved_product_image, required_services)
.context(BuildRequiredServicesSnafu)?;

for svc in services {
// required for discovery config map later
let role_service = cluster_resources
.add(client, svc)
.await
.context(ApplyRoleServiceSnafu)?;

let discovery_cm = build_discovery_configmap(
&role_service.name_any(),
opa,
opa,
&resolved_product_image,
&role_service,
&client.kubernetes_cluster_info,
)
.context(BuildDiscoveryConfigSnafu)?;

cluster_resources
.add(client, discovery_cm)
.await
.context(ApplyDiscoveryConfigSnafu)?;
}

let required_labels = cluster_resources
.get_required_labels()
@@ -490,7 +537,8 @@ pub async fn reconcile_opa(
&merged_config,
vector_aggregator_address.as_deref(),
)?;
let rg_service = build_rolegroup_service(opa, &resolved_product_image, &rolegroup)?;
let rg_service = build_rolegroup_service(opa, &resolved_product_image, &rolegroup)
.context(BuildRequiredServicesSnafu)?;
let rg_daemonset = build_server_rolegroup_daemonset(
opa,
&resolved_product_image,
@@ -546,21 +594,6 @@ pub async fn reconcile_opa(
.context(ApplyPatchRoleGroupDaemonSetSnafu { rolegroup })?;
}

for discovery_cm in build_discovery_configmaps(
opa,
opa,
&resolved_product_image,
&server_role_service,
&client.kubernetes_cluster_info,
)
.context(BuildDiscoveryConfigSnafu)?
{
cluster_resources
.add(client, discovery_cm)
.await
.context(ApplyDiscoveryConfigSnafu)?;
}

let cluster_operation_cond_builder =
ClusterOperationsConditionBuilder::new(&opa.spec.cluster_operation);

@@ -581,101 +614,6 @@
Ok(Action::await_change())
}

/// The server-role service is the primary endpoint that should be used by clients that do not perform internal load balancing,
/// including targets outside of the cluster.
pub fn build_server_role_service(
opa: &v1alpha1::OpaCluster,
resolved_product_image: &ResolvedProductImage,
) -> Result<Service> {
let role_name = v1alpha1::OpaRole::Server.to_string();
let role_svc_name = opa
.server_role_service_name()
.context(RoleServiceNameNotFoundSnafu)?;

let metadata = ObjectMetaBuilder::new()
.name_and_namespace(opa)
.name(&role_svc_name)
.ownerreference_from_resource(opa, None, Some(true))
.context(ObjectMissingMetadataForOwnerRefSnafu)?
.with_recommended_labels(build_recommended_labels(
opa,
&resolved_product_image.app_version_label,
&role_name,
"global",
))
.context(ObjectMetaSnafu)?
.build();

let service_selector_labels =
Labels::role_selector(opa, APP_NAME, &role_name).context(BuildLabelSnafu)?;

let service_spec = ServiceSpec {
type_: Some(opa.spec.cluster_config.listener_class.k8s_service_type()),
ports: Some(vec![ServicePort {
name: Some(APP_PORT_NAME.to_string()),
port: APP_PORT.into(),
protocol: Some("TCP".to_string()),
..ServicePort::default()
}]),
selector: Some(service_selector_labels.into()),
internal_traffic_policy: Some("Local".to_string()),
..ServiceSpec::default()
};

Ok(Service {
metadata,
spec: Some(service_spec),
status: None,
})
}

/// The rolegroup [`Service`] is a headless service that allows direct access to the instances of a certain rolegroup
///
/// This is mostly useful for internal communication between peers, or for clients that perform client-side load balancing.
fn build_rolegroup_service(
opa: &v1alpha1::OpaCluster,
resolved_product_image: &ResolvedProductImage,
rolegroup: &RoleGroupRef<v1alpha1::OpaCluster>,
) -> Result<Service> {
let prometheus_label =
Label::try_from(("prometheus.io/scrape", "true")).context(BuildLabelSnafu)?;

let metadata = ObjectMetaBuilder::new()
.name_and_namespace(opa)
.name(rolegroup.object_name())
.ownerreference_from_resource(opa, None, Some(true))
.context(ObjectMissingMetadataForOwnerRefSnafu)?
.with_recommended_labels(build_recommended_labels(
opa,
&resolved_product_image.app_version_label,
&rolegroup.role,
&rolegroup.role_group,
))
.context(ObjectMetaSnafu)?
.with_label(prometheus_label)
.build();

let service_selector_labels =
Labels::role_group_selector(opa, APP_NAME, &rolegroup.role, &rolegroup.role_group)
.context(BuildLabelSnafu)?;

let service_spec = ServiceSpec {
// Internal communication does not need to be exposed
type_: Some("ClusterIP".to_string()),
cluster_ip: Some("None".to_string()),
ports: Some(service_ports()),
selector: Some(service_selector_labels.into()),
publish_not_ready_addresses: Some(true),
..ServiceSpec::default()
};

Ok(Service {
metadata,
spec: Some(service_spec),
status: None,
})
}

/// The rolegroup [`ConfigMap`] configures the rolegroup based on the configuration given by the administrator
fn build_server_rolegroup_config_map(
opa: &v1alpha1::OpaCluster,
@@ -1299,24 +1237,6 @@ fn build_prepare_start_command(
prepare_container_args
}

fn service_ports() -> Vec<ServicePort> {
vec![
ServicePort {
name: Some(APP_PORT_NAME.to_string()),
port: APP_PORT.into(),
protocol: Some("TCP".to_string()),
..ServicePort::default()
},
ServicePort {
name: Some(METRICS_PORT_NAME.to_string()),
port: 9504, // Arbitrary port number, this is never actually used anywhere
protocol: Some("TCP".to_string()),
target_port: Some(IntOrString::String(APP_PORT_NAME.to_string())),
..ServicePort::default()
},
]
}

/// Creates recommended `ObjectLabels` to be used in deployed resources
pub fn build_recommended_labels<'a, T>(
owner: &'a T,