diff --git a/CHANGELOG.md b/CHANGELOG.md
index 72f50b55..460d344e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file.
 - Run a `containerdebug` process in the background of each OPA container to collect debugging information ([#666]).
 - Added support for OPA `1.0.x` ([#677]) and ([#687]).
 - Aggregate emitted Kubernetes events on the CustomResources ([#675]).
+- Added role-level Services and discovery ConfigMaps suffixed with `-local` (`internalTrafficPolicy` set to `Local`)
+  and `-cluster` (`internalTrafficPolicy` set to `Cluster`) ([#688]).
 
 ### Removed
 
@@ -23,6 +25,7 @@ All notable changes to this project will be documented in this file.
 [#675]: https://github.com/stackabletech/opa-operator/pull/675
 [#677]: https://github.com/stackabletech/opa-operator/pull/677
 [#687]: https://github.com/stackabletech/opa-operator/pull/687
+[#688]: https://github.com/stackabletech/opa-operator/pull/688
 
 ## [24.11.1] - 2025-01-10
 
diff --git a/docs/modules/opa/pages/implementation-notes.adoc b/docs/modules/opa/pages/implementation-notes.adoc
index 9401980d..5c41bcc5 100644
--- a/docs/modules/opa/pages/implementation-notes.adoc
+++ b/docs/modules/opa/pages/implementation-notes.adoc
@@ -5,11 +5,28 @@ but should not be required reading for regular use.
 
 == OPA replica per node
 
+OPA is deployed as a DaemonSet and runs on each Node. The following entrypoint Services are defined:
+
+=== Local Traffic Policy
+
 OPA runs on each Node to avoid requiring network round trips for services making policy queries
 (which are often chained in serial, and block other tasks in the products).
 Local access is ensured via an https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/[`InternalTrafficPolicy`].
 This means that https://kubernetes.io/docs/concepts/workloads/pods/[Pods] accessing OPA via the service discovery are routed to the OPA Pod
 on the same https://kubernetes.io/docs/concepts/architecture/nodes/[Node] to reduce request latency and network traffic.
+This should be the default entrypoint. The Service is named after the OPA cluster, suffixed with `-local`.
+
+If the `metadata.name` is `opa`, this Service is called `opa-local`.
+
+=== Cluster Traffic Policy (round-robin)
+
+This Service is named after the OPA cluster, suffixed with `-cluster`. This entrypoint can be used when request latency is less important.
+Evaluating complicated Rego rules may take some time depending on the provided resources, and can be the limiting factor in e.g. bulk requests.
+Therefore, using this Service, every OPA Pod in the cluster is utilized to evaluate policies (via round robin). This allows better parallelism when
+evaluating policies, but results in additional network round trips.
+
+If the `metadata.name` is `opa`, this Service is called `opa-cluster`.
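+
+For illustration, the two Services created for an OPA cluster named `opa` roughly look as follows (a sketch showing only the relevant fields, not the exact operator output):
+
+[source,yaml]
+----
+apiVersion: v1
+kind: Service
+metadata:
+  name: opa-local
+spec:
+  internalTrafficPolicy: Local    # requests stay on the caller's Node
+  ports:
+    - port: 8081
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: opa-cluster
+spec:
+  internalTrafficPolicy: Cluster  # requests are distributed over all OPA Pods
+  ports:
+    - port: 8081
+----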
 
 == OPA Bundle Builder
 
 Users can manage policy rules by creating, updating and deleting ConfigMap resources.
diff --git a/docs/modules/opa/pages/reference/discovery.adoc b/docs/modules/opa/pages/reference/discovery.adoc
index 27342827..b67c8136 100644
--- a/docs/modules/opa/pages/reference/discovery.adoc
+++ b/docs/modules/opa/pages/reference/discovery.adoc
@@ -26,14 +26,15 @@ metadata:
 spec:
   [...]
 ----
-<1> The name of the OPA cluster, which is also the name of the created discovery ConfigMap.
-<2> The namespace of the discovery ConfigMap.
+<1> The name of the OPA cluster, which is used in the created discovery ConfigMaps.
+<2> The namespace of the discovery ConfigMaps.
 
-The resulting discovery ConfigMap is `{namespace}/{clusterName}`.
+Currently, three discovery ConfigMaps are provided.
 
-== Contents
+=== (DEPRECATED) Internal Traffic Policy `Local`
 
-The `{namespace}/{clusterName}` discovery ConfigMap contains the following fields where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster:
+The discovery ConfigMap `{namespace}/{clusterName}` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
+This ConfigMap is deprecated and only kept for backwards compatibility. Users are advised to switch to `{namespace}/{clusterName}-local`, which is the identical replacement.
 
 `OPA`::
 ====
@@ -49,3 +50,44 @@ In order to query policies you have to configure your product and its OPA URL as
 [subs="attributes"]
  http://{clusterName}.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
 ====
+
+=== Internal Traffic Policy `Local`
+
+The discovery ConfigMap `{namespace}/{clusterName}-local` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
+Using this discovery Service, requests from one Node always reach the OPA Pod on the same Node. This allows for low-latency authorization queries.
+
+`OPA`::
+====
+A connection string for cluster-internal OPA requests.
+Provided the cluster example above, the connection string is created as follows:
+
+[subs="attributes"]
+ http://{clusterName}-local.{namespace}.svc.cluster.local:8081/
+
+This connection string points to the base URL (and web UI) of the OPA cluster.
+In order to query policies you have to configure your product and its OPA URL as follows, given the bundle package name `{packageName}` and the policy name `{policyName}`:
+
+[subs="attributes"]
+ http://{clusterName}-local.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
+====
+
+=== Internal Traffic Policy `Cluster`
+
+The discovery ConfigMap `{namespace}/{clusterName}-cluster` contains the following fields, where `{clusterName}` represents the name and `{namespace}` the namespace of the cluster.
+Using this discovery Service, requests to OPA are distributed over all available OPA Pods, improving parallelism when evaluating policies but slightly increasing the latency of each single query
+due to additional network requests.
+
+`OPA`::
+====
+A connection string for cluster-internal OPA requests.
+Provided the cluster example above, the connection string is created as follows:
+
+[subs="attributes"]
+ http://{clusterName}-cluster.{namespace}.svc.cluster.local:8081/
+
+This connection string points to the base URL (and web UI) of the OPA cluster.
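+
+For illustration, the resulting ConfigMap roughly looks as follows (a sketch assuming the cluster name `opa` and the namespace `default`):
+
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: opa-cluster
+  namespace: default
+data:
+  OPA: http://opa-cluster.default.svc.cluster.local:8081/
+----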
+In order to query policies you have to configure your product and its OPA URL as follows, given the bundle package name `{packageName}` and the policy name `{policyName}`:
+
+[subs="attributes"]
+ http://{clusterName}-cluster.{namespace}.svc.cluster.local:8081/v1/data/{packageName}/{policyName}
+====
diff --git a/rust/operator-binary/src/controller.rs b/rust/operator-binary/src/controller.rs
index 1f1d9458..70451cea 100644
--- a/rust/operator-binary/src/controller.rs
+++ b/rust/operator-binary/src/controller.rs
@@ -38,7 +38,7 @@ use stackable_operator::{
         apps::v1::{DaemonSet, DaemonSetSpec},
         core::v1::{
             ConfigMap, EmptyDirVolumeSource, EnvVar, HTTPGetAction, Probe, SecretVolumeSource,
-            Service, ServiceAccount, ServicePort, ServiceSpec,
+            ServiceAccount,
         },
     },
     apimachinery::pkg::{apis::meta::v1::LabelSelector, util::intstr::IntOrString},
@@ -49,7 +49,7 @@ use stackable_operator::{
         runtime::{controller::Action, reflector::ObjectRef},
         Resource as KubeResource, ResourceExt,
     },
-    kvp::{Label, LabelError, Labels, ObjectLabels},
+    kvp::{LabelError, Labels, ObjectLabels},
     logging::controller::ReconcilerError,
     memory::{BinaryMultiple, MemoryQuantity},
     product_config_utils::{transform_all_roles_to_config, validate_all_roles_and_groups_config},
@@ -74,11 +74,12 @@
 use strum::{EnumDiscriminants, IntoStaticStr};
 
 use crate::{
-    discovery::{self, build_discovery_configmaps},
+    discovery::{self, build_discovery_configmap},
     operations::graceful_shutdown::add_graceful_shutdown_config,
     product_logging::{
         extend_role_group_config_map, resolve_vector_aggregator_address, BundleBuilderLogLevel,
     },
+    service::{build_discoverable_services, build_rolegroup_service, ServiceConfig},
 };
 
 pub const OPA_CONTROLLER_NAME: &str = "opacluster";
@@ -157,9 +158,6 @@ pub enum Error {
         source: error_boundary::InvalidObject,
     },
 
-    #[snafu(display("object does not define meta name"))]
-    NoName,
-
     #[snafu(display("internal operator failure"))]
     InternalOperatorFailure {
         source: stackable_opa_operator::crd::Error,
@@ -318,6 +316,9 @@ pub enum Error {
     AddVolumeMount {
         source: builder::pod::container::Error,
     },
+
+    #[snafu(display("failed to build required services"))]
+    BuildRequiredServices { source: crate::service::Error },
 }
 
 type Result<T, E = Error> = std::result::Result<T, E>;
@@ -447,12 +448,58 @@
         .await
         .context(ResolveVectorAggregatorAddressSnafu)?;
 
-    let server_role_service = build_server_role_service(opa, &resolved_product_image)?;
-    // required for discovery config map later
-    let server_role_service = cluster_resources
-        .add(client, server_role_service)
-        .await
-        .context(ApplyRoleServiceSnafu)?;
+    let required_services = vec![
+        // The server-role service is the primary endpoint that should be used by clients that
+        // require local access - deprecated, kept for backwards compatibility
+        ServiceConfig {
+            name: opa
+                .server_role_service_name_itp_local_deprecated()
+                .context(RoleServiceNameNotFoundSnafu)?,
+            internal_traffic_policy: "Local".to_string(),
+        },
+        // The server-role service is the primary endpoint that should be used by clients that
+        // require local access
+        ServiceConfig {
+            name: opa
+                .server_role_service_name_itp_local()
+                .context(RoleServiceNameNotFoundSnafu)?,
+            internal_traffic_policy: "Local".to_string(),
+        },
+        // The server-role service is the primary endpoint that should be used by clients that
+        // perform internal round robin
+        ServiceConfig {
+            name: opa
+                .server_role_service_name_itp_cluster()
+                .context(RoleServiceNameNotFoundSnafu)?,
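+            // `Cluster` distributes requests over the OPA Pods on all Nodes,
+            // rather than pinning them to the Pod on the caller's Node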
internal_traffic_policy: "Cluster".to_string(), + }, + ]; + + let services = build_discoverable_services(opa, &resolved_product_image, required_services) + .context(BuildRequiredServicesSnafu)?; + + for svc in services { + // required for discovery config map later + let role_service = cluster_resources + .add(client, svc) + .await + .context(ApplyRoleServiceSnafu)?; + + let discovery_cm = build_discovery_configmap( + &role_service.name_any(), + opa, + opa, + &resolved_product_image, + &role_service, + &client.kubernetes_cluster_info, + ) + .context(BuildDiscoveryConfigSnafu)?; + + cluster_resources + .add(client, discovery_cm) + .await + .context(ApplyDiscoveryConfigSnafu)?; + } let required_labels = cluster_resources .get_required_labels() @@ -490,7 +537,8 @@ pub async fn reconcile_opa( &merged_config, vector_aggregator_address.as_deref(), )?; - let rg_service = build_rolegroup_service(opa, &resolved_product_image, &rolegroup)?; + let rg_service = build_rolegroup_service(opa, &resolved_product_image, &rolegroup) + .context(BuildRequiredServicesSnafu)?; let rg_daemonset = build_server_rolegroup_daemonset( opa, &resolved_product_image, @@ -546,21 +594,6 @@ pub async fn reconcile_opa( .context(ApplyPatchRoleGroupDaemonSetSnafu { rolegroup })?; } - for discovery_cm in build_discovery_configmaps( - opa, - opa, - &resolved_product_image, - &server_role_service, - &client.kubernetes_cluster_info, - ) - .context(BuildDiscoveryConfigSnafu)? - { - cluster_resources - .add(client, discovery_cm) - .await - .context(ApplyDiscoveryConfigSnafu)?; - } - let cluster_operation_cond_builder = ClusterOperationsConditionBuilder::new(&opa.spec.cluster_operation); @@ -581,101 +614,6 @@ pub async fn reconcile_opa( Ok(Action::await_change()) } -/// The server-role service is the primary endpoint that should be used by clients that do not perform internal load balancing, -/// including targets outside of the cluster. -pub fn build_server_role_service( - opa: &v1alpha1::OpaCluster, - resolved_product_image: &ResolvedProductImage, -) -> Result { - let role_name = v1alpha1::OpaRole::Server.to_string(); - let role_svc_name = opa - .server_role_service_name() - .context(RoleServiceNameNotFoundSnafu)?; - - let metadata = ObjectMetaBuilder::new() - .name_and_namespace(opa) - .name(&role_svc_name) - .ownerreference_from_resource(opa, None, Some(true)) - .context(ObjectMissingMetadataForOwnerRefSnafu)? - .with_recommended_labels(build_recommended_labels( - opa, - &resolved_product_image.app_version_label, - &role_name, - "global", - )) - .context(ObjectMetaSnafu)? - .build(); - - let service_selector_labels = - Labels::role_selector(opa, APP_NAME, &role_name).context(BuildLabelSnafu)?; - - let service_spec = ServiceSpec { - type_: Some(opa.spec.cluster_config.listener_class.k8s_service_type()), - ports: Some(vec![ServicePort { - name: Some(APP_PORT_NAME.to_string()), - port: APP_PORT.into(), - protocol: Some("TCP".to_string()), - ..ServicePort::default() - }]), - selector: Some(service_selector_labels.into()), - internal_traffic_policy: Some("Local".to_string()), - ..ServiceSpec::default() - }; - - Ok(Service { - metadata, - spec: Some(service_spec), - status: None, - }) -} - -/// The rolegroup [`Service`] is a headless service that allows direct access to the instances of a certain rolegroup -/// -/// This is mostly useful for internal communication between peers, or for clients that perform client-side load balancing. 
-fn build_rolegroup_service(
-    opa: &v1alpha1::OpaCluster,
-    resolved_product_image: &ResolvedProductImage,
-    rolegroup: &RoleGroupRef<v1alpha1::OpaCluster>,
-) -> Result<Service> {
-    let prometheus_label =
-        Label::try_from(("prometheus.io/scrape", "true")).context(BuildLabelSnafu)?;
-
-    let metadata = ObjectMetaBuilder::new()
-        .name_and_namespace(opa)
-        .name(rolegroup.object_name())
-        .ownerreference_from_resource(opa, None, Some(true))
-        .context(ObjectMissingMetadataForOwnerRefSnafu)?
-        .with_recommended_labels(build_recommended_labels(
-            opa,
-            &resolved_product_image.app_version_label,
-            &rolegroup.role,
-            &rolegroup.role_group,
-        ))
-        .context(ObjectMetaSnafu)?
-        .with_label(prometheus_label)
-        .build();
-
-    let service_selector_labels =
-        Labels::role_group_selector(opa, APP_NAME, &rolegroup.role, &rolegroup.role_group)
-            .context(BuildLabelSnafu)?;
-
-    let service_spec = ServiceSpec {
-        // Internal communication does not need to be exposed
-        type_: Some("ClusterIP".to_string()),
-        cluster_ip: Some("None".to_string()),
-        ports: Some(service_ports()),
-        selector: Some(service_selector_labels.into()),
-        publish_not_ready_addresses: Some(true),
-        ..ServiceSpec::default()
-    };
-
-    Ok(Service {
-        metadata,
-        spec: Some(service_spec),
-        status: None,
-    })
-}
-
 /// The rolegroup [`ConfigMap`] configures the rolegroup based on the configuration given by the administrator
 fn build_server_rolegroup_config_map(
     opa: &v1alpha1::OpaCluster,
@@ -1299,24 +1237,6 @@ fn build_prepare_start_command(
     prepare_container_args
 }
 
-fn service_ports() -> Vec<ServicePort> {
-    vec![
-        ServicePort {
-            name: Some(APP_PORT_NAME.to_string()),
-            port: APP_PORT.into(),
-            protocol: Some("TCP".to_string()),
-            ..ServicePort::default()
-        },
-        ServicePort {
-            name: Some(METRICS_PORT_NAME.to_string()),
-            port: 9504, // Arbitrary port number, this is never actually used anywhere
-            protocol: Some("TCP".to_string()),
-            target_port: Some(IntOrString::String(APP_PORT_NAME.to_string())),
-            ..ServicePort::default()
-        },
-    ]
-}
-
 /// Creates recommended `ObjectLabels` to be used in deployed resources
 pub fn build_recommended_labels<'a, T>(
     owner: &'a T,
diff --git a/rust/operator-binary/src/crd/mod.rs b/rust/operator-binary/src/crd/mod.rs
index 0c979979..2f68a3bb 100644
--- a/rust/operator-binary/src/crd/mod.rs
+++ b/rust/operator-binary/src/crd/mod.rs
@@ -319,16 +319,32 @@ impl v1alpha1::OpaCluster {
         })
     }
 
-    /// The name of the role-level load-balanced Kubernetes `Service`
-    pub fn server_role_service_name(&self) -> Option<String> {
+    /// DEPRECATED: The name of the role-level traffic policy `Local` Kubernetes `Service`
+    pub fn server_role_service_name_itp_local_deprecated(&self) -> Option<String> {
         self.metadata.name.clone()
     }
 
-    /// The fully-qualified domain name of the role-level load-balanced Kubernetes `Service`
+    /// The name of the role-level traffic policy `Local` Kubernetes `Service`
+    pub fn server_role_service_name_itp_local(&self) -> Option<String> {
+        self.metadata
+            .name
+            .as_ref()
+            .map(|service_name| format!("{service_name}-local"))
+    }
+
+    /// The name of the role-level traffic policy `Cluster` Kubernetes `Service`
+    pub fn server_role_service_name_itp_cluster(&self) -> Option<String> {
+        self.metadata
+            .name
+            .as_ref()
+            .map(|service_name| format!("{service_name}-cluster"))
+    }
+
+    /// The fully-qualified domain name of the deprecated role-level local Kubernetes `Service`
     pub fn server_role_service_fqdn(&self, cluster_info: &KubernetesClusterInfo) -> Option<String> {
         Some(format!(
             "{role_service_name}.{namespace}.svc.{cluster_domain}",
-            role_service_name = self.server_role_service_name()?,
+            role_service_name = self.server_role_service_name_itp_local_deprecated()?,
             namespace = self.metadata.namespace.as_ref()?,
             cluster_domain = cluster_info.cluster_domain
         ))
diff --git a/rust/operator-binary/src/discovery.rs b/rust/operator-binary/src/discovery.rs
index 1e7f7af4..b5ed4ef4 100644
--- a/rust/operator-binary/src/discovery.rs
+++ b/rust/operator-binary/src/discovery.rs
@@ -4,7 +4,7 @@ use stackable_operator::{
     builder::{configmap::ConfigMapBuilder, meta::ObjectMetaBuilder},
     commons::product_image_selection::ResolvedProductImage,
     k8s_openapi::api::core::v1::{ConfigMap, Service},
-    kube::{runtime::reflector::ObjectRef, Resource, ResourceExt},
+    kube::{runtime::reflector::ObjectRef, Resource},
     utils::cluster_info::KubernetesClusterInfo,
 };
 
@@ -35,27 +35,8 @@ pub enum Error {
     },
 }
 
-/// Builds discovery [`ConfigMap`]s for connecting to a [`v1alpha1::OpaCluster`] for all expected scenarios
-pub fn build_discovery_configmaps(
-    owner: &impl Resource,
-    opa: &v1alpha1::OpaCluster,
-    resolved_product_image: &ResolvedProductImage,
-    svc: &Service,
-    cluster_info: &KubernetesClusterInfo,
-) -> Result<Vec<ConfigMap>, Error> {
-    let name = owner.name_any();
-    Ok(vec![build_discovery_configmap(
-        &name,
-        owner,
-        opa,
-        resolved_product_image,
-        svc,
-        cluster_info,
-    )?])
-}
-
 /// Build a discovery [`ConfigMap`] containing information about how to connect to a certain [`v1alpha1::OpaCluster`]
-fn build_discovery_configmap(
+pub fn build_discovery_configmap(
     name: &str,
     owner: &impl Resource,
     opa: &v1alpha1::OpaCluster,
diff --git a/rust/operator-binary/src/main.rs b/rust/operator-binary/src/main.rs
index 45b79690..5ad7b972 100644
--- a/rust/operator-binary/src/main.rs
+++ b/rust/operator-binary/src/main.rs
@@ -31,6 +31,7 @@
 mod controller;
 mod discovery;
 mod operations;
 mod product_logging;
+mod service;
 
 pub mod built_info {
     include!(concat!(env!("OUT_DIR"), "/built.rs"));
diff --git a/rust/operator-binary/src/service.rs b/rust/operator-binary/src/service.rs
new file mode 100644
index 00000000..b065ba38
--- /dev/null
+++ b/rust/operator-binary/src/service.rs
@@ -0,0 +1,170 @@
+use snafu::{ResultExt, Snafu};
+use stackable_opa_operator::crd::{v1alpha1, APP_NAME};
+use stackable_operator::{
+    builder::meta::ObjectMetaBuilder,
+    commons::product_image_selection::ResolvedProductImage,
+    k8s_openapi::{
+        api::core::v1::{Service, ServicePort, ServiceSpec},
+        apimachinery::pkg::util::intstr::IntOrString,
+    },
+    kube::runtime::reflector::ObjectRef,
+    kvp::{Label, LabelError, Labels},
+    role_utils::RoleGroupRef,
+};
+
+use crate::controller::{build_recommended_labels, APP_PORT, APP_PORT_NAME, METRICS_PORT_NAME};
+
+type Result<T, E = Error> = std::result::Result<T, E>;
+
+#[derive(Snafu, Debug)]
+pub enum Error {
+    #[snafu(display("object {opa} is missing metadata to build owner reference"))]
+    ObjectMissingMetadataForOwnerRef {
+        source: stackable_operator::builder::meta::Error,
+        opa: ObjectRef<v1alpha1::OpaCluster>,
+    },
+
+    #[snafu(display("object has no name associated"))]
+    NoName,
+
+    #[snafu(display("failed to build object meta data"))]
+    ObjectMeta {
+        source: stackable_operator::builder::meta::Error,
+    },
+
+    #[snafu(display("failed to build label"))]
+    BuildLabel { source: LabelError },
+}
+
+pub struct ServiceConfig {
+    pub name: String,
+    pub internal_traffic_policy: String,
+}
+
+pub fn build_discoverable_services(
+    opa: &v1alpha1::OpaCluster,
+    resolved_product_image: &ResolvedProductImage,
+    service_configs: Vec<ServiceConfig>,
+) -> Result<Vec<Service>> {
+    let mut services = vec![];
+
+    // discoverable role services
+    for sc in service_configs {
+        services.push(build_server_role_service(opa, resolved_product_image, sc)?);
+    }
+
+    Ok(services)
+}
+
+fn build_server_role_service(
+    opa: &v1alpha1::OpaCluster,
+    resolved_product_image: &ResolvedProductImage,
+    service_config: ServiceConfig,
+) -> Result<Service> {
+    let role_name = v1alpha1::OpaRole::Server.to_string();
+
+    let metadata = ObjectMetaBuilder::new()
+        .name_and_namespace(opa)
+        .name(service_config.name)
+        .ownerreference_from_resource(opa, None, Some(true))
+        .context(ObjectMissingMetadataForOwnerRefSnafu {
+            opa: ObjectRef::from_obj(opa),
+        })?
+        .with_recommended_labels(build_recommended_labels(
+            opa,
+            &resolved_product_image.app_version_label,
+            &role_name,
+            "global",
+        ))
+        .context(ObjectMetaSnafu)?
+        .build();
+
+    let service_selector_labels =
+        Labels::role_selector(opa, APP_NAME, &role_name).context(BuildLabelSnafu)?;
+
+    let service_spec = ServiceSpec {
+        type_: Some(opa.spec.cluster_config.listener_class.k8s_service_type()),
+        ports: Some(vec![ServicePort {
+            name: Some(APP_PORT_NAME.to_string()),
+            port: APP_PORT.into(),
+            protocol: Some("TCP".to_string()),
+            ..ServicePort::default()
+        }]),
+        selector: Some(service_selector_labels.into()),
+        internal_traffic_policy: Some(service_config.internal_traffic_policy),
+        ..ServiceSpec::default()
+    };
+
+    Ok(Service {
+        metadata,
+        spec: Some(service_spec),
+        status: None,
+    })
+}
+
+/// The rolegroup [`Service`] is a headless service that allows direct access to the instances of a certain rolegroup
+///
+/// This is mostly useful for internal communication between peers, or for clients that perform client-side load balancing.
+pub fn build_rolegroup_service(
+    opa: &v1alpha1::OpaCluster,
+    resolved_product_image: &ResolvedProductImage,
+    rolegroup: &RoleGroupRef<v1alpha1::OpaCluster>,
+) -> Result<Service> {
+    let prometheus_label =
+        Label::try_from(("prometheus.io/scrape", "true")).context(BuildLabelSnafu)?;
+
+    let metadata = ObjectMetaBuilder::new()
+        .name_and_namespace(opa)
+        .name(rolegroup.object_name())
+        .ownerreference_from_resource(opa, None, Some(true))
+        .context(ObjectMissingMetadataForOwnerRefSnafu {
+            opa: ObjectRef::from_obj(opa),
+        })?
+        .with_recommended_labels(build_recommended_labels(
+            opa,
+            &resolved_product_image.app_version_label,
+            &rolegroup.role,
+            &rolegroup.role_group,
+        ))
+        .context(ObjectMetaSnafu)?
+        .with_label(prometheus_label)
+        .build();
+
+    let service_selector_labels =
+        Labels::role_group_selector(opa, APP_NAME, &rolegroup.role, &rolegroup.role_group)
+            .context(BuildLabelSnafu)?;
+
+    let service_spec = ServiceSpec {
+        // Internal communication does not need to be exposed
+        type_: Some("ClusterIP".to_string()),
+        cluster_ip: Some("None".to_string()),
+        ports: Some(service_ports()),
+        selector: Some(service_selector_labels.into()),
+        publish_not_ready_addresses: Some(true),
+        ..ServiceSpec::default()
+    };
+
+    Ok(Service {
+        metadata,
+        spec: Some(service_spec),
+        status: None,
+    })
+}
+
+fn service_ports() -> Vec<ServicePort> {
+    vec![
+        ServicePort {
+            name: Some(APP_PORT_NAME.to_string()),
+            port: APP_PORT.into(),
+            protocol: Some("TCP".to_string()),
+            ..ServicePort::default()
+        },
+        ServicePort {
+            name: Some(METRICS_PORT_NAME.to_string()),
+            port: 9504, // Arbitrary port number, this is never actually used anywhere
+            protocol: Some("TCP".to_string()),
+            target_port: Some(IntOrString::String(APP_PORT_NAME.to_string())),
+            ..ServicePort::default()
+        },
+    ]
+}