
ODH Operator not respecting openshift cluster-proxy #86

Closed
mediocrematt opened this issue Jan 28, 2021 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@mediocrematt

Describe the bug
After deploying the ODH Operator from the OperatorHub in OCP (4.6.8), deploying an Open Data Hub KfDef fails to pull the manifests and kf-manifests when behind an enterprise Proxy with a custom certificate authority. The proxy and trusted-CA are both defined in the global configuration cluster-proxy.

To Reproduce
Steps to reproduce the behavior:

  1. Install the ODH 0.9.0 Operator from OperatorHub into the openshift-operators namespace.
  2. Create an ODH namespace/project.
  3. Deploy a standard KfDef template, without modifications, to the odh namespace.
  4. The ODH Operator logs show that it is unable to pull from GitHub:

$ oc logs opendatahub-operator-7cf7cb66fb-5gmzn -n openshift-operators
...
time="2021-01-28T16:59:02Z" level=info msg="Creating directory /tmp/odh/opendatahub/.cache"
time="2021-01-28T16:59:02Z" level=info msg="Fetching https://github.com/opendatahub-io/manifests/tarball/master to /tmp/odh/opendatahub/.cache/kf-manifests"
time="2021-01-28T17:01:14Z" level=error msg="failed to build kfApp from URI /tmp/odh/opendatahub/config.yaml: Error: couldn't generate KfApp: (kubeflow.error): Code 500 with message: could not sync cache. Error: (kubeflow.error): Code 400 with message: couldn't download URI https://github.com/opendatahub-io/manifests/tarball/master: Get https://github.com/opendatahub-io/manifests/tarball/master: dial tcp 140.82.112.4:443: connect: connection timed out."
time="2021-01-28T17:01:14Z" level=error msg="Failed to load KfApp. Error: couldn't generate KfApp: (kubeflow.error): Code 500 with message: could not sync cache. Error: (kubeflow.error): Code 400 with message: couldn't download URI https://github.com/opendatahub-io/manifests/tarball/master: Get https://github.com/opendatahub-io/manifests/tarball/master: dial tcp 140.82.112.4:443: connect: connection timed out."
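The KfDef deployed in step 3 looks roughly like the sketch below. This is illustrative only: the metadata and application entry are placeholders, and the only value taken from this report is the manifests repo URI that the operator fails to fetch in the logs above.

```yaml
apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: opendatahub          # placeholder name
  namespace: odh
spec:
  applications:
    - name: odh-common        # placeholder application entry
      kustomizeConfig:
        repoRef:
          name: manifests
          path: odh-common
  repos:
    # This is the URI the operator tries to download through the proxy.
    - name: manifests
      uri: https://github.com/opendatahub-io/manifests/tarball/master
```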

Expected behavior
Expected that the ODH Operator would use the cluster-proxy settings to reach the Internet through the proxy and custom CA.
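For context, these are the cluster-wide settings in question: a minimal sketch of the global Proxy resource with hypothetical proxy endpoints and ConfigMap name.

```yaml
apiVersion: config.openshift.io/v1
kind: Proxy
metadata:
  name: cluster                                # the single cluster-wide Proxy object
spec:
  httpProxy: http://proxy.example.com:3128     # hypothetical enterprise proxy
  httpsProxy: http://proxy.example.com:3128
  noProxy: .cluster.local,.svc
  trustedCA:
    name: user-ca-bundle                       # ConfigMap in openshift-config with the custom CA
```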

Additional context
The current workaround is to download the manifests and temporarily rehost them internally on a non-HTTPS system.

@mediocrematt mediocrematt added the bug Something isn't working label Jan 28, 2021
@nakfour nakfour added enhancement New feature or request and removed bug Something isn't working human_intervention_required labels Feb 4, 2021
@mediocrematt
Author

I just tested the ODH 1.0 Operator, and this is still present. There is also a regression in the JupyterHub login through openshift-oauth, which also no longer respects the cluster-proxy when passing back to JupyterHub:

500 : Internal Server Error
CERTIFICATE_VERIFY_FAILED

This was not an issue on 0.9.0.

@akchinSTC

@vpavlin - any updates on this issue or possible workarounds?

@nakfour

nakfour commented Jun 18, 2021

We are migrating to https://issues.redhat.com/projects/ODH/summary; can you please move all your open issues there?

@shalberd

shalberd commented Sep 2, 2022

@LaVLaS Where does that manifest download and KfDef evaluation even happen in the code? Is it in the operator itself? I cannot find any references in the code, but then again maybe I am just too much of a newbie.

It'd be interesting to know whether, around https://github.com/kubeflow/kfctl/blob/master/pkg/controller/kfdef/kfdef_controller.go and in kfdef upstream generally, issues such as proxy support, HTTP authentication, and custom CAs were ever a concern.

Asked downstream kfctl for an opinion on adding SSL_CERT_DIR and SSL_CERT_FILE: kubeflow/kfctl#468

Also, this ticket should have been closed long ago.

See https://github.com/kubeflow/kfctl/pull/326/commits

and

16eba4a

@shalberd

shalberd commented Nov 30, 2022

@mediocrematt @PeterSulcs

Environment variables such as HTTP_PROXY, HTTPS_PROXY, and NO_PROXY have been present in the opendatahub-operator pod since at least ODH 1.4.0, and downloading from the KfDef URL works just fine, provided the destination URL has correct SSL trust.
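If the proxy variables are not injected into the operator pod automatically, they can also be set explicitly via the Subscription's spec.config.env, which OLM passes through to the operator deployment. A sketch with placeholder proxy values:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: stable
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  config:
    env:                                         # injected into the operator pod by OLM
      - name: HTTP_PROXY
        value: http://proxy.example.com:3128     # hypothetical proxy endpoint
      - name: HTTPS_PROXY
        value: http://proxy.example.com:3128
      - name: NO_PROXY
        value: .cluster.local,.svc
```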

If you also want a workaround for a custom CA, have a look at how to add a custom CA in the central OpenShift proxy config, then add the ConfigMap in your openshift-operators namespace, and read over the custom CA documentation and the Subscription CRD spec.config.

For enterprise-internal scenarios, your cluster administrators can also enable additional trusted CA certificates (root followed by intermediates for a given PKI) in PEM format, either during cluster installation via additionalTrustBundle in install-config.yaml or afterwards via the cluster proxy CRD.
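The install-time variant looks like this install-config.yaml fragment; the certificate contents and base domain are placeholders.

```yaml
apiVersion: v1
baseDomain: example.com        # placeholder
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  <root CA certificate in PEM format>
  -----END CERTIFICATE-----
  -----BEGIN CERTIFICATE-----
  <intermediate CA certificate in PEM format>
  -----END CERTIFICATE-----
```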

From ODH 1.4.1 on, we will probably pre-supply a mix-in ConfigMap, trusted-cabundle-odh, into which the Cluster Network Operator merges the user-provided (additionally trusted) CA certificates and the system CA certificates from the cluster nodes' operating system into a single CA-bundle file; that ConfigMap is then referenced in the operator Subscription spec.config.

This is especially useful if you want to download manifest.tar.gz files from an enterprise-internal server with private PKI-based SSL that is not publicly trusted. Also, the publicly trusted CAs mixed in by the Cluster Network Operator tend to be more up to date than what is shipped inside container images.

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: trusted-cabundle-odh
  namespace: openshift-operators
  labels:
    component: opendatahub-operator
    config.openshift.io/inject-trusted-cabundle: 'true'
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: opendatahub-operator
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: opendatahub-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: opendatahub-operator.v1.4.1
  config:
    selector:
      matchLabels:
        name: opendatahub-operator
    volumes:
      - name: trusted-cabundle
        configMap:
          name: trusted-cabundle-odh
          items:
            - key: ca-bundle.crt
              path: tls-ca-bundle.pem
          optional: true
    volumeMounts:
      - name: trusted-cabundle
        mountPath: /etc/pki/ca-trust/extracted/pem
        readOnly: true
```
The title is a bit misleading: the operator does respect proxy environment variables when downloading manifests; until now it just had problems with servers that use self-signed or custom CA-based certificates. In any case, together with your cluster admins, setting additional trusted CAs in PEM format plus the two modifications above will make the download work.

VaishnaviHire added a commit to VaishnaviHire/opendatahub-operator that referenced this issue Oct 27, 2023