Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Thredds to supported version #413

Merged
merged 25 commits into from
Dec 5, 2024
Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
d0cbc3d
update thredds to version 5.4
mishaschwartz Dec 14, 2023
cf56f8c
thredds: update to latest version and apply WMS fix
tlvu Jan 17, 2024
32cef56
merge with master
tlvu Jan 17, 2024
9996a3c
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Feb 16, 2024
938ac18
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Feb 23, 2024
229256d
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Mar 23, 2024
e1c5b27
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Mar 25, 2024
897f972
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Apr 4, 2024
7c72500
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Apr 15, 2024
a4ea349
Merge branch 'master' into update-thredds-5.4
mishaschwartz May 1, 2024
f3264ac
add THREDDS_TAGGED variable since THREDDS_VERSION is not a proper sem…
mishaschwartz May 1, 2024
bb2d0c4
Merge branch 'master' into update-thredds-5.4
mishaschwartz May 1, 2024
014576a
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Sep 10, 2024
70b5fe0
thredds: update to version 5.5-unidata-2024-09-10
tlvu Sep 10, 2024
8df5da1
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Sep 17, 2024
439789f
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Oct 1, 2024
e6e4512
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Nov 19, 2024
f231535
thredds: update NCSS NetcdfSubset service config for v5
tlvu Nov 20, 2024
3bde344
thredds: use v5.5 with missing jar
tlvu Nov 21, 2024
154caf5
treat new service paths as path-prefixes in magpie
mishaschwartz Nov 21, 2024
e2ec935
Revert "treat new service paths as path-prefixes in magpie"
mishaschwartz Nov 21, 2024
bafcafc
testhredds: wrong context-root variable
tlvu Nov 22, 2024
936a8ee
testthredds: fix typo in catalog.xml in previous commit
tlvu Nov 22, 2024
43646c5
Merge remote-tracking branch 'origin/master' into update-thredds-5.4
tlvu Dec 5, 2024
ba4d044
Bump version: 2.6.2 → 2.6.3
tlvu Dec 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,12 @@
[Unreleased](https://github.com/bird-house/birdhouse-deploy/tree/master) (latest)
------------------------------------------------------------------------------------------------------------------

[//]: # (list changes here, using '-' for each new entry, remove this when items are added)
## Changes

- Update Thredds to supported version

Unidata has dropped support for TDS versions < 5.x. This updates Thredds to version 5.5.


[2.6.0](https://github.com/bird-house/birdhouse-deploy/tree/2.6.0) (2024-11-19)
------------------------------------------------------------------------------------------------------------------
Expand Down
3 changes: 2 additions & 1 deletion birdhouse/components/thredds/catalog.xml.template
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,8 @@
<service name="iso" serviceType="ISO" base="${TWITCHER_PROTECTED_PATH}/thredds/iso/"/>
<service name="wcs" serviceType="WCS" base="${TWITCHER_PROTECTED_PATH}/thredds/wcs/" />
<service name="wms" serviceType="WMS" base="${TWITCHER_PROTECTED_PATH}/thredds/wms/" />
<service name="subsetServer" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss/" />
<service name="ncssGrid" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss/grid/" />
<service name="ncssPoint" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss/point/" />
</service>

<datasetScan name="${THREDDS_SERVICE_DATA_LOCATION_NAME}" ID="${THREDDS_SERVICE_DATA_URL_PATH}" path="${THREDDS_SERVICE_DATA_URL_PATH}" location="${THREDDS_SERVICE_DATA_LOCATION_ON_CONTAINER}">
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ SERVICES['Thredds'] = {
'name': 'Thredds',
'synopsis': 'Climate Data Catalog and Format Renderers',
'version': "${THREDDS_VERSION}",
'releaseTime': get_release_time_from_repo_tag("docker", "${THREDDS_DOCKER}", "${THREDDS_VERSION}"),
'releaseTime': get_release_time_from_repo_tag("docker", "${THREDDS_DOCKER}", "${THREDDS_TAGGED}"),
'institution': 'Ouranos',
'researchSubject': 'Catalog',
'supportEmail': '${BIRDHOUSE_SUPPORT_EMAIL}',
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ providers:
dodsC,
wcs,
wms,
ncss,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave the old location.
THREDDS_VERSION can be overridden, so a server could still employ the older variant.

Given that, that will most probably cause a conflict in the data_type resolution order if all 3 are defined, so maybe alternate paths or a dynamic variable resolution will be needed here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh... you're right. Ok I'll figure something out.

ncss/grid,
ncss/point,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test if this works?

I do not know if Magpie/Twitcher handles this properly.
The data types were not planned to include a /, so I'm not sure if a url.split("/") might happen somewhere in the code leading to misinterpretation of the targeted data_type.

Copy link
Collaborator Author

@mishaschwartz mishaschwartz Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seemed to be fine but I didn't rigorously test by trying to read the two options with different directory permissions I guess. I'll look into that.

BUT... if these prefixes can't handle a / we should change the service definitions to:

<service name="ncssGrid" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-grid/" />
<service name="ncssPoint" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss-point/" />

or similar and that way we don't have to worry about that at all

Copy link
Collaborator Author

@mishaschwartz mishaschwartz Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok maybe we can't change the path... the URL construction docs for this service look like point or grid needs to be a subpath under ncss:

https://docs.unidata.ucar.edu/tds/current/userguide/netcdf_subset_service_ref.html#url-construction

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data types were not planned to include a /, so I'm not sure if a url.split("/") might happen somewhere in the code leading to misinterpretation of the targeted data_type.

Yup magpie splits it up (see here) so our only option at this point is to keep the magpie config as it is and treat grid/ and point/ as directory paths if we want to differentiate their magpie permissions.

That's going to cause other confusion though...

Right now we have URL paths like:

  • ${FQDN}/twitcher/ows/proxy/thredds/ncss/datasets/...
  • ${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...

So if we want to set specific permissions on the datasets/ directory we can do that by setting one directory permission rule on the datasets/ subdirectory and it will work for all services.

But if we instead have:

  • ${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/...
  • ${FQDN}/twitcher/ows/proxy/thredds/dodsC/datasets/...

Then we would need to set the same permission rule on datasets/ as well as point/datasets/. If we don't set the same rule on point/datasets/ then the rule won't apply to resources accessed by the ncss/point service.

Please let me know if I'm interpreting this correctly @fmigneault

If I'm right then I think we need to update Magpie to handle this use case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to revert my changes for now until we figure something out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlvu Magpie cannot handle this case currently because it wasn't designed for this. It assumes THREDDS "services" don't contain a /, so it could rely on {Twitcher-prefix}/{THREDDS_name}/{THREDDS_service}/{rest-as-dir/file}.

@fmigneault I am still lost. What case are you talking about?

Just to be clear, in the current state, with Misha's proposed Magpie config change rollback, do we have any problems?

Been running some Jenkins on the new Thredds and I am having some weird errors. Not sure if it's my test system or the code. So if you foresee some problem, please let me know.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ncssGrid and ncssPoint services won't be protected like other services do (ie: a common browse/read/write set for all services) since the paths are not handled.
For the moment, if you want them protected, you need to duplicate the permission hierarchy.

i.e.:

A file accessed by ncssGrid via ${FQDN}/twitcher/ows/proxy/thredds/ncss/point/datasets/sub/file.nc will actually go through ncss service, and point will be considered by Magpie as if it was any other directory (like datasets or sub).

Therefore, the same datasets/sub/file.nc would need 3 sets of permissions:

  • datasets/sub/file.nc => (browse,read,write) for anything currently handled
  • grid/datasets/sub/file.nc => (read) only if accessing via ncssGrid
  • point/datasets/sub/file.nc => (read) only if accessing via ncssPoint

and those (read) need to be duplicated for every user/group/dir/file combination applicable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ncssGrid and ncssPoint services won't be protected like other services do (ie: a common browse/read/write set for all services) since the paths are not handled.

@fmigneault I am surprised. If we "Deny or Allow" everything under ncss path, isn't both of those variant grid/datasets/sub/file.nc and point/datasets/sub/file.nc will be properly denied or allowed, because they are all under ncss/ path?

Under ncss is read only no write anyways, since there is no write for that kind of path.

Copy link
Collaborator

@fmigneault fmigneault Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlvu
Yes.
A deny top-level will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access. The same issue also happens even when working the other way around -- allow recursive followed by restricted deny lower (see below example).

With the current implementation of THREDDS that checks the service [accses-mode] (dodC, ncss, etc.) between Twitcher-proxy-path and the rest of the dir/file path, those resources would be at different "level" for the same file reference from the point of view of Magpie. So you would need to duplicate your 'allows'/'denies' across [accses-mode].

RESOURCE        PERMISSION APPLIED / RESULT
===========================================
THREDDS         allow-recursive

ncss            [access-mode] (not a resource, cannot have permissions directly)
  datasets      deny-recursive
     sub        allow-recursve (all but sub's contents are blocked)

  public        allow-recursive
     secure     deny-recursive
       secret   allow-match only for user-1, maybe group "manager" also, others...

  point         <-- THREDDS sees this as another [access-mode], but Magpie thinks its a RESOURCE!
     datasets   
       sub      <==== OH! OH! undesired full open-access (it's not under the denied "/datasets")

     public     allow-recursive (but because on THREDDS)
       secure   if you forget to replicate above 'secure', this is still full open acces

  grid          (repeat again, another set of duplicat permissions, for each group/user combination)
    ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A deny top-level will restrict everything lower. But usually, a deny is "undone" by an allow on a lower specific resource we want to grant access. The same issue also happens even when working the other way around -- allow recursive followed by restricted deny lower (see below example).

Oh OK !!! Now I understand, thanks ! Top-level Allow or Deny will work. But adding an exception under the top-level will not.

${THREDDS_MAGPIE_EXTRA_DATA_PREFIXES}
]
6 changes: 4 additions & 2 deletions birdhouse/components/thredds/default.env
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
# All env in this default.env can be overridden by env.local.

# thredds-docker >= 4.6.18 or >= 5.2 strongly recommended to avoid Log4J CVE-2021-44228.
export THREDDS_VERSION=4.6.18-unidata-2022-01
export THREDDS_TAGGED="5.5-unidata-2024-11-19-with-tds-plugin-jar-with-dependencies.jar"
export THREDDS_VERSION="5.5.0-unidata-2024-11-19"
export THREDDS_DOCKER=pavics/thredds-docker
export THREDDS_IMAGE='${THREDDS_DOCKER}:${THREDDS_VERSION}'
export THREDDS_IMAGE='${THREDDS_DOCKER}:${THREDDS_TAGGED}'
export THREDDS_IMAGE_URI='registry.hub.docker.com/${THREDDS_IMAGE}'
export THREDDS_ORGANIZATION="Birdhouse"
export THREDDS_ADDITIONAL_CATALOG=""
Expand Down Expand Up @@ -54,6 +55,7 @@ OPTIONAL_VARS="
\$THREDDS_ORGANIZATION
\$TWITCHER_PROTECTED_PATH
\$THREDDS_DOCKER
\$THREDDS_TAGGED
\$THREDDS_VERSION
\$THREDDS_IMAGE
\$THREDDS_IMAGE_URI
Expand Down
2 changes: 1 addition & 1 deletion birdhouse/components/thredds/threddsConfig.xml.template
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ https://www.unidata.ucar.edu/software/tds/current/reference/ThreddsConfigXMLFile
</NCISO>

<!--
https://www.unidata.ucar.edu/software/tds/current/reference/ThreddsConfigXMLFile.html#DiskCache
https://docs.unidata.ucar.edu/tds/current/userguide/tds_config_ref.html#cdm-library-disk-cache

These elements control where the CDM/NetCDF-Java library writes temporary
files, for example when it needs to unzip files, or write GRIB index files,
Expand Down
2 changes: 1 addition & 1 deletion birdhouse/components/thredds/wmsConfig.xml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE wmsConfig SYSTEM "http://www.unidata.ucar.edu/schemas/thredds/dtd/ncwms/wmsConfig.dtd">
<!DOCTYPE wmsConfig SYSTEM "https://schemas.unidata.ucar.edu/thredds/dtd/ncwms/wmsConfig.dtd">
<!--
Detailed configuration of the WMS service. This config file can be used to
set default styling parameters for each dataset/variable, and to enable or disable
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,8 @@
<service name="iso" serviceType="ISO" base="/${TESTTHREDDS_CONTEXT_ROOT}/iso/"/>
<service name="wcs" serviceType="WCS" base="/${TESTTHREDDS_CONTEXT_ROOT}/wcs/" />
<service name="wms" serviceType="WMS" base="/${TESTTHREDDS_CONTEXT_ROOT}/wms/" />
<service name="subsetServer" serviceType="NetcdfSubset" base="/${TESTTHREDDS_CONTEXT_ROOT}/ncss/" />
<service name="ncssGrid" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss/grid/" />
<service name="ncssPoint" serviceType="NetcdfSubset" base="${TWITCHER_PROTECTED_PATH}/thredds/ncss/point/" />
</service>

<datasetScan name="TestDatasets" ID="testdatasets" path="testdatasets" location="/birdhouse-testdata">
Expand Down
2 changes: 1 addition & 1 deletion birdhouse/optional-components/testthredds/wmsConfig.xml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE wmsConfig SYSTEM "http://www.unidata.ucar.edu/schemas/thredds/dtd/ncwms/wmsConfig.dtd">
<!DOCTYPE wmsConfig SYSTEM "https://schemas.unidata.ucar.edu/thredds/dtd/ncwms/wmsConfig.dtd">
<!--
Detailed configuration of the WMS service. This config file can be used to
set default styling parameters for each dataset/variable, and to enable or disable
Expand Down
Loading