Releases: nasa/cumulus
v2.0.8
Release v2.0.8
Please note: this is a backport release for the 2.0.x release series. For the latest release as of this notice, see https://github.com/nasa/cumulus/releases/tag/v3.0.0
Fixed
- CUMULUS-2203
  - Update Core tasks to use `cumulus-message-adapter-js` v1.3.2 to resolve memory leak/lambda ENOMEM constant failure issue. This issue caused lambdas to slowly use all memory in the run environment and prevented AWS from halting/restarting warmed instances when task code was throwing consistent errors under load.
v3.0.0
Migration Steps
Please be sure to follow all migration steps during your Cumulus deployment
Update Queue Workflow Configuration
- All references to `meta.queues` in workflow configuration must be replaced with references to queue URLs from Terraform resources. See the updated data cookbooks or example Discover Granules workflow configuration.
- The steps for configuring queued execution throttling have changed. See the updated documentation.
- In addition to the configuration for execution throttling, the internal mechanism for tracking executions by queue has changed. As a result, you should disable any rules or workflows scheduling executions via a throttled queue before upgrading. Otherwise, you may be at risk of having twice as many executions as are configured for the queue while the updated tracking is deployed. You can re-enable these rules/workflows once the upgrade is complete.
Deploy Cumulus EMS (optional)
- EMS resources are now optional, and `ems_deploy` is set to `false` by default, which will delete your EMS resources.
- If you would like to keep any deployed EMS resources, add the `ems_deploy` variable set to `true` in your `cumulus-tf/terraform.tfvars`
TEA Deployment as a Separate Terraform Module
- Before you re-deploy your `cumulus-tf` module, note that the `thin-egress-app` is no longer deployed by default as part of the `cumulus` module, so you must add the TEA module to your deployment and manually modify your Terraform state to avoid losing your API gateway and impacting any CloudFront endpoints pointing to those gateways. If you don't care about losing your API gateway and impacting CloudFront endpoints, you can ignore the instructions for manually modifying state.
  - Add the `thin-egress-app` module to your `cumulus-tf` deployment as shown in the Cumulus example deployment.
    - Note that the values for the `tea_stack_name` variable to the `cumulus` module and the `stack_name` variable to the `thin_egress_app` module must match
    - Also, if you are specifying the `stage_name` variable to the `thin_egress_app` module, the value of the `tea_api_gateway_stage` variable to the `cumulus` module must match it
    - If you were previously setting the `log_api_gateway_to_cloudwatch` variable for the `cumulus` module, that variable should now be set directly on the `thin_egress_app` module that is deployed separately. If you want these logs from TEA to be delivered to a shared log destination, you still need to set the `log_destination_arn` variable for the `cumulus` module.
  - If you want to preserve your existing `thin-egress-app` API gateway and avoid having to update your CloudFront endpoint for distribution, then you must follow these instructions: https://nasa.github.io/cumulus/docs/upgrade-notes/migrate_tea_standalone. Otherwise, you can re-deploy as usual.
- If you provide your own custom bucket map to TEA as a standalone module, you must ensure that your custom bucket map includes mappings for the `protected` and `public` buckets specified in your `cumulus-tf/terraform.tfvars`, otherwise Cumulus may not be able to determine the correct distribution URL for ingested files and you may encounter errors
Update Dashboard
Cumulus dashboard 2.0.0 has been released to work with this release
Breaking Changes
- CUMULUS-2200
  - Changes return from 303 redirect to 200 success for `Granule Inventory`'s `/reconciliationReport` returns. The user (dashboard) must read the value of `url` from the return to get the s3SignedURL and then download the report.
- CUMULUS-2099
  - `meta.queues` has been removed from Cumulus core workflow messages. (see migration steps above)
  - `@cumulus/sf-sqs-report` workflow task no longer reads the reporting queue URL from `input.meta.queues.reporting` on the incoming event. Instead, it requires that the queue URL be set as the `reporting_queue_url` environment variable on the deployed Lambda.
- CUMULUS-2111
  - The deployment of the `thin-egress-app` module has been removed from `tf-modules/distribution`, which is a part of the `tf-modules/cumulus` module. Thus, the `thin-egress-app` module is no longer deployed for you by default. See the migration steps for details about how to add deployment for the `thin-egress-app`.
- CUMULUS-2141
  - The `parse-pdr` task has been updated to respect the `NODE_NAME` property in a PDR's `FILE_GROUP`. If a `NODE_NAME` is present, the task will query the Cumulus API for a provider with that host. If a provider is found, the output granule from the task will contain a `provider` property containing that provider. If `NODE_NAME` is set but a provider with that host cannot be found in the API, or if multiple providers are found with that same host, the task will fail.
  - The `queue-granules` task has been updated to expect an optional `granule.provider` property on each granule. If present, the granule will be enqueued using that provider. If not present, the task's `config.provider` will be used instead.
- CUMULUS-2197
  - EMS resources are now optional and will not be deployed by default. See migration steps for information about how to deploy EMS resources.
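The provider-selection rules described under CUMULUS-2141 above can be sketched roughly as follows. This is only an illustration of the stated behavior, not the task source: the function names `resolveProviderForNodeName` and `selectProvider` are hypothetical, and the list of matching providers is assumed to have been fetched already.

```javascript
// Hypothetical helpers illustrating the CUMULUS-2141 rules.
// None of these names come from the Cumulus codebase.

// parse-pdr side: given a NODE_NAME and the providers whose host matched it,
// return the single matching provider or fail.
function resolveProviderForNodeName(nodeName, matchingProviders) {
  // No NODE_NAME in the FILE_GROUP: nothing to resolve.
  if (nodeName === undefined) return undefined;
  if (matchingProviders.length === 0) {
    throw new Error(`No provider found with host ${nodeName}`);
  }
  if (matchingProviders.length > 1) {
    throw new Error(`Multiple providers found with host ${nodeName}`);
  }
  return matchingProviders[0];
}

// queue-granules side: prefer the granule's own provider and fall back to
// the provider from the task configuration.
function selectProvider(granule, config) {
  return granule.provider !== undefined ? granule.provider : config.provider;
}
```

With these helpers, a granule that carries its own `provider` is enqueued with it, while a granule without one falls back to `config.provider`.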
Breaking Code Changes
- The `@cumulus/api-client.providers.getProviders` function now takes a `queryStringParameters` parameter which can be used to filter the providers which are returned
- The `@cumulus/aws-client/S3.getS3ObjectReadStreamAsync` function has been removed. It read the entire S3 object into memory before returning a read stream, which could cause Lambdas to run out of memory. Use `@cumulus/aws-client/S3.getObjectReadStream` instead.
- The `@cumulus/ingest/util.lookupMimeType` function now returns `undefined` rather than `null` if the mime type could not be found.
- The `@cumulus/ingest/lock.removeLock` function now returns `undefined`
- The `@cumulus/ingest/granule.generateMoveFileParams` function now returns `source: undefined` and `target: undefined` on the response object if either could not be determined. Previously, `null` had been returned.
- The `@cumulus/ingest/recursion.recursion` function must now be imported using `const { recursion } = require('@cumulus/ingest/recursion');`
- The `@cumulus/ingest/granule.getRenamedS3File` function has been renamed to `listVersionedObjects`
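Several of the functions above now return `undefined` where they used to return `null`, so callers that compare strictly against `null` may stop detecting the "not found" case. The sketch below shows the pitfall with a stand-in lookup; `lookupMimeTypeStandIn` and its table of types are illustrative assumptions, not the Cumulus implementation.

```javascript
// Stand-in for a lookup such as lookupMimeType. NOT the Cumulus function;
// the table below is illustrative only.
const KNOWN_TYPES = { '.hdf': 'application/x-hdf', '.xml': 'application/xml' };

function lookupMimeTypeStandIn(filename) {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot);
  return KNOWN_TYPES[ext]; // undefined when the extension is unknown
}

// A strict null check no longer matches once the function returns undefined,
// so the "unknown" branch is silently skipped.
function describeStrict(filename) {
  const type = lookupMimeTypeStandIn(filename);
  return type === null ? 'unknown' : type;
}

// `== null` matches both null and undefined, so it works before and after
// the change.
function describeSafe(filename) {
  const type = lookupMimeTypeStandIn(filename);
  return type == null ? 'unknown' : type;
}
```

Checking with `== null` (or `=== undefined` once the upgrade is complete) keeps the fallback branch reachable.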
Notable Changes
- Added filter parameters to the Reconciliation Reports API
- Added `update-cmr-access-constraints` workflow task to allow for granule metadata modification to hide or show granule. Rules can be configured via the MMT. See operator documentation.
- Added an internal reconciliation report for comparison between DynamoDB and Elasticsearch
- Added a Granule Not Found reconciliation report for a breakdown by Granule of files both in DynamoDB and S3
- Rules now support an `executionNamePrefix` property. If set, any executions triggered as a result of that rule will use that prefix in the name of the execution. See the notes for CUMULUS-2161 below.
- Fixed a race condition with bulk granule delete causing deleted granules to still appear in Elasticsearch. Granules removed via bulk delete should now be removed from Elasticsearch.
All Changes
Added
- CUMULUS-1855
  - Fixed SyncGranule task to return an empty granules list when given an empty (or absent) granules list on input, rather than throwing an exception
- CUMULUS-1955
  - Added `@cumulus/aws-client/S3.getObject` to get an AWS S3 object
  - Added `@cumulus/aws-client/S3.waitForObject` to get an AWS S3 object, retrying, if necessary
- CUMULUS-1961
  - Adds `startTimestamp` and `endTimestamp` parameters to endpoint `reconciliationReports`. Setting these values will filter the returned report to Cumulus data that falls within the timestamps. It also causes the report to be one directional, meaning Cumulus is only reconciled with CMR, but not the other direction. The Granules will be filtered by their `updatedAt` values. Collections are filtered by the updatedAt time of their granules, i.e. Collections with granules that have an updatedAt time between the time parameters will be returned in the reconciliation reports.
  - Adds `startTimestamp` and `endTimestamp` parameters to create-reconciliation-reports lambda function. If either of these params is passed in with a value that can be converted to a date object, the inter-platform comparison between Cumulus and CMR will be one way. That is, collections, granules, and files will be filtered by time for those found in Cumulus and only those compared to the CMR holdings. For the moment there is not enough information to change the internal consistency check, and S3 vs Cumulus comparisons are unchanged by the timestamps.
- CUMULUS-1962
  - Adds `location` as parameter to `/reconciliationReports` endpoint. Options are `S3` resulting in a S3 vs. Cumulus database search or `CMR` resulting in CMR vs. Cumulus database search.
- CUMULUS-1963
  - Adds `granuleId` as input parameter to `/reconciliationReports` endpoint. Limits input parameters to either `collectionI...
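The `updatedAt` filtering described under CUMULUS-1961 above might look roughly like the sketch below. It is an assumption-laden illustration, not the report code: `filterByUpdatedAt` and `toTime` are hypothetical names, `updatedAt` is assumed to be an epoch-millisecond number, the bounds are assumed to be inclusive, and rejecting unparseable timestamps is a choice made for this example.

```javascript
// Hypothetical sketch of time-window filtering on `updatedAt`.
// Assumes `updatedAt` is epoch milliseconds and bounds are inclusive.

// Convert an optional timestamp (string, number, or Date) to epoch ms.
function toTime(value) {
  if (value === undefined) return undefined;
  const time = new Date(value).getTime();
  if (Number.isNaN(time)) throw new TypeError(`Invalid timestamp: ${value}`);
  return time;
}

// Keep only records whose updatedAt falls inside the optional window.
function filterByUpdatedAt(records, { startTimestamp, endTimestamp } = {}) {
  const start = toTime(startTimestamp);
  const end = toTime(endTimestamp);
  return records.filter((record) =>
    (start === undefined || record.updatedAt >= start) &&
    (end === undefined || record.updatedAt <= end));
}
```

Leaving both timestamps unset keeps every record, which corresponds to the unfiltered report.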
v2.0.7
Fixed
- CVE-2020-7720
  - Updated common `node-forge` dependency to 0.10.0 to address CVE finding
v2.0.6
Fixed
- CUMULUS-2168
  - Fixed an issue where a large number of documents (generally logs) in the `cumulus` Elasticsearch index caused the collection granule stats queries to fail for the collections list API endpoint
v2.0.5
Added
- Added `thin_egress_stack_name` variable to `cumulus` and `distribution` Terraform modules to allow overriding the default CloudFormation stack name used for the `thin-egress-app`. Please note that if you change/set this value for an existing deployment, it will destroy and re-create your API gateway for the `thin-egress-app`.
Fixed
- Fix collection list queries. Removed fixes to collection stats, which break queries for a large number of granules.
v2.0.4
v2.0.3
Fixed
- CUMULUS-1961
  - Fixed `activeCollections` query only returning 10 results
- CUMULUS-2039
  - Fix issue causing SyncGranules task to run out of memory on large granules
v2.0.2
Fixed
- CUMULUS-2116
  - Fixed a race condition with bulk granule delete causing deleted granules to still appear in Elasticsearch. Granules removed via bulk delete should now be removed from Elasticsearch.
Added
- CUMULUS-2116
  - Added `@cumulus/api/models/granule.unpublishAndDeleteGranule` which unpublishes a granule from CMR and deletes it from Cumulus, but does not update the record to `published: false` before deletion
v2.0.1
v2.0.0
Migration Steps
- Upgrade your Cumulus dashboard to version 1.10.0
- Due to an issue with the AWS API Gateway and how the Thin Egress App Cloudformation template applies updates, you may need to redeploy your `thin-egress-app-EgressGateway` manually as a one time migration step. If your deployment fails with an error similar to:
  `Error: Lambda function (<stack>-tf-TeaCache) returned error: ({"errorType":"HTTPError","errorMessage":"Response code 404 (Not Found)"})`
  Then follow the AWS instructions to Redeploy a REST API to a stage for your egress API and re-run `terraform apply`.
- Update rules to specify the `provider_path` and workflows to get the `provider_path` from `config.meta.provider_path`. Collections no longer support the `provider_path` property.
- Cumulus tasks using the `cumuluss/cumulus-ecs-task` Docker image must be updated to `cumuluss/cumulus-ecs-task:1.7.0` to accommodate an upgrade to Node 12.18.0.
Breaking Changes
- The minimum supported version of all published Cumulus packages is now Node 12.18.0
  - Tasks using the `cumuluss/cumulus-ecs-task` Docker image must be updated to `cumuluss/cumulus-ecs-task:1.7.0`. This can be done by updating the `image` property of any tasks defined using the `cumulus_ecs_service` Terraform module.
- CUMULUS-1969
  - The `DiscoverPdrs` task now expects `provider_path` to be provided at `event.config.provider_path`, not `event.config.collection.provider_path`
  - `event.config.provider_path` is now a required parameter of the `DiscoverPdrs` task
  - `event.config.collection` is no longer a parameter to the `DiscoverPdrs` task
  - Collections no longer support the `provider_path` property. The tasks that relied on that property are now referencing `config.meta.provider_path`. Workflows should be updated accordingly.
- CUMULUS-1977
  - Moved bulk granule deletion endpoint from `/bulkDelete` to `/granules/bulkDelete`
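For the CUMULUS-1969 change above, the new location of `provider_path` in a task event can be sketched as follows. The helper name `getProviderPath` and the error text are hypothetical; the sketch only illustrates that the value is read from `event.config.provider_path` and that the old `event.config.collection.provider_path` location is no longer consulted.

```javascript
// Hypothetical helper, not the DiscoverPdrs source. It reads the required
// provider_path from its new location on the task event.
function getProviderPath(event) {
  const config = (event && event.config) || {};
  if (typeof config.provider_path !== 'string') {
    // The old location, event.config.collection.provider_path, is ignored.
    throw new Error('event.config.provider_path is required');
  }
  return config.provider_path;
}

// New-style event: the path sits directly on the task config.
const newStyleEvent = { config: { provider_path: 'pdrs/incoming' } };
console.log(getProviderPath(newStyleEvent));
```

An event that still carries the path only under `config.collection` fails fast, which makes an unmigrated workflow easy to spot.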
Breaking Code Changes
- Changes to the `@cumulus/cumulus-api` package
  - The `CumulusApiClientError` class must now be imported using `const { CumulusApiClientError } = require('@cumulus/cumulus-api/CumulusApiClientError')`
- The `@cumulus/sftp-client/SftpClient` class must now be imported using `const { SftpClient } = require('@cumulus/sftp-client');`
- Instances of `@cumulus/ingest/SftpProviderClient` no longer implicitly connect when `download`, `list`, or `sync` are called. You must call `connect` on the provider client before issuing one of those calls. Failure to do so will result in a "Client not connected" exception being thrown.
- Instances of `@cumulus/ingest/SftpProviderClient` no longer implicitly disconnect from the SFTP server when `list` is called.
- Instances of `@cumulus/sftp-client/SftpClient` must now be explicitly closed by calling `.end()`
- Instances of `@cumulus/sftp-client/SftpClient` no longer implicitly connect to the server when `download`, `unlink`, `syncToS3`, `syncFromS3`, and `list` are called. You must explicitly call `connect` before calling one of those methods.
Changes to the
@cumulus/common
packagecloudwatch-event.getSfEventMessageObject()
now returnsundefined
if the
message could not be found or could not be parsed. It previously returned
null
.S3KeyPairProvider.decrypt()
now throws an exception if the bucket
containing the key cannot be determined.S3KeyPairProvider.decrypt()
now throws an exception if the stack cannot be
determined.S3KeyPairProvider.encrypt()
now throws an exception if the bucket
containing the key cannot be determined.S3KeyPairProvider.encrypt()
now throws an exception if the stack cannot be
determined.sns-event.getSnsEventMessageObject()
now returnsundefined
if it could
not be parsed. It previously returnednull
.- The
aws
module has been removed. - The
BucketsConfig.buckets
property is now read-only and private - The
test-utils.validateConfig()
function now resolves toundefined
rather thantrue
. - The
test-utils.validateInput()
function now resolves toundefined
rather
thantrue
. - The
test-utils.validateOutput()
function now resolves toundefined
rather thantrue
. - The static
S3KeyPairProvider.retrieveKey()
function has been removed.
- Changes to the `@cumulus/cmrjs` package
  - `@cumulus/cmrjs.constructOnlineAccessUrl()` and `@cumulus/cmrjs/cmr-utils.constructOnlineAccessUrl()` previously took a `buckets` parameter, which was an instance of `@cumulus/common/BucketsConfig`. They now take a `bucketTypes` parameter, which is a simple object mapping bucket names to bucket types. Example: `{ 'private-1': 'private', 'public-1': 'public' }`
  - `@cumulus/cmrjs.reconcileCMRMetadata()` and `@cumulus/cmrjs/cmr-utils.reconcileCMRMetadata()` now take a required `bucketTypes` parameter, which is a simple object mapping bucket names to bucket types. Example: `{ 'private-1': 'private', 'public-1': 'public' }`
  - `@cumulus/cmrjs.updateCMRMetadata()` and `@cumulus/cmrjs/cmr-utils.updateCMRMetadata()` previously took an optional `inBuckets` parameter, which was an instance of `@cumulus/common/BucketsConfig`. They now take a required `bucketTypes` parameter, which is a simple object mapping bucket names to bucket types. Example: `{ 'private-1': 'private', 'public-1': 'public' }`
- Changes to `@cumulus/aws-client/S3`
  - The signature of the `getObjectSize` function has changed. It now takes a params object with three properties:
    - s3: an instance of an AWS.S3 object
    - bucket
    - key
  - The `getObjectSize` function will no longer retry if the object does not exist
- CUMULUS-1861
  - `@cumulus/message/Collections.getCollectionIdFromMessage` now throws a `CumulusMessageError` if `collectionName` and `collectionVersion` are missing from `meta.collection`. Previously this method would return `'undefined___undefined'` instead
  - `@cumulus/integration-tests/addCollections` now returns an array of collections that were added rather than the count of added collections
- CUMULUS-1930
  - The `@cumulus/common/util.uuid()` function has been removed
- CUMULUS-1955
  - `@cumulus/aws-client/S3.multipartCopyObject` now returns an object with the AWS `etag` of the destination object
  - `@cumulus/ingest/S3ProviderClient.list` now sets a file object's `path` property to `undefined` instead of `null` when the file is at the top level of its bucket
  - The `sync` methods of the following classes in the `@cumulus/ingest` package now return an object with the AWS `s3uri` and `etag` of the destination file (they previously returned only a string representing the S3 URI)
    - `FtpProviderClient`
    - `HttpProviderClient`
    - `S3ProviderClient`
    - `SftpProviderClient`
- CUMULUS-1958
  - The following methods exported from `@cumulus/cmrjs/cmr-utils` were made async and now take `distributionBucketMap` as a parameter:
    - constructOnlineAccessUrl
    - generateFileUrl
    - reconcileCMRMetadata
    - updateCMRMetadata
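The explicit connection handling now required for the SFTP clients listed above can be sketched with a stand-in class. `FakeSftpClient` and `listRemoteFiles` are hypothetical names used only for this illustration; the stand-in models `connect`, `list`, and `end` and does not reproduce the real `@cumulus/sftp-client` API beyond what the notes above state.

```javascript
// Stand-in client that mimics the explicit-connection behavior described
// above. It is NOT @cumulus/sftp-client; it only models connect/list/end.
class FakeSftpClient {
  constructor() {
    this.connected = false;
  }

  async connect() {
    this.connected = true;
  }

  async list(path) {
    // Calls made before connect() fail instead of connecting implicitly.
    if (!this.connected) throw new Error('Client not connected');
    return [{ name: 'example.pdr', path }];
  }

  async end() {
    this.connected = false;
  }
}

// Connect first, and always close the connection, even when a call fails.
async function listRemoteFiles(client, path) {
  await client.connect();
  try {
    return await client.list(path);
  } finally {
    await client.end();
  }
}
```

Wrapping the calls in `try`/`finally` is one way to make sure `.end()` runs on every code path now that the client no longer disconnects on its own.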
Notable Changes
- CUMULUS-1991
  - Updated CMR metadata generation to use "Download file.hdf" (where `file.hdf` is the filename of the given resource) as the resource description instead of "File to download"
  - CMR metadata updates now respect changes to resource descriptions (previously only changes to resource URLs were respected)
- CUMULUS-1902
  - Added Common Use Cases section under Operator Docs
- CUMULUS-1417
  - Added a `checksumFor` property to collection `files` config. Set this property on a checksum file's definition matching the `regex` of the target file. More details in the 'Data Cookbooks Setup' documentation.
  - Added `checksumFor` validation to collections model.
- CUMULUS-1956
  - The `/s3credentials` endpoint that is deployed as part of distribution now supports authentication using tokens created by a different application. If a request contains the `EDL-ClientId` and `EDL-Token` headers, authentication will be handled using that token rather than attempting to use OAuth.
  - If the `s3Credentials` endpoint is invoked with an EDL token and an `X-Request-Id` header, that `X-Request-Id` header will be forwarded to Earthdata Login.
- CUMULUS-1958
  - Add the ability for users to specify a `bucket_map_key` to the `cumulus` terraform module as an override for the default .yaml values that are passed to TEA by Core. Using this option requires that each configured Cumulus 'distribution' bucket (e.g. public/protected buckets) have a single TEA mapping. Multiple maps per bucket are not supported.
  - Updated Generating a distribution URL, the MoveGranules task and all CMR reconciliation functionality to utilize the TEA bucket map override.
  - Updated deploy process to utilize a bootstrap 'tea-map-cache' lambda that will, after deployment of Cumulus Core's TEA instance, query TEA for all protected/public buckets and generate a mapping configuration used internally by Core. This object is also exposed as an output of the Cumulus module as `distribution_bucket_map`.
  - docs
- CUMULUS-1982
- The `globalCo...