Releases: nasa/cumulus
v10.0.1
Release v10.0.1
Fixed
- Fixed IAM permissions issue with <prefix>-postgres-migration-async-operation Lambda which prevented it from running a Fargate task for data migration.
v10.0.0
Release v10.0.0
Migration steps
- Please read the documentation on the updates to the granule files schema for our Cumulus workflow tasks and how to upgrade your deployment for compatibility.
- (Optional) Update the task-config for all workflows that use the sync-granule task to include workflowStartTime set to {$.cumulus_meta.workflow_start_time}. See here for an example.
BREAKING CHANGES
- NDCUM-624:
- Functions in @cumulus/cmrjs renamed for consistency with isCMRFilename and isCMRFile:
- isECHO10File -> isECHO10Filename
- isUMMGFile -> isUMMGFilename
- isISOFile -> isCMRISOFilename
- CUMULUS-2388:
- In order to standardize task messaging formats, please note the updated input, output and config schemas for the following Cumulus workflow tasks:
- add-missing-file-checksums
- files-to-granules
- hyrax-metadata-updates
- lzards-backup
- move-granules
- post-to-cmr
- sync-granule
- update-cmr-access-constraints
- update-granules-cmr-metadata-file-links
The primary focus of the schema updates was to standardize the format of granules, and particularly their files data. The granule files object now matches the file schema in the Cumulus database and thus also matches the files object produced by the API with use cases like applyWorkflow. This includes removal of name and filename in favor of bucket and key, and removal of certain properties such as etag and duplicate_found, which are now output as separate objects stored in meta.
- Checksum values calculated by @cumulus/checksum are now converted to string to standardize checksum formatting across the Cumulus library.
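As a rough illustration of the granule files change under CUMULUS-2388, the sketch below converts a legacy file record (identified by name and filename) into the bucket/key form. The LegacyFile and GranuleFile interfaces, the toGranuleFile helper, and the assumption that filename holds an s3://bucket/key URI are hypothetical and are not taken from the Cumulus schemas; the task schemas remain the authoritative field list.

```typescript
// Hypothetical shapes for illustration only; the real schemas define more fields.
interface LegacyFile {
  name: string;     // e.g. "granule.hdf"
  filename: string; // assumed to be a full S3 URI: s3://<bucket>/<key>
  size?: number;
}

interface GranuleFile {
  bucket: string;
  key: string;
  size?: number;
}

// Split an s3:// URI into bucket and key and drop the legacy properties.
function toGranuleFile(file: LegacyFile): GranuleFile {
  const match = /^s3:\/\/([^/]+)\/(.+)$/.exec(file.filename);
  if (match === null) {
    throw new Error(`Expected an s3:// URI but got: ${file.filename}`);
  }
  const converted: GranuleFile = { bucket: match[1], key: match[2] };
  if (file.size !== undefined) converted.size = file.size;
  return converted;
}

const legacyFile: LegacyFile = {
  name: 'granule.hdf',
  filename: 's3://my-protected-bucket/collection/granule.hdf',
  size: 1024,
};
const convertedFile = toGranuleFile(legacyFile);
// convertedFile is { bucket: 'my-protected-bucket', key: 'collection/granule.hdf', size: 1024 }
```

Custom workflow tasks that still read name or filename would need a comparable translation, or would need to be updated to read bucket and key directly.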
Notable changes
- CUMULUS-2718
- The sync-granule task has been updated to support an optional configuration parameter workflowStartTime. The output payload of sync-granule now includes a createdAt time for each granule which is set to the provided workflowStartTime or falls back to Date.now() if not provided. Workflows using sync-granule may be updated to include this parameter with the value of {$.cumulus_meta.workflow_start_time} in the task_config.
- Updated version of @cumulus/cumulus-message-adapter-js from 2.0.3 to 2.0.4 for all Cumulus workflow tasks
- CUMULUS-2783
- A bug in the ECS cluster autoscaling configuration has been resolved. ECS clusters should now correctly autoscale by adding new cluster instances according to the policy configuration.
- Async operations that are started by these endpoints will be run as ECS tasks with a launch type of Fargate, not EC2:
- POST /deadLetterArchive/recoverCumulusMessages
- POST /elasticsearch/index-from-database
- POST /granules/bulk
- POST /granules/bulkDelete
- POST /granules/bulkReingest
- POST /migrationCounts
- POST /reconciliationReports
- POST /replays
- POST /replays/sqs
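The createdAt behaviour described for sync-granule under CUMULUS-2718 amounts to a simple fallback. The sketch below is not the task's actual code: resolveCreatedAt and SyncGranuleConfig are hypothetical names, and it assumes workflowStartTime is an epoch-millisecond number like the value returned by Date.now().

```typescript
// Hypothetical config shape; only the field discussed above is modelled.
interface SyncGranuleConfig {
  workflowStartTime?: number; // epoch milliseconds (assumed)
}

// Use the provided workflow start time, otherwise fall back to the current time.
// The clock is injectable so the fallback can be demonstrated deterministically.
function resolveCreatedAt(
  config: SyncGranuleConfig,
  now: () => number = Date.now
): number {
  return config.workflowStartTime ?? now();
}

// workflowStartTime supplied: it is used as createdAt.
const createdAtFromWorkflow = resolveCreatedAt({ workflowStartTime: 1640995200000 });

// workflowStartTime omitted: the clock value is used instead.
const createdAtFromClock = resolveCreatedAt({}, () => 42);
```

Passing the workflow start time through means every granule produced by one execution can share a createdAt value instead of each recording the moment its own file sync finished.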
Added
- Upgraded version of dependencies on knex package from 0.95.11 to 0.95.15
- Added Terraform data sources to example/cumulus-tf module to retrieve default VPC and subnets in NGAP accounts
- Added vpc_tag_name variable which defines the tags used to look up a VPC. Defaults to VPC tag name used in NGAP accounts
- Added subnets_tag_name variable which defines the tags used to look up VPC subnets. Defaults to a subnet tag name used in NGAP accounts
- Added Terraform data sources to example/data-persistence-tf module to retrieve default VPC and subnets in NGAP accounts
- Added vpc_tag_name variable which defines the tags used to look up a VPC. Defaults to VPC tag name used in NGAP accounts
- Added subnets_tag_name variable which defines the tags used to look up VPC subnets. Defaults to a subnet tag name used in NGAP accounts
- Added Terraform data sources to example/rds-cluster-tf module to retrieve default VPC and subnets in NGAP accounts
- Added vpc_tag_name variable which defines the tags used to look up a VPC. Defaults to VPC tag name used in NGAP accounts
- Added subnets_tag_name variable which defines the tags used to look up VPC subnets. Defaults to the tag names used for subnets in NGAP accounts
- CUMULUS-2299
- Added support for SHA checksum types with hyphens (e.g. SHA-256 vs SHA256) to tasks that calculate checksums.
- CUMULUS-2439
- Added CMR search client setting to the CreateReconciliationReport lambda function.
- Added cmr_search_client_config tfvars to the archive and cumulus terraform modules.
- Updated CreateReconciliationReport lambda to search CMR collections with CMRSearchConceptQueue.
- CUMULUS-2441
- Added support for 'PROD' CMR environment.
- CUMULUS-2456
- Updated api lambdas to query ORCA Private API
- Updated example/cumulus-tf/orca.tf to the ORCA release v4.0.0-Beta3
- CUMULUS-2638
- Adds documentation to clarify bucket config object use.
- CUMULUS-2684
- Added optional collection level parameter s3MultipartChunksizeMb to collection's meta field
- Updated move-granules task to take in an optional config parameter s3MultipartChunksizeMb
- CUMULUS-2747
- Updated data management type doc to include additional fields for provider configurations
- CUMULUS-2773
- Added a document to the workflow-tasks docs describing deployment, configuration and usage of the LZARDS backup task.
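The hyphenated checksum support added under CUMULUS-2299 can be pictured as a normalization step applied before the algorithm name is used. The sketch below is an assumption about one way to do that; normalizeShaType is a hypothetical helper, not a function from @cumulus/checksum.

```typescript
// Map "SHA-256", "sha-256" and "SHA256" to the same canonical spelling.
// Non-SHA types (for example "md5" or "cksum") are only trimmed and lower-cased.
function normalizeShaType(checksumType: string): string {
  const lowered = checksumType.trim().toLowerCase();
  const match = /^sha-?(\d+)$/.exec(lowered);
  return match === null ? lowered : `sha${match[1]}`;
}

const hyphenatedType = normalizeShaType('SHA-256'); // "sha256"
const plainType = normalizeShaType('SHA256');       // "sha256"
const otherType = normalizeShaType('md5');          // "md5"
```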
Changed
- Made vpc_id variable optional for example/cumulus-tf module
- Made vpc_id and subnet_ids variables optional for example/data-persistence-tf module
- Made vpc_id and subnets variables optional for example/rds-cluster-tf module
- Changes audit script to handle integration test failure when USE_CACHED_BOOTSTRAP is disabled.
- CUMULUS-1823
- Updates to Cumulus rule/provider schemas to improve field titles and descriptions.
- CUMULUS-2638
- Transparent to users, remove typescript type BucketType.
- CUMULUS-2718
- Updated config for SyncGranules to support optional workflowStartTime
- Updated SyncGranules to provide createdAt on output based on workflowStartTime if provided, falling back to Date.now() if not provided.
- Updated task_config of SyncGranule in example workflows
- CUMULUS-2735
- Updated reconciliation reports to write formatted JSON to S3 to improve readability for large reports
- Updated TEA version from 102 to 121 to address TEA deployment issue with the max size of a policy role being exceeded
- CUMULUS-2743
- Updated bamboo Dockerfile to upgrade pip as part of the image creation process
- CUMULUS-2744
- GET executions/status returns associated granules for executions retrieved from the Step Function API
- CUMULUS-2751
- Upgraded all Cumulus (node.js) workflow tasks to use @cumulus/cumulus-message-adapter-js version 2.0.3, which includes an update to cma-js to better expose CMA stderr stream output on lambda timeouts as well as minor logging enhancements.
- CUMULUS-2752
- Add new mappings for execution records to prevent dynamic field expansion from exceeding Elasticsearch field limits
- Nested objects under finalPayload.* will not dynamically add new fields to mapping
- Nested objects under originalPayload.* will not dynamically add new fields to mapping
- Nested keys under tasks will not dynamically add new fields to mapping
- CUMULUS-2753
- Updated example/cumulus-tf/orca.tf to the latest ORCA release v4.0.0-Beta2 which is compatible with granule.files file schema
- Updated /orca/recovery to call new lambdas request_status_for_granule and request_status_for_job.
- Updated orca integration test
- PR #2569
- Fixed TypeError thrown by @cumulus/cmrjs/cmr-utils.getGranuleTemporalInfo when a granule's associated UMM-G JSON metadata file does not contain a ProviderDates element that has a Type of either "Update" or "Insert". If neither is present, the granule's last update date falls back to the "Create" type provider date, or undefined, if none is present.
- CUMULUS-2775
- Changed @cumulus/api-client/invokeApi() to accept a single accepted status code or an array of accepted status codes via expectedStatusCodes
- PR #2611
- Changed @cumulus/launchpad-auth/LaunchpadToken.requestToken and validateToken to use the HTTPS request option https.pfx instead of the deprecated pfx option for providing the certificate.
- CUMULUS-2836
- Updates cmr-utils/getGranuleTemporalInfo to search for a SingleDateTime element, when beginningDateTime value is not found in the metadata file. The granule's temporal information is returned so that both beginningDateTime and endingDateTime are set to the discovered singleDateTimeValue.
- CUMULUS-2756
- Updated _writeGranule() in write-granules.js to catch failed granule writes due to schema validation, log the failure and then attempt to set the status of the granule to failed if it already exists to prevent a failure from allowing the granule to get "stuck" in a non-failed status.
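The fallback order that PR #2569 describes for a granule's last update date can be sketched as follows. This is an illustration rather than the cmr-utils implementation: ProviderDate and pickLastUpdateDate are hypothetical names, and preferring "Update" over "Insert" when both are present is an assumption.

```typescript
// Minimal model of a UMM-G ProviderDates entry.
interface ProviderDate {
  Type: string; // e.g. "Create", "Insert", "Update"
  Date: string; // ISO 8601 timestamp
}

// Prefer "Update", then "Insert", then "Create". Return undefined when none
// match rather than dereferencing a missing entry, which is the kind of
// access that raises a TypeError.
function pickLastUpdateDate(providerDates: ProviderDate[] = []): string | undefined {
  for (const wantedType of ['Update', 'Insert', 'Create']) {
    const found = providerDates.find((entry) => entry.Type === wantedType);
    if (found !== undefined) return found.Date;
  }
  return undefined;
}

// Only a "Create" date is present, so it is used as the fallback.
const createOnlyDate = pickLastUpdateDate([
  { Type: 'Create', Date: '2020-01-01T00:00:00Z' },
]);

// No provider dates at all: the result is undefined and nothing throws.
const missingDate = pickLastUpdateDate([]);
```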
Fixed
- CUMULUS-2775
- Updated @cumulus/api-client to not log an error for 201 response from updateGranule
- CUMULUS-2783
- Added missing lower bound on scale out policy for ECS cluster to ensure that the cluster will autoscale correctly.
- CUMULUS-2835
- Updated hyrax-metadata-updates t...
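The expectedStatusCodes option described under CUMULUS-2775 accepts either one status code or a list of them. A minimal sketch of that check is shown below; isExpectedStatus is a hypothetical helper and does not reproduce the invokeApi() signature.

```typescript
// Accept a single status code or an array of codes.
function isExpectedStatus(
  statusCode: number,
  expectedStatusCodes: number | number[]
): boolean {
  const accepted = Array.isArray(expectedStatusCodes)
    ? expectedStatusCodes
    : [expectedStatusCodes];
  return accepted.some((code) => code === statusCode);
}

// A 201 response is acceptable when 201 is listed among the expected codes.
const createdIsExpected = isExpectedStatus(201, [200, 201]); // true

// A single expected code is also allowed.
const notFoundIsExpected = isExpectedStatus(404, 200); // false
```

Under this reading, a caller that treats 201 as an expected status has no reason to log it as an error.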
v9.7.1
Release 9.7.1
This release is a bugfix patch release and supersedes release 9.2.3
Please note changes in 9.2.3 may not yet be released in future versions, as this is a backport and patch release on the 9.2.x series of releases. Updates that are included in the future will have a corresponding CHANGELOG entry in future releases.
Fixed
- CUMULUS-2751
- Update all tasks to use cumulus-message-adapter-js version 2.0.4
v10.0.0-beta.0
Release 10.0.0-beta.0
Migration steps
- Please read the documentation on the updates to the granule files schema for our Cumulus workflow tasks and how to upgrade your deployment for compatibility.
- (Optional) Update the task-config for all workflows that use the sync-granule task to include workflowStartTime set to {$.cumulus_meta.workflow_start_time}. See here for an example.
BREAKING CHANGES
- NDCUM-624:
- Functions in @cumulus/cmrjs renamed for consistency with isCMRFilename and isCMRFile:
- isECHO10File -> isECHO10Filename
- isUMMGFile -> isUMMGFilename
- isISOFile -> isCMRISOFilename
- CUMULUS-2388:
- In order to standardize task messaging formats, please note the updated input, output and config schemas for the following Cumulus workflow tasks:
- add-missing-file-checksums
- files-to-granules
- hyrax-metadata-updates
- lzards-backup
- move-granules
- post-to-cmr
- sync-granule
- update-cmr-access-constraints
- update-granules-cmr-metadata-file-links
The primary focus of the schema updates was to standardize the format of granules, and particularly their files data. The granule files object now matches the file schema in the Cumulus database and thus also matches the files object produced by the API with use cases like applyWorkflow. This includes removal of name and filename in favor of bucket and key, and removal of certain properties such as etag and duplicate_found, which are now output as separate objects stored in meta.
- Checksum values calculated by @cumulus/checksum are now converted to string to standardize checksum formatting across the Cumulus library.
Notable changes
- CUMULUS-2718
- The sync-granule task has been updated to support an optional configuration parameter workflowStartTime. The output payload of sync-granule now includes a createdAt time for each granule which is set to the provided workflowStartTime or falls back to Date.now() if not provided. Workflows using sync-granule may be updated to include this parameter with the value of {$.cumulus_meta.workflow_start_time} in the task_config.
Added
- CUMULUS-2439
- Added CMR search client setting to the CreateReconciliationReport lambda function.
- Added cmr_search_client_config tfvars to the archive and cumulus terraform modules.
- Updated CreateReconciliationReport lambda to search CMR collections with CMRSearchConceptQueue.
- CUMULUS-2638
- Adds documentation to clarify bucket config object use.
Changed
- CUMULUS-2638
- Transparent to users, remove typescript type BucketType.
- CUMULUS-2718
- Updated config for SyncGranules to support optional workflowStartTime
- Updated SyncGranules to provide createdAt on output based on workflowStartTime if provided, falling back to Date.now() if not provided.
- Updated task_config of SyncGranule in example workflows
- CUMULUS-2744
- GET executions/status returns associated granules for executions retrieved from the Step Function API
- CUMULUS-2751
- Upgraded all Cumulus (node.js) workflow tasks to use @cumulus/cumulus-message-adapter-js version 2.0.3, which includes an update to cma-js to better expose CMA stderr stream output on lambda timeouts as well as minor logging enhancements.
- CUMULUS-2752
- Add new mappings for execution records to prevent dynamic field expansion from exceeding Elasticsearch field limits
- Nested objects under finalPayload.* will not dynamically add new fields to mapping
- Nested objects under originalPayload.* will not dynamically add new fields to mapping
- Nested keys under tasks will not dynamically add new fields to mapping
- PR #2569
- Fixed TypeError thrown by @cumulus/cmrjs/cmr-utils.getGranuleTemporalInfo when a granule's associated UMM-G JSON metadata file does not contain a ProviderDates element that has a Type of either "Update" or "Insert". If neither is present, the granule's last update date falls back to the "Create" type provider date, or undefined, if none is present.
v9.2.4
Release v9.2.4
This release is a bugfix patch release and supersedes release 9.2.3
Please note changes in 9.2.3 may not yet be released in future versions, as this is a backport and patch release on the 9.2.x series of releases. Updates that are included in the future will have a corresponding CHANGELOG entry in future releases.
Fixed
- CUMULUS-2751
- Update all tasks to use cumulus-message-adapter-js version 2.0.3
v9.9.0
Release v9.9.0
Added
- PR #2535
- NSIDC and other Cumulus users wanted the 'url_path' date extraction utilities to return formatted dates. Added 'dateFormat' function as an option for extracting and formatting the entire date. See docs/workflow/workflow-configuration-how-to.md for more information.
- PR #2548
- Updated webpack configuration for html-loader v2
- CUMULUS-2640
- Added Elasticsearch client scroll setting to the CreateReconciliationReport lambda function.
- Added elasticsearch_client_config tfvars to the archive and cumulus terraform modules.
Changed
- Upgraded all Cumulus workflow tasks to use @cumulus/cumulus-message-adapter-js version 2.0.1
- CUMULUS-2725
- Updated providers endpoint to return encrypted password
- Updated providers model to try decrypting credentials before encryption to allow for better handling of updating providers
- CUMULUS-2734
- Updated @cumulus/api/launchpadSaml.launchpadPublicCertificate to correctly retrieve certificate from launchpad IdP metadata with and without namespace prefix.
v9.8.0
Notable changes
- Published new tag 36 of cumuluss/async-operation to Docker Hub for compatibility with upgrades to knex package and to address security vulnerabilities.
All Changes
Added
- Added @cumulus/db/createRejectableTransaction() to handle creating a Knex transaction that will throw an error if the transaction rolls back. As of Knex 0.95+, promise rejection on transaction rollback is no longer the default behavior.
- CUMULUS-2639
- Increases logging on reconciliation reports.
- CUMULUS-2670
- Updated lambda_timeouts string map variable for cumulus module to accept an update_granules_cmr_metadata_file_links_task_timeout property
Changed
- Updated knex version from 0.23.11 to 0.95.11 to address security vulnerabilities
- Updated default version of async operations Docker image to cumuluss/async-operation:36
- CUMULUS-2590
- Granule applyWorkflow, Reingest actions and Bulk operation now update granule status to queued when scheduling the granule.
- CUMULUS-2643
- Relocates system file buckets.json out of the s3://internal-bucket/workflows directory into s3://internal-bucket/buckets.
v9.7.0
Notable Changes
- CUMULUS-2583
- The queue-granules task now updates granule status to queued when a granule is queued. In order to prevent issues with the private API endpoint and Lambda API request and concurrency limits, this functionality runs with limited concurrency, which may increase the task's overall runtime when large numbers of granules are being queued. If you are facing Lambda timeout errors with this task, we recommend converting your queue-granules task to an ECS activity. This concurrency is configurable via the task config's concurrency value.
- CUMULUS-2676
- The discover-granules task has been updated to limit concurrency on checks to identify and skip already ingested granules in order to prevent issues with the private API endpoint and Lambda API request and concurrency limits. This may increase the task's overall runtime when large numbers of granules are discovered. If you are facing Lambda timeout errors with this task, we recommend converting your discover-granules task to an ECS activity. This concurrency is configurable via the task config's concurrency value.
- Updated memory of <prefix>-sfEventSqsToDbRecords Lambda to 1024MB
- CUMULUS-2000
- The Queue Granules task can be configured to batch multiple granules into workflow payloads
All Changes
Added
- CUMULUS-2000
- Updated @cumulus/queue-granules to respect a new config parameter: preferredQueueBatchSize. Queue-granules will respect this batch size as best as it can to batch granules into workflow payloads. As workflows generally rely on information such as collection and provider expected to be shared across all granules in a workflow, queue-granules will break batches up by collection, as well as provider if there is a provider field on the granule. This may result in batches that are smaller than the preferred size, but never larger ones. The default value is 1, which preserves the current behavior of queueing 1 granule per workflow.
- CUMULUS-2630
- Adds a new workflow DiscoverGranulesToThrottledQueue that discovers and writes granules to a throttled background queue. This allows discovery and ingest of larger numbers of granules without running into limits with lambda concurrency.
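The batching rules described for preferredQueueBatchSize under CUMULUS-2000 can be sketched roughly as below. This is not the queue-granules code: QueuedGranule, batchGranules, and the single collectionId field used to identify a collection are assumptions made for the example.

```typescript
// Hypothetical granule shape; only the fields needed for grouping are modelled.
interface QueuedGranule {
  granuleId: string;
  collectionId: string; // assumed single-field collection identifier
  provider?: string;
}

// Group granules by collection (and provider, when present), then cut each
// group into chunks of at most preferredQueueBatchSize granules.
function batchGranules(
  granules: QueuedGranule[],
  preferredQueueBatchSize = 1
): QueuedGranule[][] {
  if (!Number.isInteger(preferredQueueBatchSize) || preferredQueueBatchSize < 1) {
    throw new RangeError('preferredQueueBatchSize must be a positive integer');
  }
  const groups = new Map<string, QueuedGranule[]>();
  for (const granule of granules) {
    const groupKey = JSON.stringify([granule.collectionId, granule.provider ?? null]);
    const group = groups.get(groupKey) ?? [];
    group.push(granule);
    groups.set(groupKey, group);
  }
  const batches: QueuedGranule[][] = [];
  groups.forEach((group) => {
    for (let start = 0; start < group.length; start += preferredQueueBatchSize) {
      batches.push(group.slice(start, start + preferredQueueBatchSize));
    }
  });
  return batches;
}

const exampleBatches = batchGranules(
  [
    { granuleId: 'g1', collectionId: 'A___001' },
    { granuleId: 'g2', collectionId: 'A___001' },
    { granuleId: 'g3', collectionId: 'A___001' },
    { granuleId: 'g4', collectionId: 'B___001' },
  ],
  2
);
// Three batches: [g1, g2], [g3], [g4]. None exceeds the preferred size.
```

Because groups are never merged across collections or providers, a batch can come out smaller than the preferred size but never larger, which matches the behaviour described above.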
Changed
- CUMULUS-2695
- Updates the example/cumulus-tf deployment to change archive_api_reserved_concurrency from 8 to 5 to use fewer reserved lambda functions. If you see throttling errors on the <stack>-apiEndpoints you should increase this value.
- Updates cumulus-tf/cumulus/variables.tf to change archive_api_reserved_concurrency from 8 to 15 to prevent throttling on the dashboard for default deployments.
v9.6.0
Release v9.6.0
Added
- CUMULUS-2576
- Adds PUT /granules API endpoint to update a granule
- Adds helper updateGranule to @cumulus/api-client/granules
- CUMULUS-2606
- Adds POST /granules/{granuleId}/executions API endpoint to associate an execution with a granule
- Adds helper associateExecutionWithGranule to @cumulus/api-client/granules
- CUMULUS-2583
- Adds queued as option for granule's status field
Changed
- Moved ssh2 package from @cumulus/common to @cumulus/sftp-client and upgraded package from ^0.8.7 to ^1.0.0 to address security vulnerability issue in previous version.
- CUMULUS-2583
- QueueGranules task now updates granule status to queued once it is added to the queue.
Fixed
- Added missing permission for <prefix>_ecs_cluster_instance_role IAM role (used when running ECS services/tasks) to allow kms:Decrypt on the KMS key used to encrypt provider credentials. Adding this permission fixes the sync-granule task when run as an ECS activity in a Step Function, which previously failed trying to decrypt credentials for providers.
v9.5.0
BREAKING CHANGES
- Removed logs record type from mappings from Elasticsearch. This change should not have any adverse impact on existing deployments, even those which still contain logs records, but technically it is a breaking change to the Elasticsearch mappings.
- Changed @cumulus/api-client/asyncOperations.getAsyncOperation to return parsed JSON body of response and not the raw API endpoint response
Added
- CUMULUS-2670
- Updated core cumulus module to take lambda_timeouts string map variable that allows timeouts of ingest tasks to be configurable. Allowed properties for the mapping include:
- discover_granules_task_timeout
- discover_pdrs_task_timeout
- hyrax_metadata_update_tasks_timeout
- lzards_backup_task_timeout
- move_granules_task_timeout
- parse_pdr_task_timeout
- pdr_status_check_task_timeout
- post_to_cmr_task_timeout
- queue_granules_task_timeout
- queue_pdrs_task_timeout
- queue_workflow_task_timeout
- sync_granule_task_timeout
- CUMULUS-2575
- Adds POST /granules API endpoint to create a granule
- Adds helper createGranule to @cumulus/api-client
- CUMULUS-2577
- Adds POST /executions endpoint to create an execution
- CUMULUS-2578
- Adds PUT /executions endpoint to update an execution
- CUMULUS-2592
- Adds logging when messages fail to be added to queue
- CUMULUS-2644
- Pulled delete method for granules-executions.ts implemented as part of CUMULUS-2306 from the RDS-Phase-2 feature branch in support of CUMULUS-2644.
- Pulled erasePostgresTables method in serve.js implemented as part of CUMULUS-2644, and CUMULUS-2306 from the RDS-Phase-2 feature branch in support of CUMULUS-2644
- Added resetPostgresDb method to support resetting between integration test suite runs
Changed
- Updated processDeadLetterArchive Lambda to return an object where processingSucceededKeys is an array of the S3 keys for successfully processed objects and processingFailedKeys is an array of S3 keys for objects that could not be processed
- Updated async operations to handle writing records to the databases when output of the operation is undefined
- CUMULUS-2644
- Moved migration directory from the db-migration-lambda to the db package and updated unit test references to migrationDir to be pulled from @cumulus/db
- Updated @cumulus/api/bin/serveUtils to write records to PostgreSQL tables
- CUMULUS-2575
- Updates model/granule to allow a granule created from API to not require an execution to be associated with it. This is a backwards compatible change that will not affect granules created in the normal way.
- Updates @cumulus/db/src/model/granules functions get and exists to enforce parameter checking so that requests include either (granule_id and collection_cumulus_id) or (cumulus_id) to prevent incorrect results.
- @cumulus/message/src/Collections.deconstructCollectionId has been modified to throw a descriptive error if the input collectionId is undefined rather than TypeError: Cannot read property 'split' of undefined. This function has also been updated to throw descriptive errors if an incorrectly formatted collectionId is input.
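The stricter input checking described for deconstructCollectionId can be illustrated with the sketch below. It is not the @cumulus/message implementation: it assumes collection IDs have the form <name>___<version> (three underscores), and the function name and error messages are hypothetical.

```typescript
interface CollectionNameVersion {
  name: string;
  version: string;
}

// Validate before splitting so callers get a descriptive error rather than
// "Cannot read property 'split' of undefined".
function deconstructCollectionIdSketch(collectionId: unknown): CollectionNameVersion {
  if (typeof collectionId !== 'string') {
    throw new Error(`Collection ID must be a string, received: ${String(collectionId)}`);
  }
  const match = /^(.+)___(.+)$/.exec(collectionId);
  if (match === null) {
    throw new Error(`"${collectionId}" is not in the expected <name>___<version> form`);
  }
  return { name: match[1], version: match[2] };
}

const parsedCollection = deconstructCollectionIdSketch('MOD09GQ___006');
// parsedCollection is { name: 'MOD09GQ', version: '006' }
```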