Feature/issue 211 - Query track ingest table for granules with "to_ingest" status (#227)

* /version 1.3.0a0 * Update build.yml * /version 1.3.0a1 * /version 1.3.0a2 * Feature/issue 175 - Update docs to point to OPS (#176) * changelog * update examples, remove load_data readme, info moved to wiki * Dependency update to fix snyk scan * issues/101: Support for HTTP Accept header (#172) * Reorganize timeseries code to prep for Accept header * Enable Accept header to return response of specific content-type * Fix whitespace and string continuation * Make error handling consistent and add an additional test where a reach can't be found * Update changelog with issue for unreleased version * Add 415 status code to API definition * Few minor cleanup items * Few minor cleanup items * Update to [email protected] * Fix dependencies --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.3.0a3 * issues/102: Support compression of API response (#173) * Enable payload compression * Update changelog with issue --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.3.0a4 * Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177) * Reorganize timeseries code to prep for Accept header * Enable Accept header to return response of specific content-type * Fix whitespace and string continuation * Make error handling consistent and add an additional test where a reach can't be found * Update changelog with issue for unreleased version * Add 415 status code to API definition * Few minor cleanup items * Few minor cleanup items * Update to [email protected] * Fix dependencies * Update required query parameters based on current API functionality * Enable return of 'compact' GeoJSON response * Fix linting and add test data * Update documentation for API accept headers and compact GeoJSON response * Fix references to incorrect Accept header examples --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.3.0a5 * Feature/issue 183 (#185) * Provide introduction to timeseries 
endpoint * Remove _units in fields list * Fix typo * Update examples with Accept headers and compact query parameter * Add issue to changelog * Fix typo in timeseries documentation * Update pymysql * Update pymysql * Provide clarity on accept headers and request parameter fields * /version 1.3.0a6 * Feature/issue 186 Implement API keys (#188) * API Gateway Lambda authorizer to facilitate API keys and usage plans * Unit tests to test Lambda authorizer * Fix terraform file formatting * API Gateway Lambda Authorizer - Lambda function - API Keys and Authorizer definition in OpenAPI spec - API gateway API keys - API gateway usage plans - SSM parameters for API keys * Fix trailing whitespace * Set default region environment variable * Fix SNYK vulnerabilities * Add issue to changelog * Implement custom trusted partner header x-hydrocron-key * Update cryptography for SNYK vulnerability * Update documentation to include API key usage * Update quota and throttle settings for API Gateway * Update API keys documentation to indicate to be implemented * Move API key lookup to Lambda INIT * Remove API key authentication and update API key to x-hydrocron-key * /version 1.3.0a7 * Update changelog for 1.3.0 release * /version 1.4.0a0 * Feature/issue 198 (#207) * Update pylint to deal with errors and fix collection reference * Initial CMR and Hydrocron queries - Includes placeholders for other operations needed to track granule ingest. - GranuleUR query for Hydrocron tables. 
* Add and set up vcrpy for testing CMR API query * Test track ingest operations - Test CMR and hydrocron queries - Test granuleUR query - Update database to include granuleUR GSI * Update to use track_ingest naming consistently * Initial Lambda function and IAM role definition * Replace deprecated path function with as_file * Add SSM read IAM permissions * Add DynamoDB read permissions * Update track ingest lambda memory * Remove duplicate IAM permissions * Add in permissions to query index * Update changelog * Update changelog description * Use python_cmr for CMR API queries * /version 1.4.0a1 * Add doi to documentation pages (#216) * Update intro.md with DOI * Update overview.md with DOI * /version 1.4.0a2 * issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209) * add code to handle prior lakes shapefiles, add test prior lake data * update terraform to add prior lake table * fix tests, change to smaller test data file, changelog * linting * reconfigure main load_data method to make more readable and pass linting * lint * lint * fix string casting to lower storage req & update test responses to handle different rounding pattern in coords * update load benchmarking function for linting and add unit test * try parent collection for lakes * update version parsing for parent collection * fix case error * fix lake id reference * add logging to troubleshoot too large features * add item size logging and remove error raise for batch write * clean up logging statements & move numeric_columns assignment * update batch logging statement * Rename constant * Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/ * Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/ * fix code coverage calculation --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.4.0a3 * Feature/issue 201 Create a table for tracking granule ingest status (#214) * Define track ingest database and IAM permissions * Update changelog with 
issue * Modify table structure to support sparse status index * Updated to only apply PITR in ops --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.4.0a4 * Feature/issue 210 - Load large geometry polygons (#219) * add functions to handle null geometries and convert polygons to points * update doi in docs * fix fill null geometries * fix tests and update changelog * /version 1.4.0a5 * Feature/issue 222 - Add granule info to track ingest table on load (#223) * adjust lambdas to populate track ingest table on granule load * changelog * remove test cnm * lint * change error caught when handling checksum * update lambda role permissions to write to track ingest table * fix typo on lake table terraform * set default fill values for checksum and rev date in track status * fix checksum handling in bulk load data * lint * add logging to debug * /version 1.4.0a6 * Add SSM parameter read for last run time * Feature/issue-225: Create one track ingest table per feature type (#226) * add track ingest tables for each feature type and adjust load data to populate * changelog * /version 1.4.0a7 * Feature/issue 196 Add new feature type to query the API for lake data (#224) * Initial API queries for lake data * Unit tests for lake data * Updates after center point calculations - Removed temp code to calculate a point in API - Implemented unit test to test lake data retrieval - Updated fixtures to load in lake data for testing * Add read lake table permissions to lambda timeseries and track ingest roles * Update documenation to include lake data * Updated documentation to include info on lake centerpoints --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.4.0a8 * Feature/issue 205 - Add Confluence API key (#221) * Fix possible variable references before value is assigned * Define Confluence API key and trusted partner plan limits * Define a list of trusted partner keys and store under single parameter * Define API keys as encrypted 
envrionment variables for Lambda authorizer * Update authorizer and connection class to use KMS to retrieve API keys * Hack to force lambda deployment when ssm value changes (#218) * Add replace_triggered_by to hydrocron_lambda_authorizer * Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function. * Remove unnecessary hash function * Update to SSM parameter API key storage and null_resource enviroment variable * Update Terraform and AWS provider * Update API key documentation * Set source_code_hash to force deployment of new image * Downgrade AWS provider to 4.0 to remove inline policy errors * Update docs/timeseries.md --------- Co-authored-by: Frank Greguska <[email protected]> * /version 1.4.0a9 * /version 1.4.0a10 * changelog for 1.4.0 release * /version 1.5.0a0 * Initial track ingest table query * Fix linting and code style * Implement feature count operations * Enable S3 permissions and set environment variable for track lambda * Fix trailing white spaces and code format * Update docstrings for class methods * Implement run time storage in SSM * Query track table unit tests * Update CHANGELOG with issue * Update SSM run time parameter * Fix trailing whitespace * Fix reference to IAM policy * Enable specification of temporal range to search revision date by * Fix SSM put parameter policy * Update IAM permissions for reading track ingest * Enable full temporal search on CMR granules * Add capability to download shapefile granule to count features * Update granule UR to include .zip * Count features via Hydrocron table query * Remove unnecessary s3 permissions * Remove whitespace from blank line * Update cryptography to 43.0.1 * update dependencies * upgrade geopandas * update dependencies --------- Co-authored-by: nikki-t <[email protected]> Co-authored-by: Frank Greguska <[email protected]> Co-authored-by: frankinspace <[email protected]> 
Co-authored-by: Victoria McDonald <[email protected]> Co-authored-by: Cassie Nickles <[email protected]> Co-authored-by: cassienickles <[email protected]> Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com> Co-authored-by: Victoria McDonald <[email protected]> Co-authored-by: torimcd <[email protected]>
podaac · Oct 3, 2024 · 2ae0817
1 parent 5877298
commit 2ae0817
Showing 12 changed files with 696 additions and 392 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+    - Issue 211 - Query track ingest table for granules with "to_ingest" status
 ### Changed
 ### Deprecated
 ### Removed

diff --git a/docs/examples.md b/docs/examples.md
@@ -266,193 +266,6 @@ Will return GeoJSON:
             }
         ]
         }
-}
-```
-
-** geometry simplified for example
-
-## Get time series GeoJSON for river node
-
-Search for a single river node by ID.
-
-[https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature=Node&feature_id=12228200110861&start_time=2024-01-25T00:00:00Z&end_time=2024-03-30T00:00:00Z&output=geojson&fields=reach_id,node_id,time_str,wse](https://soto.podaac.earthdatacloud.nasa.gov/hydrocron/v1/timeseries?feature=Node&feature_id=12228200110861&start_time=2024-01-25T00:00:00Z&end_time=2024-03-30T00:00:00Z&output=geojson&fields=reach_id,node_id,time_str,wse)
-
-Will return GeoJSON:
-
-```json
-{
-"status": "200 OK",
-"time": 604.705,
-"hits": 9,
-"results": {
-    "csv": "",
-    "geojson": {
-        "type": "FeatureCollection",
-        "features": [
-            {
-            "id": "0",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-01-30T21:19:19Z",
-                "wse": "677.9232",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "1",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-02-06T08:37:09Z",
-                "wse": "673.46918",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "2",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "no_data",
-                "wse": "-999999999999.0",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "3",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-02-20T18:04:24Z",
-                "wse": "673.69799",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "4",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-02-27T05:22:15Z",
-                "wse": "674.66235",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "5",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "no_data",
-                "wse": "-999999999999.0",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "6",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-03-12T14:49:26Z",
-                "wse": "673.47788",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "7",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "2024-03-19T02:07:17Z",
-                "wse": "675.23219",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            },
-            {
-            "id": "8",
-            "type": "Feature",
-            "properties": {
-                "reach_id": "12228200111",
-                "node_id": "12228200110861",
-                "time_str": "no_data",
-                "wse": "-999999999999.0",
-                "wse_units": "m"
-            },
-            "geometry": {
-                "type": "Point",
-                "coordinates": [
-                35.149314,
-                -10.256285
-                ]
-            }
-            }
-        ]
-        }
     }
 }
 ```

diff --git a/hydrocron/api/data_access/db.py b/hydrocron/api/data_access/db.py
@@ -76,6 +76,44 @@ def get_prior_lake_series_by_feature_id(self, feature_id, start_time, end_time):
         )
         return items
 
+    def get_series_granule_ur(self, table_name, feature_name, granule_ur):
+        """
+        @param table_name: str - Hydrocron table to query
+        @param feature_name: str - Attribute to project for each item
+        @param granule_ur: str - Granule UR
+        @return: list of items
+        """
+
+        hydrocron_table = self._dynamo_instance.Table(table_name)
+        hydrocron_table.load()
+
+        items = hydrocron_table.query(
+            ProjectionExpression=feature_name,
+            IndexName="GranuleURIndex",
+            KeyConditionExpression=(
+                Key("granuleUR").eq(granule_ur)
+            )
+        )
+        last_key_evaluated = ""
+        if "LastEvaluatedKey" in items.keys():
+            last_key_evaluated = items["LastEvaluatedKey"]
+
+        while last_key_evaluated:
+            next_items = hydrocron_table.query(
+                ExclusiveStartKey=last_key_evaluated,
+                ProjectionExpression=feature_name,
+                IndexName="GranuleURIndex",
+                KeyConditionExpression=(
+                    Key("granuleUR").eq(granule_ur)
+                )
+            )
+            items["Items"].extend(next_items["Items"])
+            last_key_evaluated = ""
+            if "LastEvaluatedKey" in next_items.keys():
+                last_key_evaluated = next_items["LastEvaluatedKey"]
+
+        return items["Items"]
+
     def get_granule_ur(self, table_name, granule_ur):
         """
 
@@ -96,3 +134,32 @@ def get_granule_ur(self, table_name, granule_ur):
             )
         )
         return items
+
+    def get_status(self, table_name, status):
+        """
+        @param table_name: str - Hydrocron table to query
+        @param status: str - Status to query for
+        @return: list of items
+        """
+
+        hydrocron_table = self._dynamo_instance.Table(table_name)
+        items = hydrocron_table.query(
+            IndexName="statusIndex",
+            KeyConditionExpression=(Key("status").eq(status))
+        )
+        last_key_evaluated = ""
+        if "LastEvaluatedKey" in items.keys():
+            last_key_evaluated = items["LastEvaluatedKey"]
+
+        while last_key_evaluated:
+            next_items = hydrocron_table.query(
+                ExclusiveStartKey=last_key_evaluated,
+                IndexName="statusIndex",
+                KeyConditionExpression=(Key("status").eq(status))
+            )
+            items["Items"].extend(next_items["Items"])
+            last_key_evaluated = ""
+            if "LastEvaluatedKey" in next_items.keys():
+                last_key_evaluated = next_items["LastEvaluatedKey"]
+
+        return items["Items"]
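
Both `get_series_granule_ur` and `get_status` follow `LastEvaluatedKey` with the same loop until DynamoDB stops returning one. As a minimal sketch, and not part of this commit, that loop could be factored into a single helper. The names `query_all_pages` and `FakeTable` are hypothetical, and `FakeTable` is only a stand-in for a boto3 `Table` so the example runs without AWS access.

```python
def query_all_pages(table, **query_kwargs):
    """Call table.query until the response carries no LastEvaluatedKey.

    `table` is assumed to expose a boto3-style query(**kwargs) method that
    returns a dict with "Items" and, when more pages exist, "LastEvaluatedKey".
    """
    items = []
    start_key = None
    while True:
        kwargs = dict(query_kwargs)
        if start_key:
            # Resume from where the previous page stopped.
            kwargs["ExclusiveStartKey"] = start_key
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        start_key = response.get("LastEvaluatedKey")
        if not start_key:
            return items


class FakeTable:
    """Hypothetical stand-in for a boto3 Table that serves canned pages."""

    def __init__(self, pages):
        self._pages = pages
        self.calls = []

    def query(self, **kwargs):
        self.calls.append(kwargs)
        index = kwargs.get("ExclusiveStartKey", {"page": 0})["page"]
        response = {"Items": self._pages[index]}
        if index + 1 < len(self._pages):
            response["LastEvaluatedKey"] = {"page": index + 1}
        return response


# Three pages, the last one empty, as can happen with a real paginated query.
table = FakeTable([[{"granuleUR": "a"}], [{"granuleUR": "b"}], []])
all_items = query_all_pages(table, IndexName="statusIndex")
print(len(all_items))
```

With a helper like this, each method would only need to build its own `IndexName`, `KeyConditionExpression` and, where needed, `ProjectionExpression` arguments.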