Skip to content

Commit

Permalink
Merge pull request #1097 from xcube-dev/forman-1093-preload_api
Browse files Browse the repository at this point in the history
New data store preload API
  • Loading branch information
forman authored Dec 27, 2024
2 parents aa4f430 + 647e107 commit 3473d04
Show file tree
Hide file tree
Showing 11 changed files with 1,007 additions and 9 deletions.
8 changes: 8 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,14 @@

### Enhancements

* Added a new _preload API_ to xcube data stores:
- Enhanced the `xcube.core.store.DataStore` class to optionally support
preloading of datasets via an API represented by the
new `xcube.core.store.DataPreloader` interface.
- Added handy default implementations `NullPreloadHandle` and `ExecutorPreloadHandle`
to be returned by implementations of the `prepare_data()` method of a
given data store.

* A `xy_res` keyword argument was added to the `transform()` method of
`xcube.core.gridmapping.GridMapping`, enabling users to set the grid-mapping
resolution directly, which speeds up the method by avoiding time-consuming
Expand Down
1 change: 1 addition & 0 deletions environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ dependencies:
- s3fs >=2021.6
- setuptools >=41.0
- shapely >=1.6
- tabulate >=0.9
- tornado >=6.0
- urllib3 >=1.26
- xarray >=2022.6
Expand Down
368 changes: 368 additions & 0 deletions examples/notebooks/datastores/preload.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,368 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7f55e16a-46f3-4865-879a-a7cae151daa6",
"metadata": {},
"source": [
"### `ExecutorPreloadHandle` Demo\n",
"\n",
"This notebook is dedicated to developers wanting to enhance their\n",
"data store implementation by the new _data store preload API_. \n",
"This API has been added to `xcube.core.store.DataStore` in xcube 1.8.\n",
"Demonstrated here is the usage of the utility class ` xcube.core.store.preload.ExecutorPreloadHandle` \n",
"for cases where the preload process can be concurrently performed for each indiviual data resource. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7e27d98a-6519-4c4f-9e59-ae6369e102b9",
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"import time\n",
"\n",
"from xcube.core.store.preload import ExecutorPreloadHandle\n",
"from xcube.core.store.preload import PreloadHandle\n",
"from xcube.core.store.preload import PreloadState\n",
"from xcube.core.store.preload import PreloadStatus"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "92b57a80-14aa-4820-8acc-ab78fa45be0a",
"metadata": {},
"outputs": [],
"source": [
"data_ids = (\n",
" \"tt-data/tinky-winky.nc\", \n",
" \"tt-data/dipsy.zarr\", \n",
" \"tt-data/laa-laa.tiff\", \n",
" \"tt-data/po.zarr.zip\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c3175fc6-084d-4da5-b7af-fe3f03758a7d",
"metadata": {},
"outputs": [],
"source": [
"def preload_data(handle: PreloadHandle, data_id: str):\n",
" duration = random.randint(5, 15) # seconds\n",
" num_ticks = 100 \n",
" for i in range(num_ticks):\n",
" time.sleep(duration / num_ticks)\n",
" if handle.cancelled:\n",
" # TODO: Note clear, why future.cancel() doesn't do the job\n",
" handle.notify(PreloadState(data_id, status=PreloadStatus.cancelled))\n",
" return\n",
" handle.notify(PreloadState(data_id, progress=i / num_ticks))\n",
" if i % 10 == 0:\n",
" handle.notify(PreloadState(data_id, message=f\"Step #{i // 10 + 1}\"))\n",
" handle.notify(PreloadState(data_id, progress=1.0, message=\"Done.\"))"
]
},
{
"cell_type": "markdown",
"id": "48792654-20e5-46db-aa47-37c03f46b0a7",
"metadata": {},
"source": [
"---\n",
"Synchronous / blocking call"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c8bb6812-af18-4e5f-8df0-d77889552e99",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "f88a78a4ffff41cd917f73405a1183e6",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<table>\\n<thead>\\n<tr><th>Data ID </th><th>Status </th><th>Progress …"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"handle = ExecutorPreloadHandle(data_ids, preload_data=preload_data)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8911ae1b-765c-4030-aeca-fc997e7a880d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<thead>\n",
"<tr><th>Data ID </th><th>Status </th><th>Progress </th><th>Message </th><th>Exception </th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>tt-data/tinky-winky.nc</td><td>STOPPED </td><td>100% </td><td>Done. </td><td>- </td></tr>\n",
"<tr><td>tt-data/dipsy.zarr </td><td>STOPPED </td><td>100% </td><td>Done. </td><td>- </td></tr>\n",
"<tr><td>tt-data/laa-laa.tiff </td><td>STOPPED </td><td>100% </td><td>Done. </td><td>- </td></tr>\n",
"<tr><td>tt-data/po.zarr.zip </td><td>STOPPED </td><td>100% </td><td>Done. </td><td>- </td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
"Data ID Status Progress Message Exception\n",
"---------------------- -------- ---------- --------- -----------\n",
"tt-data/tinky-winky.nc STOPPED 100% Done. -\n",
"tt-data/dipsy.zarr STOPPED 100% Done. -\n",
"tt-data/laa-laa.tiff STOPPED 100% Done. -\n",
"tt-data/po.zarr.zip STOPPED 100% Done. -"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"handle"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5fe9a8e6-1543-469f-bb29-f1c7b3462953",
"metadata": {},
"outputs": [],
"source": [
"handle.close()"
]
},
{
"cell_type": "markdown",
"id": "fe45c10a-e20f-439f-85bb-4aa95a68f3d4",
"metadata": {},
"source": [
"---\n",
"Asynchronous / non-blocking call"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "69cb1b3d-3fcc-4c66-b077-d474a0083e6e",
"metadata": {},
"outputs": [],
"source": [
"async_handle = ExecutorPreloadHandle(data_ids, blocking=False, preload_data=preload_data)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d5a50ec1-addf-46a4-90e1-abfc5b2e95de",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "eec94f7b3cdd4d5e96411f980aa55a92",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<table>\\n<thead>\\n<tr><th>Data ID </th><th>Status </th><th>Progress …"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"async_handle.show()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7a2f1f9a-9ea0-43b5-82ed-b90a851f938a",
"metadata": {},
"outputs": [],
"source": [
"time.sleep(2)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "436e7042-abf2-4450-a334-eed75f045c37",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"PreloadState(data_id='tt-data/dipsy.zarr', status=PreloadStatus.started, progress=0.17, message='Step #2')"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"async_handle.get_state(\"tt-data/dipsy.zarr\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "07d81416-38e8-4b23-9220-d1f57160d9b6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"data_id=tt-data/dipsy.zarr, status=STARTED, progress=0.17, message=Step #2\n"
]
}
],
"source": [
"print(async_handle.get_state(\"tt-data/dipsy.zarr\"))"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f17aeecb-6650-43d5-9c41-cedb46baaa3f",
"metadata": {},
"outputs": [],
"source": [
"time.sleep(2)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "a911b93a-fb76-4d2f-88e8-0fd4e635246c",
"metadata": {},
"outputs": [],
"source": [
"async_handle.cancel()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "df2f5a5c-82a8-43ae-ae84-5c9d882e3757",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data ID Status Progress Message Exception\n",
"---------------------- -------- ---------- --------- -----------\n",
"tt-data/tinky-winky.nc STARTED 38% Step #4 -\n",
"tt-data/dipsy.zarr STARTED 35% Step #4 -\n",
"tt-data/laa-laa.tiff STARTED 55% Step #6 -\n",
"tt-data/po.zarr.zip STARTED 29% Step #3 -\n"
]
}
],
"source": [
"print(async_handle)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "769857c9-bad8-4824-92fb-a195e4d5ccb3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<thead>\n",
"<tr><th>Data ID </th><th>Status </th><th>Progress </th><th>Message </th><th>Exception </th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><td>tt-data/tinky-winky.nc</td><td>CANCELLED</td><td>38% </td><td>Step #4 </td><td>- </td></tr>\n",
"<tr><td>tt-data/dipsy.zarr </td><td>STARTED </td><td>35% </td><td>Step #4 </td><td>- </td></tr>\n",
"<tr><td>tt-data/laa-laa.tiff </td><td>STARTED </td><td>55% </td><td>Step #6 </td><td>- </td></tr>\n",
"<tr><td>tt-data/po.zarr.zip </td><td>STARTED </td><td>29% </td><td>Step #3 </td><td>- </td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
"Data ID Status Progress Message Exception\n",
"---------------------- --------- ---------- --------- -----------\n",
"tt-data/tinky-winky.nc CANCELLED 38% Step #4 -\n",
"tt-data/dipsy.zarr STARTED 35% Step #4 -\n",
"tt-data/laa-laa.tiff STARTED 55% Step #6 -\n",
"tt-data/po.zarr.zip STARTED 29% Step #3 -"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"async_handle"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "4d0d42ee-5caa-4eef-913a-1f0417670001",
"metadata": {},
"outputs": [],
"source": [
"async_handle.close()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5c3d89bf-f58f-495b-a5d1-c9fe0b9610a7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
1 change: 1 addition & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ dependencies = [
"s3fs>=2021.6",
"setuptools>=41.0",
"shapely>=1.6",
"tabulate>=0.9",
"tornado>=6.0",
"urllib3>=1.26",
"xarray>=2022.6,<=2024.6",
Expand Down
Loading

0 comments on commit 3473d04

Please sign in to comment.