Rename reading/writing doc
shoyer committed Aug 5, 2021
1 parent 9d8d908 commit 72191d5
Showing 2 changed files with 21 additions and 19 deletions.
4 changes: 3 additions & 1 deletion docs/index.md
@@ -6,10 +6,12 @@ The documentation includes narrative documentation that will walk you through the

We recommend reading both, as well as a few [end-to-end examples](https://github.com/google/xarray-beam/tree/main/examples) to understand what code using Xarray-Beam typically looks like.

+## Contents
+
```{toctree}
:maxdepth: 1
data-model.ipynb
-io.ipynb
+read-write.ipynb
aggregation.ipynb
rechunking.ipynb
api.md
36 changes: 18 additions & 18 deletions docs/io.ipynb → docs/read-write.ipynb
@@ -5,15 +5,15 @@
"id": "8e4f05ea",
"metadata": {},
"source": [
"# Loading and saving data"
"# Reading and writing data"
]
},
{
"cell_type": "markdown",
"id": "480ac360",
"metadata": {},
"source": [
"## Loading datasets into chunks"
"## Read datasets into chunks"
]
},
{
@@ -27,7 +27,7 @@
{
"cell_type": "code",
"execution_count": 42,
"id": "3fec02e8",
"id": "5923b201",
"metadata": {},
"outputs": [],
"source": [
@@ -37,7 +37,7 @@
{
"cell_type": "code",
"execution_count": 39,
"id": "7b431556",
"id": "bc5bfdc0",
"metadata": {
"tags": [
"hide-input"
@@ -100,7 +100,7 @@
},
{
"cell_type": "markdown",
"id": "25c5f2a6",
"id": "2f0e5efb",
"metadata": {},
"source": [
"Importantly, xarray datasets fed into `DatasetToChunks` **can be lazy**, with data not already loaded eagerly into NumPy arrays. When you feed lazy datasets into `DatasetToChunks`, each individual chunk will be indexed and evaluated separately on Beam workers.\n",
@@ -113,7 +113,7 @@
{
"cell_type": "code",
"execution_count": 47,
"id": "a2ce5049",
"id": "8a0d0091",
"metadata": {},
"outputs": [
{
@@ -149,15 +149,15 @@
},
{
"cell_type": "markdown",
"id": "ea3ec245",
"id": "de622acb",
"metadata": {},
"source": [
"`chunks=None` tells Xarray to use its builtin lazy indexing machinery, instead of using Dask. This is advantageous because datasets using Xarray's lazy indexing are serialized much more compactly (via [pickle](https://docs.python.org/3/library/pickle.html)) when passed into Beam transforms."
]
},
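
To see the serialization difference for yourself, a hypothetical comparison (not from the notebook; the store path is assumed) could pickle both variants of the same dataset:

```python
import pickle
import xarray

store = 'gs://my-bucket/source.zarr'  # hypothetical path

lazy_ds = xarray.open_zarr(store, chunks=None)  # Xarray lazy indexing
dask_ds = xarray.open_zarr(store)               # dask-backed (the default)

# The lazy-indexing dataset typically pickles far more compactly than the
# dask-backed one, which must serialize its entire task graph.
print(len(pickle.dumps(lazy_ds)))
print(len(pickle.dumps(dask_ds)))
```
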
{
"cell_type": "markdown",
"id": "1c8bc4bc",
"id": "e4dc8c82",
"metadata": {},
"source": [
"Alternatively, you can pass in lazy datasets [using dask](http://xarray.pydata.org/en/stable/user-guide/dask.html). In this case, you don't need to explicitly supply `chunks` to `DatasetToChunks`:"
@@ -166,7 +166,7 @@
{
"cell_type": "code",
"execution_count": 49,
"id": "3c86c82e",
"id": "b61440aa",
"metadata": {},
"outputs": [
{
@@ -198,7 +198,7 @@
},
{
"cell_type": "markdown",
"id": "db585d3a",
"id": "d73c6398",
"metadata": {},
"source": [
"Dask's lazy evaluation system is much more general than Xarray's lazy indexing, so as long as resulting dataset can be independently evaluated in each chunk this can be a very convenient way to setup computation for Xarray-Beam.\n",
@@ -208,7 +208,7 @@
},
{
"cell_type": "markdown",
"id": "cdf80b53",
"id": "4c4dfd42",
"metadata": {},
"source": [
"```{note}\n",
@@ -221,25 +221,25 @@
"id": "233809a4",
"metadata": {},
"source": [
"## Saving data to Zarr"
"## Writing data to Zarr"
]
},
{
"cell_type": "markdown",
"id": "67b10192",
"id": "2f415ceb",
"metadata": {},
"source": [
"[Zarr](https://zarr.readthedocs.io/) is the preferred file format for reading and writing data with Xarray-Beam, due to its excellent scalability and support inside Xarray.\n",
"\n",
"{py:class}`~xarray_beam.ChunksToZarr` is Xarray-Beam's API for saving chunks into a (new) Zarr store. \n",
"{py:class}`~xarray_beam.ChunksToZarr` is Xarray-Beam's API for saving chunks into a Zarr store. \n",
"\n",
"You can get started just using it directly:"
]
},
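
A sketch of that direct usage (source and destination paths are hypothetical), chaining the read transform from earlier straight into `ChunksToZarr`:

```python
import apache_beam as beam
import xarray
import xarray_beam as xbeam

ds = xarray.open_zarr('gs://my-bucket/source.zarr', chunks=None)

with beam.Pipeline() as p:
    (
        p
        | xbeam.DatasetToChunks(ds, chunks={'time': 100})
        | xbeam.ChunksToZarr('gs://my-bucket/destination.zarr')
    )
```
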
{
"cell_type": "code",
"execution_count": 50,
"id": "88fc081c",
"id": "9c6efa33",
"metadata": {},
"outputs": [
{
@@ -257,7 +257,7 @@
},
{
"cell_type": "markdown",
"id": "70da81a8",
"id": "04c0f50b",
"metadata": {},
"source": [
"By default, `ChunksToZarr` needs to evaluate and combine the entire distributed dataset in order to determine overall Zarr metadata (e.g., array names, shapes, dtypes and attributes). This is fine for relatively small datasets, but can entail significant additional communication and storage costs for large datasets.\n",
@@ -270,7 +270,7 @@
{
"cell_type": "code",
"execution_count": 55,
"id": "b8ea3f4a",
"id": "993191db",
"metadata": {},
"outputs": [],
"source": [
@@ -280,7 +280,7 @@
},
{
"cell_type": "markdown",
"id": "31748c31",
"id": "e70cd961",
"metadata": {},
"source": [
"Xarray operations like indexing and expand dimensions (see {py:meth}`xarray.Dataset.expand_dims`) are entirely lazy on this dataset, which makes it relatively straightforward to build up a Dataset with the required variables and dimensions, e.g., as used in the [ERA5 climatology example](https://github.com/google/xarray-beam/blob/main/examples/era5_climatology.py)."
