From dba15fe843baf043a52c01550259caf0c3d988aa Mon Sep 17 00:00:00 2001
From: Andreas Motl <andreas.motl@crate.io>
Date: Tue, 5 Mar 2024 23:42:52 +0100
Subject: [PATCH] Time Series QA: Make notebooks self-contained, also adding
 DDL and DML

Otherwise, people or QA jobs that invoke notebooks individually, or in
a different order, have a hard time.
---
 .../exploratory_data_analysis.ipynb           | 27 ++++++++++++++++++-
 topic/timeseries/requirements-dev.txt         |  4 +--
 topic/timeseries/requirements.txt             |  1 +
 .../time-series-decomposition.ipynb           | 27 ++++++++++++++++++-
 ...timeseries-queries-and-visualization.ipynb |  6 ++---
 5 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/topic/timeseries/exploratory_data_analysis.ipynb b/topic/timeseries/exploratory_data_analysis.ipynb
index 247484c8..2b9615ad 100644
--- a/topic/timeseries/exploratory_data_analysis.ipynb
+++ b/topic/timeseries/exploratory_data_analysis.ipynb
@@ -102,12 +102,37 @@
     "engine = sa.create_engine(CONNECTION_STRING, echo=os.environ.get('DEBUG'))"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "First, import data into CrateDB. This is a shorthand for the code illustrated in\n",
+    "`timeseries-queries-and-visualization.ipynb`: it runs the corresponding SQL DDL\n",
+    "and DML statements to load the data."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "from cratedb_toolkit.datasets import load_dataset\n",
+    "\n",
+    "dataset = load_dataset(\"tutorial/weather-basic\")\n",
+    "dataset.dbtable(dburi=CONNECTION_STRING, table=\"weather_data\").load()"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
   {
    "cell_type": "markdown",
    "id": "cdae15fa",
    "metadata": {},
    "source": [
-    "The next step fetches data from CrateDB and load it into a pandas data frame:"
+    "Then, load data from CrateDB into a pandas data frame:"
    ]
   },
   {
diff --git a/topic/timeseries/requirements-dev.txt b/topic/timeseries/requirements-dev.txt
index 4f771791..cfd81eee 100644
--- a/topic/timeseries/requirements-dev.txt
+++ b/topic/timeseries/requirements-dev.txt
@@ -1,5 +1,5 @@
 # Real.
-# pueblo[notebook,testing]>=0.0.7
+pueblo[notebook,testing]>=0.0.9
 
 # Development.
-pueblo[notebook,testing] @ git+https://github.com/pyveci/pueblo.git@amo/testbook
+# pueblo[notebook,testing] @ git+https://github.com/pyveci/pueblo.git@amo/testbook
diff --git a/topic/timeseries/requirements.txt b/topic/timeseries/requirements.txt
index bbc66e95..a75b6aa2 100644
--- a/topic/timeseries/requirements.txt
+++ b/topic/timeseries/requirements.txt
@@ -1,4 +1,5 @@
 crate[sqlalchemy]==0.34.0
+cratedb-toolkit==0.0.6
 refinitiv-data<1.7
 pandas<2
 pycaret>=3.0,<3.4
diff --git a/topic/timeseries/time-series-decomposition.ipynb b/topic/timeseries/time-series-decomposition.ipynb
index c6e88764..71a051e6 100644
--- a/topic/timeseries/time-series-decomposition.ipynb
+++ b/topic/timeseries/time-series-decomposition.ipynb
@@ -106,12 +106,37 @@
     "engine = sa.create_engine(CONNECTION_STRING, echo=os.environ.get('DEBUG'))"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "First, import data into CrateDB. This is a shorthand for the code illustrated in\n",
+    "`timeseries-queries-and-visualization.ipynb`: it runs the corresponding SQL DDL\n",
+    "and DML statements to load the data."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "from cratedb_toolkit.datasets import load_dataset\n",
+    "\n",
+    "dataset = load_dataset(\"tutorial/weather-basic\")\n",
+    "dataset.dbtable(dburi=CONNECTION_STRING, table=\"weather_data\").load()"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
   {
    "cell_type": "markdown",
    "id": "cdae15fa",
    "metadata": {},
    "source": [
-    "The next step fetches data from CrateDB and load it into a pandas data frame:"
+    "Then, load data from CrateDB into a pandas data frame:"
    ]
   },
   {
diff --git a/topic/timeseries/timeseries-queries-and-visualization.ipynb b/topic/timeseries/timeseries-queries-and-visualization.ipynb
index 41b6dc19..ca9f1ed0 100644
--- a/topic/timeseries/timeseries-queries-and-visualization.ipynb
+++ b/topic/timeseries/timeseries-queries-and-visualization.ipynb
@@ -200,9 +200,9 @@
    "id": "226e67f8",
    "metadata": {},
    "source": [
-    "After inserting data, it is recommended to `ANALYZE` the tables to make the query optimizer obtain\n",
-    "important statistics information about them. Let's also invoke a `REFRESH` statement beforehand,\n",
-    "to make sure that the data is up-to-date."
+    "After inserting data, invoke a `REFRESH` statement to make sure the data is\n",
+    "up-to-date. It is also recommended to `ANALYZE` the tables, so that the query\n",
+    "optimizer obtains important statistical information about them."
    ]
   },
   {