API for Point and Layer data (#81)
* Start creating classes to act as stable api

* get the shapefile filtering working for points. Start on raster code

* Add some tests and comments for the api

* Test the query methods

* Add tests for from area both for polygon and from point and buffer

* start layer tests, not passing yet

* Finish Layer data tests. Add more doc strings. Modify README for testing

* site names attribute

* add two notebooks of examples

* rename notebook

* use plotly for interactive plots

* Working on documentation for the API

* Further documentation of the api methodology

* typo

* fix the box plot in the api example

* errors in documentation

* add date filtering

* Add line about date filtering

* Update readthedocs configuration for updated template schema. Add plotly to docs reqs. Add template for plotly js cdn

* Make sure raster queries are limited

* better list compare

* try using python3 pip

* is setuptools the issue?

* we don't need to install docs reqs for github builds

* more req work

* update dev reqs

* Fix database to use local for testing. I think we have a reqs issue. Now conversions to geopandas are failing locally when the API tests passed in general before

* briefly use the DB for testing

* use sqlalchemy >= 2

* working on test fix

* Install doc reqs

* Can't install sphinx 7.3 on python 3.8

* sphinx in req_dev is getting us

* try getting rid of matplotlib req

* try adding a make clean

* what if we just let pip do the work?

* setup_requires is deprecated. Could move to pyproj.toml, but do we need it?
micah-prime authored Jun 17, 2024
1 parent 01b583f commit 4f8fba4
Showing 19 changed files with 5,442 additions and 45 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/build.yml
@@ -17,7 +17,7 @@ jobs:
     strategy:
       matrix:
         os: [ubuntu-latest, macOS-latest]
-        python-version: [3.7, 3.8, 3.9]
+        python-version: [3.8, 3.9, '3.10']
     steps:
       - uses: actions/checkout@v2
       - name: Set up Python ${{ matrix.python-version }}
@@ -27,9 +27,9 @@ jobs:

       - name: Install Macos/Linux dependencies
         run: |
-          pip install --upgrade pip setuptools wheel
-          python -m pip install -r requirements.txt
-          python setup.py install
+          python3 -m pip install --upgrade pip setuptools wheel
+          make clean
+          python3 -m pip install .
       - name: Install Validation
         run: |
           python -c "import snowexsql"
8 changes: 4 additions & 4 deletions .github/workflows/main.yml
@@ -15,7 +15,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.7, 3.8, 3.9]
+        python-version: [3.8, 3.9, '3.10']

     services:

@@ -44,9 +44,9 @@
         run: |
           sudo apt-get update
           sudo apt-get install -y postgis gdal-bin
-          python -m pip install --upgrade pip
-          pip install pytest coverage
-          if [ -f requirements_dev.txt ]; then pip install -r requirements_dev.txt; fi
+          python3 -m pip install --upgrade pip
+          python3 -m pip install pytest coverage
+          if [ -f requirements_dev.txt ]; then python3 -m pip install -r requirements_dev.txt; fi
       - name: Test with pytest
         run: |
           pytest -s
26 changes: 15 additions & 11 deletions .readthedocs.yaml
@@ -1,13 +1,17 @@
-version: 2
+version: 2

-sphinx:
-  configuration: docs/conf.py
-  fail_on_warning: false
+sphinx:
+  configuration: docs/conf.py
+  fail_on_warning: false

-python:
-  version: 3.8
-  install:
-    - requirements: docs/requirements.txt
-    - requirements: requirements.txt
-    - method: setuptools
-      path: .
+build:
+  os: ubuntu-22.04
+  tools:
+    python: "3.10"
+
+python:
+  install:
+    - requirements: docs/requirements.txt
+    - requirements: requirements.txt
+    - method: setuptools
+      path: .
35 changes: 34 additions & 1 deletion README.rst
@@ -74,14 +74,47 @@ If you are using `conda` you may need to reinstall the following using conda:
* Jupyter notebook
* nbconvert


I want data fast!
-----------------
A programmatic API provides fast, standardized access to Point and
Layer data. There are two examples_ covering the features and usage of
the api. See the specific api_ documentation for a detailed
description.

.. _api: https://snowexsql.readthedocs.io/en/latest/api.html

.. code-block:: python

    from datetime import date

    from snowexsql.api import PointMeasurements, LayerMeasurements

    # The main functions we will use are `from_area` and `from_filter`, like this
    df = PointMeasurements.from_filter(
        date=date(2020, 5, 28), instrument='camera'
    )
    print(df.head())
Tests
-----

Before testing, we need to run a local instance of the database in a
separate terminal. This can be done with:

.. code-block:: bash

    docker-compose up -d

When you are finished testing, make sure to shut the containers down:

.. code-block:: bash

    docker-compose down

Quickly test your installation by running:

.. code-block:: bash

    pytest
    python3 -m pytest tests/

The goal of this project is to have high fidelity in data
interpretation/submission to the database. To see the current
6 changes: 6 additions & 0 deletions docs/_templates/layout.html
@@ -0,0 +1,6 @@
{% extends "!layout.html" %}

{% block footer %}
<script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
{{ super() }}
{% endblock %}
109 changes: 105 additions & 4 deletions docs/api.rst
@@ -1,10 +1,111 @@
API Documentation
=================
.. role:: python(code)
   :language: python

Background
----------
The API (not a REST API, more of an SDK) is a set of Python classes
designed for easy and standardized access to the database data.

The classes can both describe what data is available and return
data as a GeoPandas dataframe.

Components
----------
There are two main API classes for data access:

.. code-block:: python

    from snowexsql.api import PointMeasurements, LayerMeasurements

:code:`PointMeasurements` gives access to the PointData (depths, GPR, etc.), and
:code:`LayerMeasurements` gives access to the LayerData (pits, etc.).

Both of the classes have the same methods, although they access different
tables in the database.

The primary methods for accessing data are :code:`.from_area` and
:code:`.from_filter`. Both of these methods return a GeoPandas dataframe.

.from_filter
------------

:code:`.from_filter` is the simpler of the two search methods. It takes
a variety of keyword arguments (kwargs) and returns a dataset that meets
all of the criteria.

.. code-block:: python

    df = LayerMeasurements.from_filter(
        type="density",
        site_name="Boise River Basin",
        limit=1000
    )

In this example, we filter to all the layer measurements of `density`
that were taken in the `Boise River Basin`, and we `limit` the result to
the first 1000 measurements.

Each kwarg (except date) **can take in a list or a single value** so you could change
this to :code:`site_name=["Boise River Basin", "Grand Mesa"]`
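
Putting that together, the density query could be broadened to two sites
like this (a sketch reusing the hypothetical filter values from the
example above, and assuming the same database connection):

.. code-block:: python

    df = LayerMeasurements.from_filter(
        type="density",
        site_name=["Boise River Basin", "Grand Mesa"],
        limit=1000
    )

Each list is combined with OR logic within that kwarg, so this returns
density measurements from either site.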

To find what `kwargs` are allowed, we can check the class:

.. code-block:: python

    LayerMeasurements.ALLOWED_QRY_KWARGS

For :code:`LayerMeasurements` this will return
:code:`["site_name", "site_id", "date", "instrument", "observers", "type", "utm_zone", "pit_id", "date_greater_equal", "date_less_equal"]`,
so we can filter by any of these as inputs to the function.

**Notice** that :code:`limit` is not specified here. Limit is in the
:code:`SPECIAL_KWARGS` and gets handled at the end of the query.

**Notice** also :code:`date_greater_equal` and :code:`date_less_equal`,
which filter the :code:`date` parameter with `>=` and `<=` logic.
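
A date-range query might look like this (a sketch; the measurement type
and dates are hypothetical, and it assumes the same database connection
as the examples above):

.. code-block:: python

    from datetime import date

    # Hypothetical filter: depths measured in the first quarter of 2020
    df = PointMeasurements.from_filter(
        type="depth",
        date_greater_equal=date(2020, 1, 1),
        date_less_equal=date(2020, 3, 31),
        limit=1000
    )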

To find what values are allowed for each kwarg, we can check the
properties of the class. Both :code:`LayerMeasurements` and
:code:`PointMeasurements` have the following properties.

* all_site_names
* all_types
* all_dates
* all_observers
* all_instruments

So you can find all the instruments available for filtering with
:code:`LayerMeasurements().all_instruments`.
**Note** - these properties must be accessed through an instance
(:code:`LayerMeasurements()`, with parentheses), not the bare class.
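
A typical pattern is to inspect a property first and then filter on one
of its values (a sketch; it assumes a reachable database, and the
printed values will depend on its contents):

.. code-block:: python

    from snowexsql.api import LayerMeasurements

    # Instantiate first -- the properties query the database
    measurements = LayerMeasurements()
    print(measurements.all_instruments)
    print(measurements.all_types)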

.from_area
----------

The signature for :code:`.from_area` looks like this:

.. code-block:: python

    def from_area(cls, shp=None, pt=None, buffer=None, crs=26912, **kwargs):
        ...

It is a class method, so it *does not need an instantiated class*.
The :code:`**kwargs` argument takes the same inputs as the :code:`from_filter`
function.

The big difference is that :code:`from_area` filters results to those
either within :code:`shp` (a `shapely` polygon) **or** within a
:code:`buffer` radius around :code:`pt` (a `shapely` point).
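
Both modes might look like this (a sketch; the coordinates are
hypothetical and assumed to be in the default CRS, EPSG:26912, with the
buffer radius in meters):

.. code-block:: python

    from shapely.geometry import Point, Polygon

    # Mode 1: a point plus a buffer radius
    df = PointMeasurements.from_area(
        pt=Point(740000, 4320000), buffer=500, type="depth"
    )

    # Mode 2: an explicit polygon (here, a hypothetical 1 km square)
    poly = Polygon([
        (739500, 4319500), (740500, 4319500),
        (740500, 4320500), (739500, 4320500)
    ])
    df = PointMeasurements.from_area(shp=poly, type="depth")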


Large Query Exception and Limit
-------------------------------

By default, if more than 1000 records would be returned and **no limit**
is provided, the query will fail. This is intentional so that we are aware
of large queries. If you know your query will be large and need
more than 1000 records returned, add a :code:`limit` kwarg to your query
with a value greater than the number you need.
**This will override the default behavior** and return as many records as
you requested.
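
For example, if a query is expected to return several thousand records,
an explicit limit keeps it from raising (a sketch with hypothetical
filter values, assuming the same connection as above):

.. code-block:: python

    # Without `limit`, this would fail if more than 1000 records match
    df = PointMeasurements.from_filter(type="depth", limit=10000)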
