diff --git a/README.md b/README.md
index 90257b0..54497a8 100644
--- a/README.md
+++ b/README.md
@@ -19,367 +19,43 @@ odata client for Copernicus Data Space catalog

 ## Very first steps

-### Initialize your code
-
-1. Initialize `git` inside your repo:
-
+### step 1: create a config file
+
+* create a copy of `config.yml` called `localconfig.yml` in the directory where `cdsodatacli` is installed in your Python environment.
 ```bash
-cd cdsodatacli && git init
-```
-
-2. If you don't have `Poetry` installed run:
-
+cp config.yml localconfig.yml
+```
+* edit `localconfig.yml` to set your own paths for the output directories and your CDSE accounts
 ```bash
-make poetry-download
-```
+vi localconfig.yml
+```

-3. Initialize poetry and install `pre-commit` hooks:
+### step 2: do a query on the CDSE OData API

 ```bash
-make install
-make pre-commit-install
+queryCDS -h
 ```

-4. Run the codestyle:
+or use the method `cdsodatacli.query.fetch_data()`

-```bash
-make codestyle
-```
-
-5. Upload initial code to GitHub:
+### step 3: download a listing of products

 ```bash
-git add .
-git commit -m ":tada: Initial commit"
-git branch -M main
-git remote add origin https://github.com/umr-lops/cdsodatacli.git
-git push -u origin main
+downloadFromCDS -h
 ```

-### Set up bots
-
-- Set up [Dependabot](https://docs.github.com/en/github/administering-a-repository/enabling-and-disabling-version-updates#enabling-github-dependabot-version-updates) to ensure you have the latest dependencies.
-- Set up [Stale bot](https://github.com/apps/stale) for automatic issue closing.
-
-### Poetry
-
-Want to know more about Poetry? Check [its documentation](https://python-poetry.org/docs/).
-
-<details>
-<summary>Details about Poetry</summary>

- -Poetry's [commands](https://python-poetry.org/docs/cli/#commands) are very intuitive and easy to learn, like: - -- `poetry add numpy@latest` -- `poetry run pytest` -- `poetry publish --build` - -etc -

-
- -### Building and releasing your package - -Building a new version of the application contains steps: - -- Bump the version of your package `poetry version `. You can pass the new version explicitly, or a rule such as `major`, `minor`, or `patch`. For more details, refer to the [Semantic Versions](https://semver.org/) standard. -- Make a commit to `GitHub`. -- Create a `GitHub release`. -- And... publish πŸ™‚ `poetry publish --build` - -## 🎯 What's next - -Well, that's up to you πŸ’ͺ🏻. I can only recommend the packages and articles that helped me. - -- [`Typer`](https://github.com/tiangolo/typer) is great for creating CLI applications. -- [`Rich`](https://github.com/willmcgugan/rich) makes it easy to add beautiful formatting in the terminal. -- [`Pydantic`](https://github.com/samuelcolvin/pydantic/) – data validation and settings management using Python type hinting. -- [`Loguru`](https://github.com/Delgan/loguru) makes logging (stupidly) simple. -- [`tqdm`](https://github.com/tqdm/tqdm) – fast, extensible progress bar for Python and CLI. -- [`IceCream`](https://github.com/gruns/icecream) is a little library for sweet and creamy debugging. -- [`orjson`](https://github.com/ijl/orjson) – ultra fast JSON parsing library. -- [`Returns`](https://github.com/dry-python/returns) makes you function's output meaningful, typed, and safe! -- [`Hydra`](https://github.com/facebookresearch/hydra) is a framework for elegantly configuring complex applications. -- [`FastAPI`](https://github.com/tiangolo/fastapi) is a type-driven asynchronous web framework. - -Articles: - -- [Open Source Guides](https://opensource.guide/). -- [A handy guide to financial support for open source](https://github.com/nayafia/lemonade-stand) -- [GitHub Actions Documentation](https://help.github.com/en/actions). -- Maybe you would like to add [gitmoji](https://gitmoji.carloscuesta.me/) to commit names. This is really funny. 
πŸ˜„ - -## πŸš€ Features - -### Development features - -- Supports for `Python 3.9` and higher. -- [`Poetry`](https://python-poetry.org/) as the dependencies manager. See configuration in [`pyproject.toml`](https://github.com/umr-lops/cdsodatacli/blob/main/pyproject.toml) and [`setup.cfg`](https://github.com/umr-lops/cdsodatacli/blob/main/setup.cfg). -- Automatic codestyle with [`black`](https://github.com/psf/black), [`isort`](https://github.com/timothycrosley/isort) and [`pyupgrade`](https://github.com/asottile/pyupgrade). -- Ready-to-use [`pre-commit`](https://pre-commit.com/) hooks with code-formatting. -- Type checks with [`mypy`](https://mypy.readthedocs.io); docstring checks with [`darglint`](https://github.com/terrencepreilly/darglint); security checks with [`safety`](https://github.com/pyupio/safety) and [`bandit`](https://github.com/PyCQA/bandit) -- Testing with [`pytest`](https://docs.pytest.org/en/latest/). -- Ready-to-use [`.editorconfig`](https://github.com/umr-lops/cdsodatacli/blob/main/.editorconfig), [`.dockerignore`](https://github.com/umr-lops/cdsodatacli/blob/main/.dockerignore), and [`.gitignore`](https://github.com/umr-lops/cdsodatacli/blob/main/.gitignore). You don't have to worry about those things. - -### Deployment features - -- `GitHub` integration: issue and pr templates. -- `Github Actions` with predefined [build workflow](https://github.com/umr-lops/cdsodatacli/blob/main/.github/workflows/build.yml) as the default CI/CD. -- Everything is already set up for security checks, codestyle checks, code formatting, testing, linting, docker builds, etc with [`Makefile`](https://github.com/umr-lops/cdsodatacli/blob/main/Makefile#L89). More details in [makefile-usage](#makefile-usage). -- [Dockerfile](https://github.com/umr-lops/cdsodatacli/blob/main/docker/Dockerfile) for your package. -- Always up-to-date dependencies with [`@dependabot`](https://dependabot.com/). 
You will only [enable it](https://docs.github.com/en/github/administering-a-repository/enabling-and-disabling-version-updates#enabling-github-dependabot-version-updates). -- Automatic drafts of new releases with [`Release Drafter`](https://github.com/marketplace/actions/release-drafter). You may see the list of labels in [`release-drafter.yml`](https://github.com/umr-lops/cdsodatacli/blob/main/.github/release-drafter.yml). Works perfectly with [Semantic Versions](https://semver.org/) specification. - -### Open source community features - -- Ready-to-use [Pull Requests templates](https://github.com/umr-lops/cdsodatacli/blob/main/.github/PULL_REQUEST_TEMPLATE.md) and several [Issue templates](https://github.com/umr-lops/cdsodatacli/tree/main/.github/ISSUE_TEMPLATE). -- Files such as: `LICENSE`, `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, and `SECURITY.md` are generated automatically. -- [`Stale bot`](https://github.com/apps/stale) that closes abandoned issues after a period of inactivity. (You will only [need to setup free plan](https://github.com/marketplace/stale)). Configuration is [here](https://github.com/umr-lops/cdsodatacli/blob/main/.github/.stale.yml). -- [Semantic Versions](https://semver.org/) specification with [`Release Drafter`](https://github.com/marketplace/actions/release-drafter). - ## Installation ```bash pip install -U cdsodatacli ``` -or install with `Poetry` - -```bash -poetry add cdsodatacli -``` - Then you can run ```bash cdsodatacli --help ``` -or with `Poetry`: - -```bash -poetry run cdsodatacli --help -``` - -### Makefile usage - -[`Makefile`](https://github.com/umr-lops/cdsodatacli/blob/main/Makefile) contains a lot of functions for faster development. - -
-1. Download and remove Poetry -

- -To download and install Poetry run: - -```bash -make poetry-download -``` - -To uninstall - -```bash -make poetry-remove -``` - -

-
- -
-2. Install all dependencies and pre-commit hooks -

- -Install requirements: - -```bash -make install -``` - -Pre-commit hooks coulb be installed after `git init` via - -```bash -make pre-commit-install -``` - -

-
- -
-3. Codestyle -

- -Automatic formatting uses `pyupgrade`, `isort` and `black`. - -```bash -make codestyle - -# or use synonym -make formatting -``` - -Codestyle checks only, without rewriting files: - -```bash -make check-codestyle -``` - -> Note: `check-codestyle` uses `isort`, `black` and `darglint` library - -Update all dev libraries to the latest version using one comand - -```bash -make update-dev-deps -``` - -

-4. Code security -

- -```bash -make check-safety -``` - -This command launches `Poetry` integrity checks as well as identifies security issues with `Safety` and `Bandit`. - -```bash -make check-safety -``` - -

-
- -

-
- -
-5. Type checks -

- -Run `mypy` static type checker - -```bash -make mypy -``` - -

-
- -
-6. Tests with coverage badges -

- -Run `pytest` - -```bash -make test -``` - -

-
- -
-7. All linters -

- -Of course there is a command to ~~rule~~ run all linters in one: - -```bash -make lint -``` - -the same as: - -```bash -make test && make check-codestyle && make mypy && make check-safety -``` - -

-
- -
-8. Docker -

- -```bash -make docker-build -``` - -which is equivalent to: - -```bash -make docker-build VERSION=latest -``` - -Remove docker image with - -```bash -make docker-remove -``` - -More information [about docker](https://github.com/umr-lops/cdsodatacli/tree/main/docker). - -

-
- -
-9. Cleanup -

-Delete pycache files - -```bash -make pycache-remove -``` - -Remove package build - -```bash -make build-remove -``` - -Delete .DS_STORE files - -```bash -make dsstore-remove -``` - -Remove .mypycache - -```bash -make mypycache-remove -``` - -Or to remove all above run: - -```bash -make cleanup -``` - -

-
- -## πŸ“ˆ Releases - -You can see the list of available releases on the [GitHub Releases](https://github.com/umr-lops/cdsodatacli/releases) page. - -We follow [Semantic Versions](https://semver.org/) specification. - -We use [`Release Drafter`](https://github.com/marketplace/actions/release-drafter). As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you’re ready. With the categories option, you can categorize pull requests in release notes using labels. - -### List of labels and corresponding titles - -| **Label** | **Title in Releases** | -| :-----------------------------------: | :---------------------: | -| `enhancement`, `feature` | πŸš€ Features | -| `bug`, `refactoring`, `bugfix`, `fix` | πŸ”§ Fixes & Refactoring | -| `build`, `ci`, `testing` | πŸ“¦ Build System & CI/CD | -| `breaking` | πŸ’₯ Breaking Changes | -| `documentation` | πŸ“ Documentation | -| `dependencies` | ⬆️ Dependencies updates | - -You can update it in [`release-drafter.yml`](https://github.com/umr-lops/cdsodatacli/blob/main/.github/release-drafter.yml). - -GitHub creates the `bug`, `enhancement`, and `documentation` labels for you. Dependabot creates the `dependencies` label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them. 
- ## πŸ›‘ License [![License](https://img.shields.io/github/license/umr-lops/cdsodatacli)](https://github.com/umr-lops/cdsodatacli/blob/main/LICENSE) diff --git a/cdsodatacli/__main__.py b/cdsodatacli/__main__.py deleted file mode 100644 index 8bed123..0000000 --- a/cdsodatacli/__main__.py +++ /dev/null @@ -1,68 +0,0 @@ -# type: ignore[attr-defined] -from typing import Optional - -from enum import Enum -from random import choice - -import typer -from rich.console import Console - -from cdsodatacli import version -from cdsodatacli.example import hello -from cdsodatacli.query import fetch_data - - -class Color(str, Enum): - white = "white" - red = "red" - cyan = "cyan" - magenta = "magenta" - yellow = "yellow" - green = "green" - - -app = typer.Typer( - name="cdsodatacli", - help="odata client for Copernicus Data Space catalog", - add_completion=False, -) -console = Console() - - -def version_callback(print_version: bool) -> None: - """Print the version of the package.""" - if print_version: - console.print(f"[yellow]cdsodatacli[/] version: [bold blue]{version}[/]") - raise typer.Exit() - - -@app.command(name="") -def main( - name: str = typer.Option(..., help="Person to greet."), - color: Optional[Color] = typer.Option( - None, - "-c", - "--color", - "--colour", - case_sensitive=False, - help="Color for print. 
If not specified then choice will be random.", - ), - print_version: bool = typer.Option( - None, - "-v", - "--version", - callback=version_callback, - is_eager=True, - help="Prints the version of the cdsodatacli package.", - ), -) -> None: - """Print a greeting with a giving name.""" - if color is None: - color = choice(list(Color)) - - greeting: str = hello(name) - console.print(f"[bold {color}]{greeting}[/]") - - -if __name__ == "__main__": - app() diff --git a/cdsodatacli/download.py b/cdsodatacli/download.py index 5500c3e..ea8f77a 100644 --- a/cdsodatacli/download.py +++ b/cdsodatacli/download.py @@ -132,19 +132,21 @@ def CDS_Odata_download_one_product_v2( if status == 200: # means OK download speed = total_length / elapsed_time shutil.move(output_filepath_tmp, output_filepath) + os.chmod(output_filepath,mode=0o0775) logging.debug("time to download this product: %1.1f sec", elapsed_time) logging.debug("average download speed: %1.1fMo/sec", speed) return speed, status_meaning, safename_base, semaphore_token_file -def filter_product_already_present(cpt, df, outputdir): +def filter_product_already_present(cpt, df, outputdir,force_download=False): """ Parameters ---------- - cpt - df - outputdir + cpt (collections.defaultdict(int)) + df (pd.DataFrame) + outputdir (str) + force_download (bool) Returns ------- @@ -154,12 +156,17 @@ def filter_product_already_present(cpt, df, outputdir): all_urls_to_download = [] index_to_download = [] for ii, safename_product in enumerate(df["safe"]): + to_download = False + if force_download: + to_download = True if test_safe_archive(safename=safename_product): cpt["archived_product"] += 1 elif test_safe_spool(safename=safename_product): cpt["in_spool_product"] += 1 else: + to_download = True cpt["product_absent_from_local_disks"] += 1 + if to_download: index_to_download.append(ii) id_product = df["id"].iloc[ii] url_product = conf["URL_download"] % id_product @@ -180,8 +187,14 @@ def filter_product_already_present(cpt, df, 
outputdir):
     return df_todownload, cpt
 
+
 def download_list_product_multithread_v2(
-    list_id, list_safename, outputdir, hideProgressBar=False, account_group="logins"
+    list_id,
+    list_safename,
+    outputdir,
+    hideProgressBar=False,
+    account_group="logins",
+    check_on_disk=True,
 ):
     """
     v2 is handling multi account round-robin and token semaphore files
@@ -190,13 +203,16 @@ def download_list_product_multithread_v2(
         list_id (list)
         list_safename (list)
         outputdir (str)
-        hideProgressBar (bool)
+        hideProgressBar (bool): True -> no tqdm progress bar on stdout
+        account_group (str)
+        check_on_disk (bool): True -> skip the download when the product is already in the spool or archive directory
 
     Returns
     -------
-
+        df2 (pd.DataFrame): products for which a download was attempted
     """
     assert len(list_id) == len(list_safename)
+    logging.info("check_on_disk : %s", check_on_disk)
     cpt = defaultdict(int)
 
     cpt["products_in_initial_listing"] = len(list_id)
@@ -207,7 +223,9 @@ def download_list_product_multithread_v2(
     df = pd.DataFrame(
         {"safe": list_safename, "status": np.zeros(len(list_safename)), "id": list_id}
     )
-    df2, cpt = filter_product_already_present(cpt, df, outputdir)
+
+    df2, cpt = filter_product_already_present(
+        cpt, df, outputdir, force_download=not check_on_disk
+    )
+    logging.info("%s", cpt)
     while_loop = 0
     blacklist = []
diff --git a/cdsodatacli/query.py b/cdsodatacli/query.py
index 290cf01..fb06080 100644
--- a/cdsodatacli/query.py
+++ b/cdsodatacli/query.py
@@ -25,7 +25,7 @@
 import traceback
 import warnings
 
-
+DEFAULT_TOP_ROWS_PER_QUERY = 1000
+
+
 def fetch_data(
     gdf,
     date=None,
@@ -305,7 +305,7 @@ def create_urls(gdf, top=None):
     urlapi = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products?$filter="
     urls = []
     if top is None:
-        top = 1000
+        top = DEFAULT_TOP_ROWS_PER_QUERY
     for row in range(len(gdf)):
         gdf_row = gdf.iloc[row]
         # enter_index = gdf.index[row]
@@ -444,6 +444,8 @@ def fetch_one_url(url, cpt, index, cache_dir):
         # collected_data_x.append(collected_data)
         cpt["product_proposed_by_CDS"] += len(collected_data["Name"])
         collected_data["id_original_query"] = index
+        if len(collected_data) == DEFAULT_TOP_ROWS_PER_QUERY:
+            logging.warning(
+                "%i products found in a single CDSE OData query (the maximum is %s):"
+                " make sure the timedelta_slice parameter is small enough to avoid truncated results",
+                len(collected_data),
+                DEFAULT_TOP_ROWS_PER_QUERY,
+            )
         if pd.isna(collected_data["Name"]).any():
             raise Exception("Name field contains NaN")
         cpt["answer_append"] += 1
diff --git a/tests_metiers/inventories_CDS_figures.py b/tests_metiers/inventories_CDS_figures.py
new file mode 100644
index 0000000..75447e7
--- /dev/null
+++ b/tests_metiers/inventories_CDS_figures.py
@@ -0,0 +1,364 @@
+import cartopy
+import cartopy.crs as ccrs
+import pandas as pd
+import numpy as np
+import datetime
+import logging
+import collections
+from shapely.ops import unary_union
+import cartopy.feature as cfeature
+from matplotlib import pyplot as plt
+from shapely.geometry import Point, Polygon
+
+
+def map_footprints(geometry_request, collected_data_norm, title):
+    """
+
+    :param geometry_request (pd.Series): buoy locations
+    :param collected_data_norm (pd.DataFrame): CDSE OData output containing the SAR footprints
+    :param title (str):
+    :return:
+    """
+    cpt = collections.defaultdict(int)
+    fig = plt.figure(figsize=(15, 12), dpi=200)
+    ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
+    # ax.set_extent([160, 175, -26, -15])
+    if isinstance(geometry_request[0], Point):
+        for uu in geometry_request:
+            plt.plot(*uu.xy, "r*", ms=3)
+    else:
+        for poly in geometry_request:
+            if not isinstance(poly, Polygon):
+                print("warning, not a Polygon:", poly)
+            plt.plot(*poly.exterior.xy, "--", lw=2)
+    ax.add_feature(cfeature.LAND)
+    ax.add_feature(cfeature.COASTLINE)
+    ax.gridlines(draw_labels=True, dms=True, x_inline=False, y_inline=False)
+    cpt_multipolygon = 0
+    for uu in collected_data_norm["geometry"]:
+        if uu.geom_type == "MultiPolygon":
+            cpt_multipolygon += 1
+            try:
+                plt.plot(*unary_union(uu.geoms).exterior.xy)
+            except Exception:
+                # the union of the parts is still a MultiPolygon without .exterior
+                cpt["unary_union_not_sufficient"] += 1
+        elif uu.geom_type == "Polygon":
+            plt.plot(*uu.exterior.xy, "b--", lw=0.7, alpha=0.8)
+        else:
+            print("strange geometry", uu)
+    plt.title(title)
+    plt.show()
+    logging.info("counter: %s", cpt)
+    print(
+        "cpt_multipolygon", cpt_multipolygon, "/", len(collected_data_norm["geometry"])
+    )
+
+
+def histogram_ocean_coverage(collected_data_norm, title):
+    plt.figure(dpi=100)
+    plt.title(title)
+    plt.hist(
+        collected_data_norm["sea_percent"],
+        20,
+        label="CDS results total:%s" % len(collected_data_norm["sea_percent"]),
+        edgecolor="k",
+    )
+    plt.xlabel("% sea in footprint of slices found")
+    plt.ylabel("number of slices found")
+    plt.legend()
+    plt.grid(True)
+    plt.show()
+
+
+def add_time_index_based_onstardtate(collected_data_norm):
+    hm = []
+    for uu in collected_data_norm["Name"].values:
+        if isinstance(uu, str):
+            hm.append(datetime.datetime.strptime(uu.split("_")[5], "%Y%m%dT%H%M%S"))
+        else:
+            hm.append(np.nan)
+    collected_data_norm["startdate"] = hm
+    collected_data_norm = collected_data_norm.set_index("startdate")
+    return collected_data_norm
+
+
+def add_orientation_pass_column(collected_data_norm):
+    all_pass = []
+    for kk in collected_data_norm["Attributes"]:
+        for ll in kk:
+            if ll["Name"] == "orbitDirection":
+                all_pass.append(ll["Value"])
+    collected_data_norm["pass"] = all_pass
+    return collected_data_norm
+
+
+def number_product_per_month(collected_data_norm, title):
+    collected_data_norm = add_time_index_based_onstardtate(collected_data_norm)
+    cummul_grp = None
+    freq = "M"
+    ix = pd.date_range(
+        start=datetime.datetime(2013, 1, 1),
+        end=datetime.datetime(2024, 1, 1),
+        freq=freq,
+    )
+    plt.figure(figsize=(15, 6), dpi=110)
+    for unit in ["S1A", "S1B"]:
+        for pol in ["1SDV", 
"1SSV", "1SSH", "1SDH"]: + subset = collected_data_norm[ + (collected_data_norm["Name"].str.contains(unit)) + & (collected_data_norm["Name"].str.contains(pol)) + ] + grp = subset.groupby(pd.Grouper(freq=freq)).count() + grp = grp.reindex(ix) + # if len(grp['Name'])>0: + if grp["Name"].sum() > 0: + # grp['Name'].plot(marker='.',label='%s %s'%(unit,pol),ms=12,markeredgecolor='k',lw=0.7) + plt.bar( + grp.index, + grp["Name"], + width=30, + label="%s %s : %s products" % (unit, pol, len(subset)), + bottom=cummul_grp, + edgecolor="k", + ) + # cumsum += grp['Name'] + if cummul_grp is None: + cummul_grp = grp["Name"].fillna(0) + else: + # cummul_grp = pd.merge([cummul_grp,grp['Name']]) + # cummul_grp.merge(grp) + cummul_grp += grp["Name"].fillna(0) + plt.legend(fontsize=15, loc=2) + plt.grid(True) + plt.title(title, fontsize=18) + plt.yticks(fontsize=18) + plt.xticks(fontsize=18,rotation=45) + plt.xlim(ix[0],ix[-1]) + plt.ylabel("Number of IW SLC products available\nstacked histogram", fontsize=17) + plt.show() + + +def number_of_product_per_climato_month(collected_data_norm, title): + collected_data_norm = add_time_index_based_onstardtate(collected_data_norm) + plt.figure(figsize=(13, 6), dpi=110) + for unit in ["S1A", "S1B"]: + for pol in ["1SDV", "1SSV", "1SSH", "1SDH"]: + subset = collected_data_norm[ + (collected_data_norm["Name"].str.contains(unit)) + & (collected_data_norm["Name"].str.contains(pol)) + ] + grp = subset.groupby(subset.index.month).count() + # print('grp',grp) + if len(grp["Name"]) > 0: + grp["Name"].plot( + marker=".", + label="%s %s" % (unit, pol), + ms=12, + markeredgecolor="k", + lw=0.7, + ) + plt.legend(fontsize=15,bbox_to_anchor=(1,1)) + plt.grid(True) + plt.title(title, fontsize=18) + plt.yticks(fontsize=18) + plt.xticks( + np.arange(1, 13), + [ + "Jan", + "Fev", + "Mar", + "Apr", + "May", + "Jun", + "Jul", + "Aug", + "Sep", + "Oct", + "Nov", + "Dev", + ], + fontsize=12, + ) + plt.xlabel("month of the year", fontsize=18) + plt.ylabel("Number of 
product", fontsize=18) + plt.show() + + +def number_of_product_per_year_asc_desc(collected_data_norm, title): + collected_data_norm = add_time_index_based_onstardtate(collected_data_norm) + collected_data_norm = add_orientation_pass_column(collected_data_norm) + plt.figure(figsize=(10, 6), dpi=110) + cumsum = None + cummul_grp = None + freq = "AS" + colors = ["brown", "grey", "cyan", "magenta"] + ix = pd.date_range( + start=datetime.datetime(2013, 1, 1), + end=datetime.datetime(2024, 1, 1), + freq=freq, + ) + + cptu = 0 + for unit in ["S1A", "S1B"]: + for passe in ["ASCENDING", "DESCENDING"]: + subset = collected_data_norm[ + (collected_data_norm["Name"].str.contains(unit)) + & (collected_data_norm["pass"].str.contains(passe)) + ] + grp = subset.groupby(pd.Grouper(freq=freq)).count() + grp = grp.reindex(ix) + if cumsum is None: + cumsum = np.zeros(grp["Name"].size) + + if len(grp["Name"]) > 0: + # grp['Name'].plot(marker='.',label='%s %s'%(unit,passe),ms=12,markeredgecolor='k',lw=0.7) + plt.bar( + grp.index, + grp["Name"], + width=300, + label="%s %s: %s products" % (unit, passe, len(subset)), + bottom=cummul_grp, + edgecolor="k", + fc=colors[cptu], + ) + else: + print("riend ans ce group") + cumsum += grp["Name"] + if cummul_grp is None: + cummul_grp = grp["Name"].fillna(0) + else: + # cummul_grp = pd.merge([cummul_grp,grp['Name']]) + # cummul_grp.merge(grp) + cummul_grp += grp["Name"].fillna(0) + # print('cummul_grp',cummul_grp) + cptu += 1 + plt.legend(fontsize=12, bbox_to_anchor=(1,1)) + plt.grid(True) + plt.title(title, fontsize=18) + plt.yticks(fontsize=12) + plt.xticks(fontsize=12) + plt.ylabel("Number of IW SLC products available\nstacked histogram", fontsize=15) + plt.show() + + +def add_volumetry_column(collected_data_norm): + """ + for IW it i s True + :param collected_data_norm: + :return: + """ + vols = [] + for kk in collected_data_norm["Name"]: + if "EW" in kk or "WV" in kk: + raise Exception("mode no configured") + if "1SDV" in kk or "1SDH" in kk: + 
vols.append(7.8/1000.) + else: + vols.append(3.8/1000.) + collected_data_norm["volume"] = vols + return collected_data_norm + + +def volume_per_year(collected_data_norm, title,freq = "AS"): + """ + + :param collected_data_norm: + :param title: + :param freq: AS is for yearly grouping with anchor at the start of the year + :return: + """ + collected_data_norm = add_volumetry_column(collected_data_norm) + collected_data_norm = add_time_index_based_onstardtate(collected_data_norm) + plt.figure(figsize=(10, 6), dpi=110) + cummul_grp = None + # not Y because anchored date is offset to year+1 + # freq = "M" # for a test + if freq == "AS": + width = 300 + elif freq == "M": + width = 30 + + ix = pd.date_range( + start=datetime.datetime(2013, 1, 1), + end=datetime.datetime(2024, 1, 1), + freq=freq, + ) + cptu = 0 + for unit in ["S1A", "S1B"]: + for pol in ["1SDV", "1SSV", "1SSH", "1SDH"]: + subset = collected_data_norm[ + (collected_data_norm["Name"].str.contains(unit)) + & (collected_data_norm["Name"].str.contains(pol)) + ] + subset = subset["volume"] + grp = subset.groupby(pd.Grouper(freq=freq)).sum() + grp = grp.reindex(ix) + if grp.sum() > 0: + plt.bar( + grp.index, + grp, + width=width, + label="%s %s: %s products volume: %1.1f To" + % (unit, pol, len(subset), subset.sum()), + bottom=cummul_grp, + edgecolor="k", + ) + else: + print("riend ans ce group") + if cummul_grp is None: + cummul_grp = grp.fillna(0) + else: + cummul_grp += grp.fillna(0) + cptu += 1 + plt.legend(fontsize=10, loc=2) + plt.grid(True) + plt.title(title, fontsize=18) + plt.yticks(fontsize=12) + plt.xticks(fontsize=12) + plt.ylabel( + "Volume of IW SLC products available [GigaOctet]\nstacked histogram", + fontsize=15, + ) + plt.show() + + +def volume_wrt_sea_percent(collected_data_norm, title): + collected_data_norm = add_volumetry_column(collected_data_norm) + delta = 10 + sea_perc = np.arange(0, 100, 10) + + plt.figure(dpi=120) + cummul_grp = np.zeros(len(sea_perc)) + for pol in ["1SDV", "1SSV", 
"1SSH", "1SDH"]: + total_volumes = [] + subset = collected_data_norm[(collected_data_norm["Name"].str.contains(pol))] + for seap in sea_perc: + subset2 = subset[subset["sea_percent"] >= seap] + total_volumes.append(subset2["volume"].sum()) + + total_volumes = np.array(total_volumes) + if (total_volumes > 0).any(): + plt.bar( + sea_perc, + total_volumes, + width=delta, + edgecolor="k", + align="edge", + label=pol, + bottom=cummul_grp, + ) + cummul_grp += total_volumes + + plt.title(title) + + plt.xlabel("minimum % of ocean in the footprint") + plt.ylabel("total volume of the S-1 product considered [To]") + plt.grid(True) + plt.legend() + plt.show() diff --git a/tests_metiers/test_download_WV_multithread_multiuser.py b/tests_metiers/test_download_WV_multithread_multiuser.py index 25f2afc..4453578 100644 --- a/tests_metiers/test_download_WV_multithread_multiuser.py +++ b/tests_metiers/test_download_WV_multithread_multiuser.py @@ -27,6 +27,14 @@ parser = argparse.ArgumentParser(description="highleveltest-fetch_OCN_WV_IDs") parser.add_argument("--verbose", action="store_true", default=False) + parser.add_argument("--forcedownload", action="store_true", default=False, + help='True -> no test of existence of the products in spool and archive directories.') + parser.add_argument( + "--logingroup", + help="name of the group of CDSE account in the localconfig.yml [default=logins]", + default="logins", + required=False, + ) parser.add_argument( "--listing", default=default_listing, @@ -52,8 +60,9 @@ assert os.path.exists(listing) # listing = './example_WV_OCN_listing.txt' # outputdir = conf["test_default_output_directory"] - logins_group = 'loginsbackfill' - logging.info('logins_group : %s',len(conf[logins_group])) + # logins_group = 'loginsbackfill' + logins_group = args.logingroup + logging.info("logins_group : %s", len(conf[logins_group])) outputdir = args.outputdir inputdf = pd.read_csv(listing, names=["id", "safename"], delimiter=",") if not os.path.exists(outputdir): @@ 
-64,6 +73,7 @@ list_safename=inputdf["safename"].values, outputdir=outputdir, hideProgressBar=False, - account_group=logins_group + account_group=logins_group, + check_on_disk=args.forcedownload==False, ) logging.info("end of function")
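For context on what this diff changes: the skip-or-download decision toggled by `check_on_disk` (or `--forcedownload` on the CLI side) can be sketched as a standalone mock. The names `filter_already_present` and `on_disk` below are illustrative only, not part of the library; the real `filter_product_already_present` consults `test_safe_archive` and `test_safe_spool` and works on a pandas DataFrame.

```python
from collections import defaultdict


def filter_already_present(safenames, on_disk, force_download=False):
    """Return (products_to_download, counters).

    on_disk: set of SAFE names already in the spool/archive directories
             (stand-in for test_safe_archive/test_safe_spool).
    force_download: mirrors check_on_disk=False, i.e. --forcedownload.
    """
    cpt = defaultdict(int)
    cpt["products_in_initial_listing"] = len(safenames)
    to_download = []
    for safe in safenames:
        if safe in on_disk and not force_download:
            # a local copy exists and no forced re-download was requested
            cpt["product_already_on_disk"] += 1
            continue
        if safe not in on_disk:
            cpt["product_absent_from_local_disks"] += 1
        to_download.append(safe)
    return to_download, cpt


products = ["S1A_WV_OCN__2SSV_a.SAFE", "S1B_WV_OCN__2SSV_b.SAFE"]
present = {"S1A_WV_OCN__2SSV_a.SAFE"}
todo, cpt = filter_already_present(products, present)
todo_all, _ = filter_already_present(products, present, force_download=True)
print(todo)      # ['S1B_WV_OCN__2SSV_b.SAFE']
print(todo_all)  # both products, as with --forcedownload
```

Note the design choice this mirrors: `force_download` only short-circuits the on-disk test, so the counters still describe the initial listing and the download loop itself is unchanged.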