Implementation of UTU comments to EO guide #1629

Merged: 8 commits, Apr 19, 2023 (showing changes from 4 commits)
46 changes: 24 additions & 22 deletions docs/support/tutorials/gis/eo_guide.md
@@ -1,6 +1,6 @@
# Earth Observation guide

This guide aims to help researchers to work with Earth Observation (EO) data using CSC's computing resources. The purpose of this guide is to give an overview of available options, so it would be easier to decide if CSC has suitable services for your EO reseach. It also helps you find the right data and tools for raster data based EO tasks. This guide focuses on spaceborne platforms. However, many tools and concepts also apply to airborne platforms. If you are interested in the fundamentals of EO, please check the [resources and further reading section](#resources-and-further-reading).
This guide aims to help researchers to work with Earth Observation (EO) data using CSC's computing resources. The purpose of this guide is to give an overview of available options, so it would be easier to decide if CSC has suitable services for your EO research. It also helps you find the right data and tools for raster data based EO tasks. This guide focuses on spaceborne platforms. However, many tools and concepts also apply to airborne platforms. If you are interested in the fundamentals of EO, please check the [resources and further reading section](#resources-and-further-reading).

**What are the benefits of using EO data?**

@@ -18,11 +18,11 @@ This guide aims to help researchers to work with Earth Observation (EO) data usi

For working with EO data in general, there are three main options:

1) **EO specific services**, which provide both data and advanced ready-to-use processing environments. Usually these give better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding own data. Often these have fees for using.
1) **EO specific services**, which provide both data and advanced ready-to-use processing environments. Usually these give better user experience and efficiency, but the services might be limited in computing power, available tools and options for adding own data. Often these have fees for using. Examples are [Google Earth Engine](https://earthengine.google.com/) and [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com).

2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as independent service. They also provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing. The data download may be free of charge or have a small cost, depending on the amount of data needed.
2) **Cloud services** with access to EO data. Practically, the data is often stored in object-storage and can be accessed as independent service. They also provide general computing services, such as virtual machines, to which EO tools need to be installed by the end-user. These options usually have some fees, mainly for processing. The data download may be free of charge or have a small cost, depending on the amount of data needed. Examples are [Data and Information Access Services (DIAS)](https://www.copernicus.eu/en/access-data/dias) and [Amazon Web Services](https://registry.opendata.aws/).

3) **Own computing environment** - PC, local cluster. Data needs to be downloaded and all tools must be installed to this system. On the other hand, it gives more freedom to select the tools and set-up. Usually this does not cause any extra costs, but the computing power is usually rather limited.
3) **Own computing environment** - PC, local cluster, virtual machines. Data needs to be downloaded and all tools must be installed to this system. On the other hand, it gives more freedom to select the tools and set-up. Usually this does not cause any extra costs, but the computing power is usually rather limited.

CSC services do not fit well in this categorization, as they provide some features from all of these. **CSC computing services provide a lot of computing power and storage space, and they are free of charge** for Finnish researchers for academic or educational use.

@@ -32,17 +32,17 @@ Puhti has also a lot of [pre-installed applications](#what-applications-are-avai

At CSC, [some Finnish EO datasets](#eo-data-at-csc) are available for direct use. In many cases, however, downloading EO data from other services (see [list of EO data download services](#eo-data-download-services)) is a required step of the process. Puhti and cPouta provide local storage of ~1-20 Tb. For more storage space, [Allas object storage](../../../data/Allas/index.md) can be used.

Using CSC computing services requires basic Linux skills and ability to use some scripting language or command-line tools. In addition, supercomputers and virtual machines require you to understand some specific concepts, so it takes a few hours to get started. The new Puhti web interface makes the start considerably easier, providing a desktop environment in the web browser, which enables the use of tools with Graphical User Interfaces (GUI) and also tools like R Studio and JupyterLab for an easy start with R, Python and Julia.
Using CSC computing services requires basic Linux skills and ability to use some scripting language (for example Python, R, Julia) or command-line tools. In addition, supercomputers and virtual machines require you to understand some specific concepts, so it takes a few hours to get started. The [Puhti web interface](https://www.puhti.csc.fi/) makes the start considerably easier, providing a desktop environment in the web browser, which enables the use of tools with Graphical User Interfaces (GUI) and also tools like R Studio and JupyterLab for an easy start with R, Python and Julia.

## What data do I need?

What to consider when chosing data:
When starting a task that requires EO data, there are multiple factors to consider. The decision on what are the most important factors depends heavily on the task and the resources available. The following list summarizes what one needs to consider when defining the data needs:

* Sensor
* Multispectal
* Hyperspectral
* RADAR (Radio Detection and Ranging), SAR (Synthetic Aperture Radar)
* LiDAR (Light Detection and Ranging)
* Sensor: Different sensors cover different intervals of the electromagnetic spectrum and with that show different properties of the observed areas, they can be active or passive:
* Multispectral: multiple intervals around the visible spectrum of the EM are observed at the same time
* Hyperspectral: more but usually shorter intervals of the EM are observed at the same time
* RADAR (Radio Detection and Ranging), SAR (Synthetic Aperture Radar), active sensing in the microwave/radio frequencies of the EM spectrum
* LiDAR (Light Detection and Ranging), using a laser as energy source in the optical part of the EM spectrum
* Note that depending on the wavelengths observed, clouds, ground conditions and atmospheric artifacts may result in data gaps
* Resolution
* Temporal: when and how often a certain area is revisited
@@ -93,7 +93,7 @@ What to consider when chosing data:

!!! default "EO database"

Database of all EO missions and instrument information can be found in the [CEOS EO handbook database](http://database.eohandbook.com/database/instrumenttable.aspx)
Database of all EO missions and instrument information can be found in the [CEOS EO handbook database](http://database.eohandbook.com/database/instrumenttable.aspx). See also [EOReader band mapping graphics](https://eoreader.readthedocs.io/en/latest/optical_band_mapping.html) for an overview of observed wavelength intervals for different optical sensors.


## Where can I find the data?
@@ -116,7 +116,7 @@ Some Finnish EO datasets are available locally at CSC. A STAC catalog for all sp

**[SYKE/FMI, Finnish image mosaics](https://www.syke.fi/fi-FI/Tutkimus__kehittaminen/Tutkimus_ja_kehittamishankkeet/Hankkeet/Paikkatietoalusta_PTA)** : Sentinel-1, Sentinel-2 and Landsat mosaics, for several time periods per year. Some of them are available in Puhti, but not all. [FMI provides also a STAC catalog for these mosaics](https://pta.data.lit.fmi.fi/stac/root.json)
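
A STAC catalog is a JSON document whose `links` array points to its child catalogs and collections. The following minimal Python sketch shows how such child entries could be listed once the catalog JSON has been loaded. It assumes the catalog follows the standard STAC layout (`rel: "child"` links with an `href` and an optional `title`). The `child_links` helper and the `sample_catalog` contents are hypothetical and made up for illustration; they do not reflect the actual content of the FMI catalog.

```python
from urllib.parse import urljoin


def child_links(catalog: dict, catalog_url: str) -> list[tuple[str, str]]:
    """Return (title, absolute URL) pairs for the child entries of a STAC catalog."""
    children = []
    for link in catalog.get("links", []):
        if link.get("rel") == "child":
            # The title is optional in STAC links, so fall back to the href
            title = link.get("title", link["href"])
            # hrefs may be relative to the catalog's own URL
            children.append((title, urljoin(catalog_url, link["href"])))
    return children


# Hypothetical, hand-written catalog used only for illustration
sample_catalog = {
    "type": "Catalog",
    "id": "example-root",
    "stac_version": "1.0.0",
    "description": "Made-up catalog for this example",
    "links": [
        {"rel": "self", "href": "https://example.org/stac/root.json"},
        {"rel": "child", "href": "./sentinel2/collection.json", "title": "Sentinel-2 mosaics"},
        {"rel": "child", "href": "./landsat/collection.json"},
    ],
}

for title, url in child_links(sample_catalog, "https://example.org/stac/root.json"):
    print(title, url)
```

For real work, a dedicated STAC client library is likely more convenient than parsing the JSON by hand; this sketch only illustrates the catalog structure.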

[**European Space Agency's SciHub**](https://scihub.copernicus.eu/dhus/#/home) provides worldwide main products for Sentinel-1, -2 and -3. It requires free registration. Big part of the data is in the "Long term archive" and cannot be downloaded directly, but needs to be requested first. Download is limited to 2 concurrent processes per user. Please note that between end of January and July 2023 this system is being updated to become the [Copernicus Space Ecosystem](https://medium.com/sentinel-hub/new-copernicus-data-access-service-to-support-the-ecosystem-for-earth-observation-412f829355a3). SciHub will continue its full operations until the end of June 2023.
[**European Space Agency's SciHub**](https://scihub.copernicus.eu/dhus/#/home) provides worldwide main products for Sentinel-1, -2 and -3. It requires free registration. Big part of the data is in the "Long term archive" and cannot be downloaded directly, but needs to be requested first (some tools can do that automatically, please check the documentation for the tool of your choice). Download is limited to 2 concurrent processes per user. Please note that between end of January and July 2023 this system is being updated to become the [Copernicus Space Ecosystem](https://medium.com/sentinel-hub/new-copernicus-data-access-service-to-support-the-ecosystem-for-earth-observation-412f829355a3). SciHub will continue its full operations until the end of June 2023.
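
When scripting downloads, the limit of 2 concurrent processes per user can be respected by capping the number of workers. The following Python sketch illustrates the idea using only the standard library. The `fake_download` function is a hypothetical placeholder that only sleeps briefly; it is not the API of SciHub or of any download tool, and the product identifiers are made up.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_DOWNLOADS = 2  # limit per user stated in the text above

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous downloads observed


def fake_download(product_id: str) -> str:
    """Hypothetical placeholder for a real download call; it only sleeps."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)  # stands in for the actual transfer time
    with _lock:
        _active -= 1
    return f"{product_id}.zip"


def download_all(product_ids: list[str]) -> list[str]:
    """Download all products, never running more than the allowed number at once."""
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_DOWNLOADS) as pool:
        # pool.map returns results in the same order as the input
        return list(pool.map(fake_download, product_ids))


# Made-up product identifiers, for illustration only
product_ids = ["product_a", "product_b", "product_c", "product_d", "product_e"]
files = download_all(product_ids)
print(files)
print("Peak concurrent downloads:", peak_active)
```

Replacing `fake_download` with the download call of the tool of your choice keeps the same concurrency cap.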

[**Copernicus Data Space Ecosystem**](https://dataspace.copernicus.eu/) provides access to all Sentinel data with new features for visualisation and data processing. Please stay tuned to the news for latest information on the services available and the [Copernicus Data Space Ecosystem roadmap](https://documentation.dataspace.copernicus.eu/#/Roadmap) for the full release of all functionalities.

@@ -147,17 +147,19 @@ You can find information about geocomputing using CSC resources and how to get s

### What to consider when choosing a software?

* Functionality
* Interaction type:
There is no single software perfect for every task and taste. The right software depends as much on the task to be worked on, as on the taste and skillset of the user. The following list summarizes things that need to be considered when choosing a software.

* Functionality: Does the software provide the tools you need to reach your goal?
* Interaction type: How do you want to interact with the software?
* Graphical User Interface (GUI)
* Command Line Interface (CLI)
* Scripting
* Scripting
* Technical aspects:
* Reproducibility
* Supported operating systems
* Automation possibility
* Combination possibility
* Computational efficiency
* Reproducibility: Does the tool provide the possibility to record work steps?
* Supported operating systems: Can the tool be installed to the operating system available to you?
* Automation possibility: Can the tool execution be automatized for big data processing, if needed?
* Combination possibility: Can you combine the tool with other tools?
* Computational efficiency: Does the tool make good use of the available computational resources (especially GPUs)?
* Support for parallel computing or batch processing
* Open source vs proprietary
* Proprietary tools need licenses which may be expensive and/or limiting the use of the tool
@@ -182,7 +184,7 @@ You can find information about geocomputing using CSC resources and how to get s

[**QGIS**](../../../apps/qgis.md) - open source tool with GUI for working with spatial data including limited multispectral image processing capabilities. GUI with batch processing possibility and Python interface. Used for example for visualization, map algebra and other raster processing. Many plug-ins available, for EO data processing, check out the [QGIS Semi-automatic classification plugin](https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html).

[**R**](../../../apps/r-env-for-gis.md) - Puhti R installation includes a lot of geospatial packages, includeing serveral useful for EO data processing, such as `terra`, `CAST`, `raster` and `spacetime`, also `rstac` for working with STAC catalogs.
[**R**](../../../apps/r-env-for-gis.md) - Puhti R installation includes a lot of geospatial packages, including several useful for EO data processing, such as `terra`, `CAST`, `raster` and `spacetime`, also `rstac` for working with STAC catalogs.

[**Sen2Cor**](../../../apps/sen2cor.md) - a command-line tool for Sentinel-2 Level 2A product generation and formatting.
