From e102d82d5f9e434a5ddce1a947dd76ad744cc367 Mon Sep 17 00:00:00 2001 From: Rasmus Kronberg <43936697+rkronberg@users.noreply.github.com> Date: Wed, 8 May 2024 15:43:45 +0300 Subject: [PATCH] Course development: Topic 8 - Managing large datasets and I/O (#320) * Course development: Topic 8 - Managing large datasets and I/O * add new section * updated slides and exercises * Create fast_localdisks.md * Update index.md * Rename fast_localdisks.md to fast-local-disks.md * Delete part-2/datamigration directory * slides * slides * slides * reorder * small edits * fix nav * fix nav * Course development: Topic 8 - Managing large datasets and I/O * add new section * updated slides and exercises * Create fast_localdisks.md * Update index.md * Rename fast_localdisks.md to fast-local-disks.md * slides * slides * reorder * small edits * fix nav * fix nav * move files * ood * formatting * permalinks * ood note * ood note --------- Co-authored-by: Laxmana Yetukuri Co-authored-by: Laxman <48151266+yetulaxman@users.noreply.github.com> Co-authored-by: Laxmana Yetukuri --- _slides/08_datamigration_io.md | 205 +++ _slides/img/lustre1.svg | 1055 ++++++++++++ _slides/img/lustre2.svg | 1413 +++++++++++++++++ part-1/allas/allas-bio-data.md | 5 +- part-1/allas/index.md | 3 +- part-1/disk-areas/index.md | 1 - part-2/containers/apptainer-tutorial-part1.md | 2 +- part-2/containers/apptainer-tutorial-part2.md | 2 +- part-2/containers/creating-containers.md | 2 +- part-2/containers/getting-containers.md | 2 +- part-2/containers/index.md | 12 +- part-2/containers/replicating-conda.md | 2 +- part-2/containers/running-installed.md | 2 +- part-2/index.md | 2 +- part-2/installing/binary.md | 2 +- part-2/installing/cmake.md | 6 +- part-2/installing/compilers.md | 4 +- part-2/installing/hpc.md | 2 +- part-2/installing/index.md | 12 +- part-2/installing/java.md | 2 +- part-2/installing/mcl.md | 2 +- part-2/installing/perl.md | 2 +- part-2/installing/python.md | 2 +- part-2/installing/r.md | 2 +- 
.../allas => part-2/io}/allas-batch-jobs.md | 10 +- .../io/fast-local-disks.md | 10 +- part-2/io/index.md | 18 + part-2/io/local_rclone.md | 137 ++ part-2/workflows/gaussian-hq.md | 2 +- part-2/workflows/hyperqueue.md | 2 +- part-2/workflows/index.md | 10 +- part-2/workflows/scaling.md | 2 +- 32 files changed, 2879 insertions(+), 56 deletions(-) create mode 100644 _slides/08_datamigration_io.md create mode 100644 _slides/img/lustre1.svg create mode 100644 _slides/img/lustre2.svg rename {part-1/allas => part-2/io}/allas-batch-jobs.md (96%) rename part-1/disk-areas/exercise-fastdisks.md => part-2/io/fast-local-disks.md (92%) create mode 100644 part-2/io/index.md create mode 100644 part-2/io/local_rclone.md diff --git a/_slides/08_datamigration_io.md b/_slides/08_datamigration_io.md new file mode 100644 index 00000000..4b1d1bf7 --- /dev/null +++ b/_slides/08_datamigration_io.md @@ -0,0 +1,205 @@ +--- +theme: csc-eurocc-2019 +lang: en +--- + +# Working efficiently with data {.title} + +
+![](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png) +
+
+ +All materials (c) 2020-2024 by CSC – IT Center for Science Ltd. +This work is licensed under a **Creative Commons Attribution-ShareAlike** 4.0 +Unported License, [http://creativecommons.org/licenses/by-sa/4.0/](http://creativecommons.org/licenses/by-sa/4.0/) + +
+ +# Outline + +- Efficient file I/O in HPC systems +- Using Allas in batch scripts +- Moving data to/from Allas, IDA and LUMI-O +- Transferring data in sensitive data computing +- Cleaning and backing up data +- Working with remote mounts + +# Parallel file systems + +- A parallel file system (PFS) provides a common file system area that can be accessed from all nodes in a cluster +- Without a PFS, users would always have to copy all needed data to the compute nodes before runs (cf. local disk) + - Also, the results would not be visible outside the compute node +- CSC uses the **Lustre** parallel file system on Puhti and Mahti + +# Lustre + +
+![](img/lustre1.svg){width=100%} +
+
+- One or more metadata servers (MDS) with metadata targets (MDT) that store the file system metadata +- One or more object storage servers (OSS) with object storage targets (OST) that store the actual file system contents +- Connection to nodes via the high-speed interconnect (InfiniBand) +
+ +# What happens when you access a file? + +
+![](img/lustre2.svg){width=100%} +
+
+1. Send metadata request +2. Response with metadata +3. Request data +4. Data response +
+ +# Managing file I/O (1/3) + +- Parallel file system (Lustre): + - Shared across all nodes in the cluster (e.g., `/scratch`) + - Optimized for parallel I/O of large files, slow if accessing lots of small files! +- [Temporary local storage (NVMe)](https://docs.csc.fi/computing/disk/#temporary-local-disk-areas): + - Accessible on login nodes (`$TMPDIR`) and to jobs on some compute nodes (`$LOCAL_SCRATCH`) + - Automatically purged after the job finishes + - Availability varies depending on the supercomputer (Puhti/Mahti/LUMI) + - For example, Mahti has NVMe only on login nodes and GPU nodes + +# Managing file I/O (2/3) + +- Things to avoid on Lustre: + - Accessing lots of small files, opening/closing a single file at a rapid pace + - Having many files in a single directory +- Use [file striping](https://docs.csc.fi/computing/lustre/#file-striping-and-alignment) to distribute large files across many OSTs +- Use more efficient file formats when possible + - Simply using `tar` and compression is a good start + - High-level I/O libraries and portable file formats like HDF5 or NetCDF + - Enable fast I/O through a single file format and parallel operations + - [AI/ML example: TensorFlow's TFRecords](https://github.com/CSCfi/machine-learning-scripts/blob/master/notebooks/tf2-pets-create-tfrecords.ipynb) – a simple record-oriented binary format +- Docs CSC: [How to achieve better I/O performance on Lustre](https://docs.csc.fi/support/tutorials/lustre_performance/) + +# Managing file I/O (3/3) + +- Use the fast local disk to handle file I/O with lots of small files + - Requires staging and unstaging of data + - `tar xf /scratch//big_dataset.tar.gz -C $LOCAL_SCRATCH` +- Processing data in memory gives better performance than writing to and reading from disk + - "Ramdisk" (`/dev/shm`) can be used on Mahti nodes without NVMe + - `export TMPDIR=/dev/shm` +- Do not use databases on `/scratch` + - Instead, consider hosting DBs on cloud resources (e.g., [Pukki DBaaS](https://docs.csc.fi/cloud/dbaas/))
+ +# Using Allas in batch jobs + +- Swift protocol (all projects, 8-hour token) *vs.* S3 protocol (fixed for a project, persistent) +- `allas-conf` requires entering your CSC password interactively + - Jobs may start late, and the actual job may take longer than 8 hours +- Use `allas-conf -k` + - stores the password in the variable `$OS_PASSWORD` so that a new token can be generated automatically + - a-tools regenerate a token using `$OS_PASSWORD` automatically + - `rclone` requires explicitly setting the environment variable in batch jobs: + ```bash + source /appl/opt/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME + ``` + +# Configuring Allas for S3 protocol + +- Open the Allas connection in S3 mode: + - `source allas_conf --mode s3cmd` +- Connection is persistent +- Usage: + - `s3cmd` with endpoint `s3:` + - `rclone` with endpoint `s3allas:` + - `a-put`/`a-get` with the `-S` flag + +# How to use LUMI-O from Puhti/Mahti + +- LUMI-O is very similar to Allas, but it uses only the S3 protocol +- On Puhti and Mahti, a connection to LUMI-O can be opened with the command: + - `allas-conf --lumi` +- Usage: + - Using LUMI-O with `rclone` (endpoint is `lumi-o:`) + - e.g., `rclone lsd lumi-o:` + - One can use a-tools with the option `--lumi` + - e.g., `a-list --lumi` +- Docs CSC: [Using Allas and LUMI-O from LUMI](https://docs.csc.fi/data/Allas/allas_lumi/) + + +# Moving data between LUMI-O and Allas + +- Requires activating connections to both LUMI-O and Allas at the same time: + - `allas-conf --mode s3cmd` + - `allas-conf --lumi` +- Use `rclone` with `s3allas:` as the endpoint for Allas and `lumi-o:` for LUMI-O + - `rclone copy -P lumi-o:lumi-bucket/object s3allas:allas-bucket/` + +# Moving data between IDA and Allas + +- Requires transferring the data *via* a supercomputer (e.g., Puhti) +- First, [configure IDA in CSC supercomputers](https://docs.csc.fi/data/ida/using_ida/). 
For example: + +```bash +module load ida +ida_configure +ida upload /test123/data1 test_data +ida download /project1 project1_data.zip +``` + +- Then, move the data between Puhti and Allas + +# Transferring data for sensitive data computing + +- CSC's sensitive data services, SD Connect and SD Desktop, use service-specific encryption +- SD Desktop is able to read encrypted data from Allas + - If you want to make your data available in SD Desktop, you need to encrypt the data with the *CSC public key* before it is uploaded to Allas + - Use `a-put` with the option `--sdx` or the command `a-encrypt` to make your Allas data compatible with SD Desktop + - An upcoming version of SD Connect will change this, but the new service will also be compatible with previously uploaded data + +# Questions that users should consider + +- Should I store each file as a separate object, or should I collect them into bigger chunks? + - In general: consider how you use the data +- Should I use compression? +- Who can use the data: projects and access rights? +- What will happen to my data later on? +- How do I keep track of all the data I have in Allas? + +# Cleaning and backing up data (1/3) + +- **[Disk cleaning](https://docs.csc.fi/support/tutorials/clean-up-data/#automatic-removal-of-files)** + - In force for project disk areas under `/scratch` **on Puhti** + - Files older than 180 days will be removed periodically + - Listed in a purge list, e.g. 
`/scratch/purge_lists/project_2001234/path_summary.txt` + - The *[LCleaner](https://docs.csc.fi/support/tutorials/clean-up-data/#using-lcleaner-to-check-which-files-will-be-automatically-removed)* tool can help you discover which of your files have been targeted for automatic removal +- **Best practice tips** + - Don't save everything automatically + - Use the *[LUE](https://docs.csc.fi/support/tutorials/lue/)* tool to analyze your disk usage + - Avoid `du` and `find -size`; these commands are heavy on the file system + - Move important data not in current use to Allas + +# Cleaning and backing up data (2/3) + +- The [`allas-backup`](https://docs.csc.fi/data/Allas/using_allas/a_backup/) command provides an easy-to-use command-line interface for the `restic` backup tool +- Backing up differs from normal storing: + - Incremental (efficient) and version-controlled (no overriding) + - Based on hashes and requires more computing + - An efficient way to store different versions of a dataset + +# Cleaning and backing up data (3/3) + +- Please note that Allas is intended for storing *active data* +- Project lifetime is usually 1-5 years +- Commands for backing up data: + - `allas-backup --help` + - `allas-backup [add] file-or-directory` + - `allas-backup list ` + - `allas-backup restore snapshot-id` + +# Working with remote disk mounts + +- Using the `sshfs` command on Linux/macOS: + - `mkdir csc_home` + - `sshfs @puhti.csc.fi:/users/ csc_home` +- To unmount the file system, give the command: + - `fusermount -u csc_home` diff --git a/_slides/img/lustre1.svg b/_slides/img/lustre1.svg new file mode 100644 index 00000000..c90950e6 --- /dev/null +++ b/_slides/img/lustre1.svg @@ -0,0 +1,1055 @@ + + + +image/svg+xml +OSS 0MDS 0Compute nodeInfinibandMDT 1MDT 0OST 0OST 1OST 2 diff --git a/_slides/img/lustre2.svg b/_slides/img/lustre2.svg new file mode 100644 index 00000000..f2994201 --- /dev/null +++ b/_slides/img/lustre2.svg @@ -0,0 +1,1413 @@ + + + +image/svg+xml +OSS 0MDS 0Compute 
nodeInfiniband1234MDT 1MDT 0OST 0OST 1OST 2 diff --git a/part-1/allas/allas-bio-data.md b/part-1/allas/allas-bio-data.md index ed2cef29..43bab719 100644 --- a/part-1/allas/allas-bio-data.md +++ b/part-1/allas/allas-bio-data.md @@ -3,16 +3,15 @@ layout: default title: Using Allas with bio data parent: 7. Allas grand_parent: Part 1 -nav_order: 4 +nav_order: 3 has_children: false has_toc: false permalink: /hands-on/allas/allas-tutorial.html --- - # Using Allas in CSC's HPC environment - Before the actual exercise, open a view to the Allas service in your browser using the Puhti web interface. +Before the actual exercise, open a view to the Allas service in your browser using the Puhti web interface. 1. Go to and login with your account. 2. Configure an Allas S3 connection using the _Cloud storage configuration_ tool. diff --git a/part-1/allas/index.md b/part-1/allas/index.md index 00d46f75..676c73b8 100644 --- a/part-1/allas/index.md +++ b/part-1/allas/index.md @@ -19,5 +19,4 @@ has_toc: false 1. [Essential tutorial - Basic usage of Allas]({{ site.baseurl }}{% link part-1/allas/allas-usage.md %}) 2. [Tutorial - File backup with Allas]({{ site.baseurl }}{% link part-1/allas/allas-backup.md %}) -3. [Tutorial - Allas in batch jobs]({{ site.baseurl }}{% link part-1/allas/allas-batch-jobs.md %}) -4. [Advanced tutorial - Using Allas with bio data]({{ site.baseurl }}{% link part-1/allas/allas-bio-data.md %}) +3. [Advanced tutorial - Using Allas with bio data]({{ site.baseurl }}{% link part-1/allas/allas-bio-data.md %}) \ No newline at end of file diff --git a/part-1/disk-areas/index.md b/part-1/disk-areas/index.md index 02571048..ee3abe0a 100644 --- a/part-1/disk-areas/index.md +++ b/part-1/disk-areas/index.md @@ -18,4 +18,3 @@ has_toc: false 1. [Essential tutorial - Main disk areas in CSC's computing environment]({{ site.baseurl }}{% link part-1/disk-areas/maindisks.md %}) 2. 
[Essential tutorial - Finding out where you have a lot of data]({{ site.baseurl }}{% link part-1/disk-areas/lue.md %}) 3. [Tutorial - Fast disk areas in CSC's computing environment]({{ site.baseurl }}{% link part-1/disk-areas/tutorial-fastdisks.md %}) -4. [Advanced exercise - I/O intensive computing tasks]({{ site.baseurl }}{% link part-1/disk-areas/exercise-fastdisks.md %}) diff --git a/part-2/containers/apptainer-tutorial-part1.md b/part-2/containers/apptainer-tutorial-part1.md index c52aaf11..75c00554 100644 --- a/part-2/containers/apptainer-tutorial-part1.md +++ b/part-2/containers/apptainer-tutorial-part1.md @@ -1,7 +1,7 @@ --- layout: default title: Apptainer tutorial 1 -parent: 9. Containers and Apptainer +parent: 10. Containers and Apptainer grand_parent: Part 2 nav_order: 1 has_children: false diff --git a/part-2/containers/apptainer-tutorial-part2.md b/part-2/containers/apptainer-tutorial-part2.md index 1d34f1ea..7c5b3668 100644 --- a/part-2/containers/apptainer-tutorial-part2.md +++ b/part-2/containers/apptainer-tutorial-part2.md @@ -1,7 +1,7 @@ --- layout: default title: Apptainer tutorial 2 -parent: 9. Containers and Apptainer +parent: 10. Containers and Apptainer grand_parent: Part 2 nav_order: 3 has_children: false diff --git a/part-2/containers/creating-containers.md b/part-2/containers/creating-containers.md index 4fce2aa3..19ead23c 100644 --- a/part-2/containers/creating-containers.md +++ b/part-2/containers/creating-containers.md @@ -1,7 +1,7 @@ --- layout: default title: Creating Apptainer containers -parent: 9. Containers and Apptainer +parent: 10. Containers and Apptainer grand_parent: Part 2 nav_order: 6 has_children: false diff --git a/part-2/containers/getting-containers.md b/part-2/containers/getting-containers.md index 3a2b690c..9c029b42 100644 --- a/part-2/containers/getting-containers.md +++ b/part-2/containers/getting-containers.md @@ -1,7 +1,7 @@ --- layout: default title: How to get containers -parent: 9. 
Containers and Apptainer +parent: 10. Containers and Apptainer grand_parent: Part 2 nav_order: 4 has_children: false diff --git a/part-2/containers/index.md b/part-2/containers/index.md index 5118cfd7..eb8c59e1 100644 --- a/part-2/containers/index.md +++ b/part-2/containers/index.md @@ -1,19 +1,19 @@ --- layout: default -title: 9. Containers and Apptainer +title: 10. Containers and Apptainer parent: Part 2 -nav_order: 2 +nav_order: 3 has_children: true has_toc: false --- -# 9. Containers and Apptainer +# 10. Containers and Apptainer -## [9.1 Slides](https://a3s.fi/CSC_training/09_singularity.html) +## [10.1 Slides](https://a3s.fi/CSC_training/09_singularity.html) -## [9.2 Video: Containers](https://video.csc.fi/media/t/0_0ws9ei53) +## [10.2 Video: Containers](https://video.csc.fi/media/t/0_0ws9ei53) -## 9.3 Tutorials and exercises +## 10.3 Tutorials and exercises 1. [Essential tutorial - Introduction to Apptainer]({{ site.baseurl }}{% link part-2/containers/apptainer-tutorial-part1.md %}) 2. [Tutorial - Apptainer introduction continued]({{ site.baseurl }}{% link part-2/containers/apptainer-tutorial-part2.md %}) diff --git a/part-2/containers/replicating-conda.md b/part-2/containers/replicating-conda.md index 2b13d95d..4bf4b423 100644 --- a/part-2/containers/replicating-conda.md +++ b/part-2/containers/replicating-conda.md @@ -1,7 +1,7 @@ --- layout: default title: Replicating a Conda environment -parent: 9. Containers and Apptainer +parent: 10. Containers and Apptainer grand_parent: Part 2 nav_order: 5 has_children: false diff --git a/part-2/containers/running-installed.md b/part-2/containers/running-installed.md index 492ce968..54feeeae 100644 --- a/part-2/containers/running-installed.md +++ b/part-2/containers/running-installed.md @@ -1,7 +1,7 @@ --- layout: default title: Running containerized applications -parent: 9. Containers and Apptainer +parent: 10. 
Containers and Apptainer grand_parent: Part 2 nav_order: 3 has_children: false diff --git a/part-2/index.md b/part-2/index.md index 9c4a5df6..2d3b4755 100644 --- a/part-2/index.md +++ b/part-2/index.md @@ -8,6 +8,6 @@ has_children: true # Part 2 -- Next steps Ready to dive deeper? The second part covers more advanced topics, such as -working efficiently with large datasets (TBA), installing own software, containerization +working efficiently with large datasets, installing own software, containerization and efficient HPC workflows. {: .fs-6 .fw-300 } diff --git a/part-2/installing/binary.md b/part-2/installing/binary.md index f301befa..14e712b3 100644 --- a/part-2/installing/binary.md +++ b/part-2/installing/binary.md @@ -1,7 +1,7 @@ --- layout: default title: Installing binary applications -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 1 has_children: false diff --git a/part-2/installing/cmake.md b/part-2/installing/cmake.md index 93381ffc..29321333 100644 --- a/part-2/installing/cmake.md +++ b/part-2/installing/cmake.md @@ -1,12 +1,12 @@ --- layout: default -title: Installing using cmake -parent: 8. Installing own software +title: Installing using CMake +parent: 9. Installing own software grand_parent: Part 2 nav_order: 6 has_children: false has_toc: false -permalink: /hands-on/installing/installing_using_cmake.html +permalink: /hands-on/installing/installing_cmake.html --- # Installing using CMake diff --git a/part-2/installing/compilers.md b/part-2/installing/compilers.md index 11d7b3bd..b2698da2 100644 --- a/part-2/installing/compilers.md +++ b/part-2/installing/compilers.md @@ -1,7 +1,7 @@ --- layout: default -title: Performance Optimization -parent: 8. Installing own software +title: Optimizing compiler options +parent: 9. 
Installing own software grand_parent: Part 2 nav_order: 7 has_children: false diff --git a/part-2/installing/hpc.md b/part-2/installing/hpc.md index 9606ea7e..9173c1c5 100644 --- a/part-2/installing/hpc.md +++ b/part-2/installing/hpc.md @@ -1,7 +1,7 @@ --- layout: default title: Installing own C, C++ or Fortran programs -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 7 has_children: false diff --git a/part-2/installing/index.md b/part-2/installing/index.md index dc863210..2371e9bf 100644 --- a/part-2/installing/index.md +++ b/part-2/installing/index.md @@ -1,19 +1,19 @@ --- layout: default -title: 8. Installing own software +title: 9. Installing own software parent: Part 2 -nav_order: 1 +nav_order: 2 has_children: true has_toc: false --- -# 8. Installing your own software +# 9. Installing your own software -## [8.1 Slides](https://a3s.fi/CSC_training/08_installing.html) +## [9.1 Slides](https://a3s.fi/CSC_training/08_installing.html) -## [8.2 Video: Installing own software](https://video.csc.fi/media/t/0_anzwy1es) +## [9.2 Video: Installing own software](https://video.csc.fi/media/t/0_anzwy1es) -## 8.3 Tutorials and exercises +## 9.3 Tutorials and exercises 1. [Essential tutorial - Installing binary applications]({{ site.baseurl }}{% link part-2/installing/binary.md %}) 2. [Essential tutorial - Installing Python packages and environments]({{ site.baseurl }}{% link part-2/installing/python.md %}) diff --git a/part-2/installing/java.md b/part-2/installing/java.md index 0f41b706..416da433 100644 --- a/part-2/installing/java.md +++ b/part-2/installing/java.md @@ -1,7 +1,7 @@ --- layout: default title: Installing Java applications -parent: 8. Installing own software +parent: 9. 
Installing own software grand_parent: Part 2 nav_order: 6 has_children: false diff --git a/part-2/installing/mcl.md b/part-2/installing/mcl.md index a3fcb356..5a19a9c2 100644 --- a/part-2/installing/mcl.md +++ b/part-2/installing/mcl.md @@ -1,7 +1,7 @@ --- layout: default title: Installing a simple C code from source -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 4 has_children: false diff --git a/part-2/installing/perl.md b/part-2/installing/perl.md index 60234dcc..8e8f0672 100644 --- a/part-2/installing/perl.md +++ b/part-2/installing/perl.md @@ -1,7 +1,7 @@ --- layout: default title: Installing Perl applications and libraries -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 5 has_children: false diff --git a/part-2/installing/python.md b/part-2/installing/python.md index 16a0abd9..519f99da 100644 --- a/part-2/installing/python.md +++ b/part-2/installing/python.md @@ -1,7 +1,7 @@ --- layout: default title: Installing Python packages and environments -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 2 has_children: false diff --git a/part-2/installing/r.md b/part-2/installing/r.md index f63593e3..6d3dbdd5 100644 --- a/part-2/installing/r.md +++ b/part-2/installing/r.md @@ -1,7 +1,7 @@ --- layout: default title: Installing R application and libraries -parent: 8. Installing own software +parent: 9. Installing own software grand_parent: Part 2 nav_order: 3 has_children: false diff --git a/part-1/allas/allas-batch-jobs.md b/part-2/io/allas-batch-jobs.md similarity index 96% rename from part-1/allas/allas-batch-jobs.md rename to part-2/io/allas-batch-jobs.md index 212b3e97..90128b16 100644 --- a/part-1/allas/allas-batch-jobs.md +++ b/part-2/io/allas-batch-jobs.md @@ -1,12 +1,12 @@ --- layout: default -title: Allas in batch jobs -parent: 7. 
Allas -grand_parent: Part 1 -nav_order: 3 +title: Using Allas in batch jobs +parent: 8. Working efficiently with data +grand_parent: Part 2 +nav_order: 1 has_children: false has_toc: false -permalink: /hands-on/allas/tutorial_allas-in-batch-jobs.html +permalink: /hands-on/data-io/tutorial_allas-in-batch-jobs.html --- diff --git a/part-1/disk-areas/exercise-fastdisks.md b/part-2/io/fast-local-disks.md similarity index 92% rename from part-1/disk-areas/exercise-fastdisks.md rename to part-2/io/fast-local-disks.md index 3d29e5ea..e3d2ab40 100644 --- a/part-1/disk-areas/exercise-fastdisks.md +++ b/part-2/io/fast-local-disks.md @@ -1,18 +1,16 @@ --- layout: default title: I/O intensive computing -parent: 3. Disk areas -grand_parent: Part 1 +parent: 8. Working efficiently with data +grand_parent: Part 2 nav_order: 4 has_children: false has_toc: false -permalink: /hands-on/disk-areas/disk-areas-exercise-fastdisks.html +permalink: /hands-on/data-io/io-exercise-fastdisks.html --- # How to run I/O intensive computing tasks efficiently? -πŸ’¬ _This exercise requires usage of the batch queue system. Feel free to carry on or return to this after Topic 5._ - ## Background Lustre-based project-specific directories `/scratch` and `/projappl` can store large amounts of data and are accessible to all compute nodes of Puhti. However, these directories are not good for managing a large number of files or performing intensive input/output (I/O) operations. If you need to work with a huge number of smaller files or perform frequent reads/writes, you should consider using the NVMe-based local temporary scratch directories, either through normal or interactive batch jobs. Read more about the advantages of using the local scratch disk in [Docs CSC](https://docs.csc.fi/support/faq/local_scratch_for_data_processing/). 
@@ -70,7 +68,7 @@ unset XDG_RUNTIME_DIR # Get rid of some unnecessary warnings cd $LOCAL_SCRATCH pwd -singularity pull --name trinity.simg docker://trinityrnaseq/trinityrnaseq +apptainer pull --name trinity.simg docker://trinityrnaseq/trinityrnaseq mv trinity.simg /scratch//$USER/ ``` diff --git a/part-2/io/index.md b/part-2/io/index.md new file mode 100644 index 00000000..ed632887 --- /dev/null +++ b/part-2/io/index.md @@ -0,0 +1,18 @@ +--- +layout: default +title: 8. Working efficiently with data +parent: Part 2 +nav_order: 1 +has_children: true +has_toc: false +--- + +# 8. Working efficiently with data + +## [8.1 Slides](https://a3s.fi/CSC_training/08_datamigration_io.html) + +## 8.2 Tutorials and exercises + +1. [Essential tutorial - Using Allas in batch jobs]({{ site.baseurl }}{% link part-2/io/allas-batch-jobs.md %}) +2. [Tutorial - Using Allas with local `rclone`]({{ site.baseurl }}{% link part-2/io/local_rclone.md %}) +3. [Exercise - I/O intensive computing]({{ site.baseurl }}{% link part-2/io/fast-local-disks.md %}) diff --git a/part-2/io/local_rclone.md b/part-2/io/local_rclone.md new file mode 100644 index 00000000..f588a694 --- /dev/null +++ b/part-2/io/local_rclone.md @@ -0,0 +1,137 @@ +--- +layout: default +title: Using Allas with local rclone +parent: 8. Working efficiently with data +grand_parent: Part 2 +nav_order: 3 +has_children: false +has_toc: false +permalink: /hands-on/data-io/disk-areas-exercise-fastdisks.html +--- + +# Using Allas with rclone from your local computer + +πŸ’­ The graphical user interfaces of Allas can normally manage data transfers +between Allas and your local computing environment as long as the amount of +data and number of files is small. However, if you need to move large amounts +of data, then using command-line tools like `rclone` or `allas-cli-utils` could +be a more efficient way to use Allas. 
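πŸ’­ Step 4 of this exercise asks you to estimate how long 10 GiB would take at the transfer speed you observe. That estimate can be sketched directly in the shell; note that the 25 MiB/s figure below is a made-up example, not a measured value:

```shell
# Hypothetical worked example: if rclone reported an average speed of
# 25 MiB/s, how long would copying 10 GiB take?
speed_mib_s=25                        # made-up speed, replace with your own
size_mib=$((10 * 1024))               # 10 GiB expressed in MiB
seconds=$((size_mib / speed_mib_s))   # integer division is close enough here
echo "~${seconds} s (about $((seconds / 60)) min)"
```

With your real measurement, substitute the reported speed for `speed_mib_s`.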
+ +πŸ’¬ In this exercise, we'll study how you can use Allas from your own computer +using `rclone`, which is available for all common operating systems including +Windows and macOS. Note that on macOS and Linux machines you can also install +the whole allas-cli-utils repository locally. + +## Step 1. Installing rclone + +☝🏻 If you already have the `rclone` command available, skip to [Step 2](#step-2-configuring-rclone-swift-connection-in-local-machine). + +1. Download the `rclone` executable to your own machine. Executables can be found + from . +2. On Windows, if you don’t know which version to choose, try the + [Intel/AMD 64 bit version](https://downloads.rclone.org/v1.66.0/rclone-v1.66.0-windows-amd64.zip). + +## Step 2. Configuring rclone-Swift connection in local machine + +1. Start the process by opening a command shell and executing the command: + 1. Windows: `.\rclone.exe config` + 2. Linux and macOS: `./rclone config` +2. The configuration process is now done interactively in your command shell. + For Allas, you need to make the following selections: + 1. Select **n** to create a *New remote* + 2. Name the remote: `allas` + 3. From the list of storage protocols, select the number corresponding to: + *OpenStack Swift (Rackspace Cloud Files, Memset Memstore, OVH)* + 4. Select authentication option **2**: *Get swift credentials from + environment vars*. + 5. Select the default blank setting for all the remaining settings until + you are back in the starting menu of the configuration process. + 6. Finally, choose **q** to quit the configuration process. +3. You need to do this configuration only once. + +## Step 3. Authentication + +πŸ’­ In addition to the configuration, you must define a set of environment +variables to authenticate your Allas connection each time you start using +`rclone`. If you have access to Puhti, you can use it as an easy way to +generate a list of commands to set up the authentication: + +1. 
Open a terminal connection to Puhti and activate a connection to the + Allas project you wish to use, adding the option `--show-powershell` + (Windows) or `--show-shell` (macOS and Linux) to the `allas-conf` + command. +2. With these options, the configuration process prints out environment + variable setting commands that you can run on your local machine to enable + authentication to Allas. + +### Windows PowerShell + +1. If your local machine is running Windows, execute the following commands on + Puhti: + ```bash + module load allas + allas-conf --show-powershell + ``` +2. Copy the last four lines, starting with `$Env:`, to the local PowerShell and + execute them. Then, test the `rclone` connection with the command: + ```console + .\rclone.exe lsd allas: + ``` +3. Note that in this case, too, the connection will work only for the next 8 + hours. + +### macOS and Linux (bash and zsh) + +1. If your local machine is running macOS or Linux, then the default shell is + often `bash` or `zsh`. To activate the Allas connection in these cases, run the + following commands on Puhti: + ```bash + module load allas + allas-conf --show-shell + ``` +2. Copy the last four lines, starting with `export`, to the local shell session + and execute them. Then, test the `rclone` connection with the command: + ```bash + ./rclone lsd allas: + ``` +3. Note that in this case, too, the connection will work only for the next 8 + hours. + +## Step 4. Upload and download from local computer + +πŸ’¬ Use `rclone` to upload a small directory from your local computer to Allas. +The sample commands below are written for Windows PowerShell. On macOS and +Linux you should replace `rclone.exe` with `rclone` and `.\` in the directory +paths with `./`. + +☝🏻 For this test, choose some unimportant directory that contains only a small +amount of data (less than 1 GiB).
+ +1. First check what would be copied by running the `rclone` command with the + `--dry-run` option. Prefix the target bucket name in Allas with your username to + make it unique. So in the sample commands below you should replace + `local-directory` and `username` with your own values. + ```console + .\rclone.exe copy -P --dry-run .\local-directory allas:username_local-directory + ``` +2. If the test command above works, then run the same command without + `--dry-run` to actually copy the data: + ```console + .\rclone.exe copy -P .\local-directory allas:username_local-directory + ``` +3. What was the transfer speed? Calculate how long it would take to + copy 10 GiB of data at the same speed. +4. Check the results with the command: + ```console + .\rclone.exe ls allas:username_local-directory + ``` +5. Finally, copy the same data to a new directory on your local computer: + ```console + .\rclone.exe copy -P allas:username_local-directory .\username_local-directory + ``` +6. What was the transfer speed? Calculate how long it would take to + copy 10 GiB of data at the same speed. + +## More information + +πŸ’‘ Docs CSC: [Local `rclone` configuration for Allas](https://docs.csc.fi/data/Allas/using_allas/rclone_local/) diff --git a/part-2/workflows/gaussian-hq.md b/part-2/workflows/gaussian-hq.md index 5578b8b4..07b67425 100644 --- a/part-2/workflows/gaussian-hq.md +++ b/part-2/workflows/gaussian-hq.md @@ -1,7 +1,7 @@ --- layout: default title: Running Gaussian with sbatch-hq -parent: 10. How to speed up jobs +parent: 11. How to speed up jobs grand_parent: Part 2 nav_order: 3 has_children: false diff --git a/part-2/workflows/hyperqueue.md b/part-2/workflows/hyperqueue.md index 3d9fa2e4..56179df4 100644 --- a/part-2/workflows/hyperqueue.md +++ b/part-2/workflows/hyperqueue.md @@ -1,7 +1,7 @@ --- layout: default title: Processing many files with HyperQueue -parent: 10. How to speed up jobs +parent: 11. 
How to speed up jobs grand_parent: Part 2 nav_order: 2 has_children: false diff --git a/part-2/workflows/index.md b/part-2/workflows/index.md index 3d9415c2..2ddb0662 100644 --- a/part-2/workflows/index.md +++ b/part-2/workflows/index.md @@ -1,17 +1,17 @@ --- layout: default -title: 10. How to speed up jobs +title: 11. How to speed up jobs parent: Part 2 -nav_order: 3 +nav_order: 4 has_children: true has_toc: false --- -# 10. How to speed up jobs +# 11. How to speed up jobs -## [10.1 Slides](https://a3s.fi/CSC_training/10_speed_up_jobs.html) +## [11.1 Slides](https://a3s.fi/CSC_training/10_speed_up_jobs.html) -## 10.2 Tutorials and exercises +## 11.2 Tutorials and exercises 1. [Tutorial - Performing a simple scaling test]({{ site.baseurl }}{% link part-2/workflows/scaling.md %}) 2. [Tutorial - Processing many files with HyperQueue]({{ site.baseurl }}{% link part-2/workflows/hyperqueue.md %}) diff --git a/part-2/workflows/scaling.md b/part-2/workflows/scaling.md index 7a824cdd..9bc73201 100644 --- a/part-2/workflows/scaling.md +++ b/part-2/workflows/scaling.md @@ -1,7 +1,7 @@ --- layout: default title: Performing a simple scaling test -parent: 10. How to speed up jobs +parent: 11. How to speed up jobs grand_parent: Part 2 nav_order: 1 has_children: false
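The stage-in/stage-out pattern from the slides (`tar xf ... -C $LOCAL_SCRATCH`) can be sketched as a small runnable snippet. This is a local simulation, not a real batch job: when `$LOCAL_SCRATCH` is unset (i.e., outside a Slurm job that requested NVMe, on Puhti via `--gres=nvme:<size>`), it falls back to a temporary directory, and the "big dataset" is a tiny placeholder archive created on the spot:

```shell
# Simulate $LOCAL_SCRATCH with a temporary directory when not in a batch job.
LOCAL_SCRATCH="${LOCAL_SCRATCH:-$(mktemp -d)}"

# Create a placeholder archive standing in for /scratch/<project>/big_dataset.tar.gz
workdir=$(mktemp -d)
cd "$workdir"
mkdir dataset
printf 'sample\n' > dataset/sample.txt
tar czf big_dataset.tar.gz dataset

# Stage in: unpack directly onto the fast local disk
tar xzf big_dataset.tar.gz -C "$LOCAL_SCRATCH"

# ... the I/O-intensive work would run against $LOCAL_SCRATCH/dataset here ...

# Stage out: pack results on the local disk, ready to copy back to /scratch
# before the job ends (local disk contents are purged automatically)
tar czf results.tar.gz -C "$LOCAL_SCRATCH" dataset
ls "$LOCAL_SCRATCH/dataset"
```

In a real job the archive would live under the project's `/scratch` directory and the stage-out copy back to `/scratch` must happen before the job finishes.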