pytroll · yukaribbba · Jun 30, 2024
@@ -77,6 +77,15 @@ Documentation
 
     Satpy API <api/modules>
     faq
+
+.. toctree::
+    :maxdepth: 2
+
+    performance_tests/index
+
+.. toctree::
+    :maxdepth: 1
+
     Release Notes <https://github.com/pytroll/satpy/blob/main/CHANGELOG.md>
     Security Policy <https://github.com/pytroll/satpy/blob/main/SECURITY.md>
 

diff --git a/doc/source/performance_tests/abi_l1b_tests.rst b/doc/source/performance_tests/abi_l1b_tests.rst
@@ -0,0 +1,141 @@
+=================
+Tests for abi_l1b
+=================
+- Datasets: 5 GOES-16 scenes around solar noon (scan period: 17:00 - 17:10 UTC), one per day from 2024-06-17 to 2024-06-21
+- Area and resampling:
+
++-------------+------------+----------------------------------------------------------------------------------+--------------------------+
+| Description | Resolution | Area Definition                                                                  | Resampler                |
++=============+============+==================================================================================+==========================+
+| Full Disk   | 500m       | ``scn.finest_area()``                                                            | ``native``               |
++-------------+------------+----------------------------------------------------------------------------------+--------------------------+
+| Local       | 500m       | - width: 8008, height: 8008                                                      | ``nearest``/``bilinear`` |
+|             |            | - projection: +proj=lcc +lon_0=-96 +lat_1=20 +lat_2=60 +datum=WGS84 +ellps=WGS84 |                          |
+|             |            | - area extent: (-106000, 2635000, 3898000, 6639000)                              |                          |
++-------------+------------+----------------------------------------------------------------------------------+--------------------------+
+
+
+Recommended Settings
+====================
+- ``DASK_ARRAY__CHUNK_SIZE``: **16MiB**
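+- ``DASK_NUM_WORKERS``: **8**, **12** or **16**. At a 16 MiB chunk size all three gave similar times, except that 8 workers was slower for ``bilinear`` resampling.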
+
+
+Result Table
+============
+
+Full disk - native resampling
+-----------------------------
++------------+---------+--------+--------+--------+-------+--------+
+| Dask Array | Dask    | Time   | Avg    | Max    | Avg   | Errors |
+| Chunk Size | Num     | (s)    | Memory | Memory | CPU   |        |
+| (MiB)      | Workers |        | (GB)   | (GB)   | (%)   |        |
++============+=========+========+========+========+=======+========+
+| 16         | 8       | 151.01 | 4.96   | 9.36   | 59.54 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 12      | 151.22 | 6.33   | 11.96  | 69.05 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 16      | 152.99 | 7.88   | 14.46  | 68.87 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 8       | 152.23 | 8.75   | 18.61  | 60.45 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 12      | 153.49 | 11.95  | 23.23  | 64.36 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 16      | 155.00 | 15.03  | 34.72  | 64.39 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 8       | 165.17 | 18.23  | 42.97  | 53.77 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 12      | 165.79 | 25.83  | 60.45  | 51.67 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 16      | 166.13 | 33.48  | 78.64  | 58.23 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 8       | 169.79 | 26.62  | 74.84  | 51.73 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 12      | 178.59 | 37.50  | 97.69  | 53.16 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 16      | 197.83 | 45.96  | 122.95 | 50.77 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 8       | 165.92 | 27.69  | 68.20  | 54.77 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 12      | 177.71 | 37.22  | 98.93  | 54.49 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 16      | 205.90 | 49.01  | 124.45 | 50.47 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+
+
+Local area - nearest resampling (with cache)
+--------------------------------------------
++------------+---------+--------+--------+--------+-------+--------+
+| Dask Array | Dask    | Time   | Avg    | Max    | Avg   | Errors |
+| Chunk Size | Num     | (s)    | Memory | Memory | CPU   |        |
+| (MiB)      | Workers |        | (GB)   | (GB)   | (%)   |        |
++============+=========+========+========+========+=======+========+
+| 16         | 8       | 41.94  | 4.34   | 7.8    | 32.98 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 12      | 41.79  | 4.7    | 9.51   | 38.32 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 16      | 41.99  | 5.05   | 10.4   | 39.35 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 8       | 39.94  | 4.94   | 12.01  | 34.29 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 12      | 39.81  | 5.46   | 16.26  | 37.66 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 16      | 39.99  | 6.21   | 20.67  | 36.37 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 8       | 41.38  | 6.17   | 19.36  | 33.5  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 12      | 40.29  | 7.03   | 23.65  | 37.03 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 16      | 39.91  | 7.44   | 25.28  | 38.13 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 8       | 40.43  | 7.15   | 23.24  | 34.66 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 12      | 40.31  | 7.09   | 22.12  | 35.33 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 16      | 39.94  | 7.31   | 23.11  | 36.4  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 8       | 42.9   | 7.85   | 26.48  | 31.27 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 12      | 43.78  | 7.94   | 27.72  | 30.78 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 16      | 42.17  | 8.05   | 28.56  | 31.87 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+
+
+Local area - bilinear resampling (with cache)
+---------------------------------------------
++------------+---------+--------+--------+--------+-------+--------+
+| Dask Array | Dask    | Time   | Avg    | Max    | Avg   | Errors |
+| Chunk Size | Num     | (s)    | Memory | Memory | CPU   |        |
+| (MiB)      | Workers |        | (GB)   | (GB)   | (%)   |        |
++============+=========+========+========+========+=======+========+
+| 16         | 8       | 196.78 | 12.65  | 37.36  | 7.81  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 12      | 141.56 | 12.06  | 39.92  | 10.35 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 16         | 16      | 144.00 | 12.74  | 41.58  | 10.75 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 8       | 144.10 | 12.26  | 38.74  | 10.05 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 12      | 146.39 | 13.09  | 42.94  | 10.70 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 32         | 16      | 150.53 | 13.81  | 39.60  | 10.68 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 8       | 152.96 | 12.99  | 40.93  | 10.45 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 12      | 163.94 | 13.73  | 41.10  | 9.97  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 64         | 16      | 151.42 | 14.65  | 42.42  | 11.62 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 8       | 219.15 | 14.22  | 41.90  | 8.11  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 12      | 182.07 | 15.32  | 41.61  | 10.12 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 96         | 16      | 195.08 | 15.85  | 41.35  | 10.42 | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 8       | 174.62 | 14.69  | 39.29  | 9.95  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 12      | 214.65 | 16.17  | 40.08  | 9.53  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
+| 128        | 16      | 198.26 | 15.97  | 46.53  | 9.62  | N/A    |
++------------+---------+--------+--------+--------+-------+--------+
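
Note on applying the recommended settings: the sketch below shows one way to set the environment variables named in this file from Python. It is only an illustration, not part of the patch. The helper name `apply_settings` and the `RECOMMENDED_ENV` mapping are hypothetical. The `OMP_NUM_THREADS` value is taken from the test conditions in `index.rst`, and picking 8 workers out of the 8/12/16 range is an arbitrary choice.

```python
import os

# Values taken from the "Recommended Settings" section of the abi_l1b tests.
# OMP_NUM_THREADS=8 is the value listed in the test conditions.
RECOMMENDED_ENV = {
    "DASK_ARRAY__CHUNK_SIZE": "16MiB",
    "DASK_NUM_WORKERS": "8",
    "OMP_NUM_THREADS": "8",
}


def apply_settings(settings=RECOMMENDED_ENV, environ=os.environ):
    """Copy settings into ``environ`` without overriding values already present.

    Returns the values that are in effect after the call.
    """
    applied = {}
    for name, value in settings.items():
        # setdefault keeps any value the user exported in the shell beforehand.
        applied[name] = environ.setdefault(name, value)
    return applied
```

Calling `apply_settings()` at the very top of a script, before `dask` or `satpy` is imported, should let both libraries pick the values up, since Dask reads its `DASK_*` variables when it is imported.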
diff --git a/doc/source/performance_tests/index.rst b/doc/source/performance_tests/index.rst
@@ -0,0 +1,81 @@
+=================
+Performance Tests
+=================
+
+To help tune performance for specific readers, this section presents the results of a series
+of tests involving ``DASK_ARRAY__CHUNK_SIZE``, ``DASK_NUM_WORKERS`` and other options
+mentioned in the :doc:`FAQ <../faq>`.
+
+.. toctree::
+    :maxdepth: 1
+
+    abi_l1b_tests
+
+
+Test platform
+-------------
++----------+--------------------------------------+
+| CPU      | 1x 8-core, 8-thread i7-9700K @4.6GHz |
++----------+--------------------------------------+
+| Memory   | 2x 32GB DDR4                         |
++----------+--------------------------------------+
+| SSD      | 1x Samsung 980 Pro PCI-E 2TB         |
++----------+--------------------------------------+
+| OS       | Windows 11 23H2 Workstation Pro      |
++----------+--------------------------------------+
+
+
+Conda environment
+-----------------
++------------+-------------+
+| Channel    | conda-forge |
++------------+-------------+
+| Python     | 3.12.3      |
++------------+-------------+
+| dask       | 2024.6.2    |
++------------+-------------+
+| numpy      | 2.0.0       |
++------------+-------------+
+| satpy      | 0.49        |
++------------+-------------+
+| pyresample | 1.28.3      |
++------------+-------------+
+| pyspectral | 0.13.1      |
++------------+-------------+
+| psutil     | 6.0.0       |
++------------+-------------+
+
+
+Test procedure
+--------------
+- Each round processes all 5 scenes, and the reported values are averages over them.
+
+- The composite is usually the default ``true_color``, which requires heavy computation such as atmospheric correction.
+
+- A separate monitor thread uses ``psutil`` to record CPU and memory usage while the test runs, sampling roughly every 0.5 seconds.
+
+- After each round finishes, the machine rests for 2 minutes to let the CPU cool down.
+
+- The machine is then rebooted to clear the system cache, so that the next round cannot benefit from it.
+
+
+Test conditions
+---------------
++------------------------------------+--------------------------------------------------------+
+| DASK_ARRAY__CHUNK_SIZE (in MiB)    | 16, 32, 64, 96, 128                                    |
++------------------------------------+--------------------------------------------------------+
+| DASK_ARRAY__CHUNK_SIZE (in pixels) | 512x512, 1024x1024, 2048x2048, 3072x3072, 4096x4096    |
++------------------------------------+--------------------------------------------------------+
+| DASK_NUM_WORKERS                   | 8, 12, 16                                              |
++------------------------------------+--------------------------------------------------------+
+| OMP_NUM_THREADS                    | 8                                                      |
++------------------------------------+--------------------------------------------------------+
+| generate=False                     | Used when the composite requires different resolutions |
++------------------------------------+--------------------------------------------------------+
+| nprocs=8                           | Used on ``nearest`` or ``bilinear`` resampling         |
++------------------------------------+--------------------------------------------------------+
+| resampling cache                   | Used on ``nearest`` or ``bilinear`` resampling         |
++------------------------------------+--------------------------------------------------------+
+
+General conclusions
+-------------------
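
Note on the monitoring step in the test procedure: the sketch below shows how a `psutil`-based monitor thread sampling every 0.5 seconds could be structured. It is not the script used for these tests. The names `ResourceMonitor` and `psutil_sampler` are hypothetical, and measuring system-wide usage (rather than per-process usage) is an assumption.

```python
import threading


def psutil_sampler():
    """Return (CPU percent, used memory in GB) for the whole system.

    psutil is imported lazily so the monitor class works with any sampler.
    """
    import psutil

    # With interval=None, cpu_percent reports usage since the previous call,
    # so the very first sample is a meaningless 0.0.
    cpu = psutil.cpu_percent(interval=None)
    mem_gb = psutil.virtual_memory().used / 1024**3
    return cpu, mem_gb


class ResourceMonitor(threading.Thread):
    """Collect samples from ``sampler`` every ``interval`` seconds until stopped."""

    def __init__(self, sampler, interval=0.5):
        super().__init__(daemon=True)
        self._sampler = sampler
        self._interval = interval
        self._stop_event = threading.Event()
        self.samples = []

    def run(self):
        while True:
            self.samples.append(self._sampler())
            # wait() returns True as soon as stop() is called, ending the loop.
            if self._stop_event.wait(self._interval):
                break

    def stop(self):
        self._stop_event.set()
        self.join()

    def summary(self):
        """Return average CPU, average memory and peak memory, or None if empty."""
        if not self.samples:
            return None
        cpu = [s[0] for s in self.samples]
        mem = [s[1] for s in self.samples]
        return {
            "avg_cpu": sum(cpu) / len(cpu),
            "avg_mem": sum(mem) / len(mem),
            "max_mem": max(mem),
        }
```

In use, one would create `ResourceMonitor(psutil_sampler)`, call `start()` before loading and saving the scene, call `stop()` afterwards, and read the averages from `summary()`. Passing the sampler in as an argument keeps the thread logic independent of `psutil`, which makes it easy to swap in per-process measurements.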