Factor: Safety- Street lights/Nighttime lights #53

amyburness · 2024-06-21T13:46:13Z

Mapillary

endpoint: object--street-light

This indicator should be calculated by creating 20-meter buffers around streetlights. Raster cells in which 80-100% of their area intersects with the buffers should receive a score of 5. Rasters with 60-79% intersection should be scored as 4, 40-59% as 3, 20-39% as 2, and 1-19% as 1. Areas without any overlap with streetlight buffers should be scored as 0. NOTE: Use nighttime lights only if street lights are absent.

Nighttime Lights

Nighttime lights observation data from a Low light imaging sensor part of the Visible and Infrared Imaging Suite (VIIRS) Day Night Band (DNB). Unit - (average, average-masked, max, median, median-masked, min) nW/cm2/sr, CRS - EPSG:4326 (Geographic Latitude/Longitude), Resolution - 15 arc second (~500m at the Equator)

The text was updated successfully, but these errors were encountered:

carolinamayh · 2024-07-01T09:10:01Z

If data for streetlights or nighttime lights is unavailable, or if the user has disaggregated statistics, they should have the option to provide data on women's perceived safety at the municipal, district, state, or any other required level. The tool would then standardize these scores, percentages, or statistics on a scale from 0 to 5, where 5 indicates the lowest level of violence or the highest level of perceived safety.

javaftw · 2024-07-24T12:44:40Z

SAFStreetLight Function Overview:

Processes streetlight vector data to create a raster of safety scores based on coverage. For detailed explanation, see docs/dev_notes/README.md in the issue-53-factor-safety branch.

Constants:

Light area radius: 20 meters
Score bins: [0, 1, 20, 40, 60, 80, 100]
Scores: [0, 0, 1, 2, 3, 4, 5]

Function:

Setup and input processing
Buffer streetlight points (20m)
Rasterize buffered vector
Calculate coverage using convolution
Assign safety scores
Output final raster

Notes:

Uses QGIS for buffering, GDAL for rasterization, and numpy for calculations
Called by SAFnightTimeLights when input is vector data

Substitute Streetlights Data Generation

Due to the unavailability of actual streetlight location data in the sample dataset and limited time for sourcing, a method was devised to generate substitute data based on available road network information.
Data Source and Preprocessing
The primary data source was the roads shapefile for the area of interest. A subset of roads was extracted based on the following criteria:

'Residential' roads from the 'highways' category
'Tertiary', 'Secondary', and 'Primary' roads, which were merged into a single category of major roads

This extraction process provided a foundation for estimating streetlight distributions.

Residential Streetlight Generation
For residential roads, points were scattered along the geometry at 50-meter intervals. This interval was chosen to approximate the typical spacing of streetlights in residential areas.

Major Road Streetlight Generation
The process for major roads involved additional steps:

The merged major roads were filtered based on segment lengths to exclude longer cross-country roads that typically have less frequent lighting.
Points were then scattered along the remaining geometry at 150-meter intervals, reflecting the generally wider spacing of streetlights on major roads compared to residential areas.

Resulting Distribution
This methodology produces a calculated estimation of light source distribution for development purposes. The resulting point dataset approximates streetlight locations based on road type and typical lighting patterns.

The final distribution provides a reasonable proxy for actual streetlight locations, allowing for the development and testing of the SAFStreetLight function in the absence of real-world data.

javaftw · 2024-07-24T13:04:05Z

"Perceived Safety Data" handling is still outstanding

osundwajeff · 2024-07-24T21:26:21Z

@javaftw

javaftw · 2024-07-24T21:56:32Z

@osundwajeff

javaftw · 2024-07-31T05:41:42Z

If data for streetlights or nighttime lights is unavailable, or if the user has disaggregated statistics, they should have the option to provide data on women's perceived safety at the municipal, district, state, or any other required level. The tool would then standardize these scores, percentages, or statistics on a scale from 0 to 5, where 5 indicates the lowest level of violence or the highest level of perceived safety.

Hi @carolinamayh

Could you please elaborate on and/or provide examples of the file format of the perceived safety and disaggregated statistics data? Also, in this regard, could you please verify that the adjusted description text, as per the image, is correct?

carolinamayh · 2024-08-01T08:38:02Z

@osundwajeff @javaftw As mentioned yesterday, if streetlights data or nightlights data is not available, the user will have the option to input a safety score for the country or provide disaggregated scores by administrative units. The indicator used will be the number of homicides per 100,000 people, standardized based on the following scale:

Score 5 (Safest): 0 to 1 homicides per 100,000 people
Score 4: 1.1 to 3 homicides per 100,000 people
Score 3: 3.1 to 6 homicides per 100,000 people
Score 2: 6.1 to 10 homicides per 100,000 people
Score 1: 10.1 to 15 homicides per 100,000 people
Score 0 (Least Safe): More than 15 homicides per 100,000 people

javaftw · 2024-08-01T12:02:51Z

or provide disaggregated scores by administrative units

Hi @carolinamayh (Hennie here),

While the information about "the user will have the option to input a safety score for the country" is useful (there are UI elements in the plugin which performs this action), the part about "or provide disaggregated scores by administrative units" is unclear, despite the accompanying breakdown of the scoring mechanic.
Is is a shapefile? A csv? Something else? How is the user to interact with it and the fields it contains?
Could you please shed some more light on this, thanks!

javaftw · 2024-08-02T11:19:34Z

@osundwajeff https://github.com/kartoza/GEEST/tree/issue-53-factor-safety

dragosgontariu · 2024-08-07T12:39:34Z

there is no option for street lights as per specifications
there is no option for disaggregated data per administrative unit as per specifications and email communications
nighttime light data throws the following error:

2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:58 2024-08-07T12:32:59 2024-08-07T12:32:59 2024-08-07T12:32:59 Processing Copying nodata ...10...20...30...4 2024-08-07T12:32:59 Traceback (most recent call last):
File "C:\PROGRA~~1\QGIS 2024-08-07T12:32:59 File "C:\PROGRA~~1\QGIS 2024-08-07T12:32:59 File "C:\PROGRA~1\QGIS 2024-08-07T12:32:59 ImportError: INFO Results: {'OUTPUT': 'temp/countryUTMLayerBuf.shp'}
INFO gdalwarp -overwrite -t_srs EPSG:32620 -tr 100.0 100.0 -r near -te 705589.018038702 1514124.5410898936 731809.9074413507 1562729.1977703366 -te_srs EPSG:32620 -of GTiff D:\Munca\WB\Geest\St_Lucia\SHPs\Nighttime_lights_2021.tif temp/tempResample.tif
INFO GDAL command:
INFO gdalwarp -overwrite -t_srs EPSG:32620 -tr 100.0 100.0 -r near -te 705589.018038702 1514124.5410898936 731809.9074413507 1562729.1977703366 -te_srs EPSG:32620 -of GTiff D:\Munca\WB\Geest\St_Lucia\SHPs\Nighttime_lights_2021.tif temp/tempResample.tif
INFO GDAL command output:
INFO Process completed successfully
INFO Results: {'OUTPUT': 'temp/tempResample.tif'}
INFO gdal_calc.bat --overwrite --calc "A1000" --format GTiff --type Int32 -A temp/tempResample.tif --A_band 1 --outfile temp/tempCalc.tif
INFO GDAL command:
INFO gdal_calc.bat --overwrite --calc "A1000" --format GTiff --type Int32 -A temp/tempResample.tif --A_band 1 --outfile temp/tempCalc.tif
INFO GDAL command output:
CRITICAL Process returned error code 1
INFO Results: {'OUTPUT': 'temp/tempCalc.tif'}
INFO Creating output file that is 262P x 486L.
D:\Munca\WB\Geest\St_Lucia\SHPs\Nighttime_lights_2021.tif [1/1] : 0Using internal nodata values (e.g. 3.4e+38) for image D:\Munca\WB\Geest\St_Lucia\SHPs\Nighttime_lights_2021.tif.
values from source D:\Munca\WB\Geest\St_Lucia\SHPs\Nighttime_lights_2021.tif to destination temp/tempResample.tif.
0...50...60...70...80...90...100 - done.
CRITICAL RuntimeError: module compiled against API version 0x10 but this version of numpy is 0xf . Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem .
3.32.0\apps\Python39\Scripts\gdal_calc.py", line 8, in
CRITICAL from osgeo_utils.gdal_calc import * # noqa
3.32.0\apps\Python39\lib\site-packages\osgeo_utils\gdal_calc.py", line 46, in
CRITICAL from osgeo import gdal, gdal_array
3.32.0\apps\Python39\lib\site-packages\osgeo\gdal_array.py", line 13, in
CRITICAL from . import _gdal_array
numpy.core.multiarray failed to import

Traceback (most recent call last):
File "rasterio\_base.pyx", line 69, in rasterio._base.get_dataset_driver
File "rasterio\_err.pyx", line 221, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: temp/tempCalc.tif: No such file or directory

         During handling of the above exception, another exception occurred:
         
         Traceback (most recent call last):
          File "C:\Users/Dragos/AppData/Roaming/QGIS/QGIS3\profiles\default/python/plugins\gender_indicator_tool\gender_indicator_tool.py", line 2952, in SAFnightTimeLights
          with rasterio.open(tempCalc, "r+") as src:
          File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio\env.py", line 451, in wrapper
          return f(*args, **kwds)
          File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio\__init__.py", line 334, in open
          dataset = get_writer_for_path(path, driver=driver)(
          File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio\io.py", line 299, in get_writer_for_path
          driver = get_dataset_driver(path)
          File "rasterio\\_base.pyx", line 74, in rasterio._base.get_dataset_driver
         TypeError: temp/tempCalc.tif: No such file or directory

@osundwajeff @mvmaltitz @javaftw

javaftw · 2024-08-13T13:53:16Z

Input Type: Raster

Input Type: Shapefile (Points):

Input Type: Shapefile (Polygons)

When the input is a polygon layer, the user may choose the field of interest within that layer to process. If the field contains text (and not numeric) values, the name of the field is appended with (text)

If the user chooses a text element in the dropdown, values may be assigned to the unique text fields

The result of the above operation, using user-assigned text field values:

The results of using a numeric field:

Input Type: User Value

If no input file is specified, the user value is evaluated:

javaftw · 2024-08-14T20:27:21Z

Improvement of point rendering method

ClaraIV · 2024-08-21T09:05:27Z

@osundwajeff @mvmaltitz we will provide detailed guidance on the processing of the NTL data for this factor today;

bennyistanto · 2024-08-21T14:45:43Z

Enhanced Night Time Light (NTL) Classification for GEEST

Approach and Rationale

We propose refining the NTL classification in GEEST to better reflect local conditions, particularly for small island countries. Instead of using fixed thresholds or coverage percentages, we'll employ local statistics to create a more contextually relevant classification. This approach is supported by Elvidge et al. (2013), who emphasize the importance of considering local context in night-time light analysis.

Standard Classification Procedure:

Clip the global NTL raster to the country boundary.
Calculate statistics for the clipped raster: min, max, mean, median, and 75th percentile.
- The use of percentiles in NTL analysis is supported by Levin & Zhang (2017).
Apply the classification scheme based on local statistics.

Standard Classification Scheme:

GEEST Class	Description	NTL Value Range
0	No Access	0 - 0.05
1	Very Low	> 0.05 - 0.25 * Median
2	Low	> 0.25 * Median - 0.5 * Median
3	Moderate	> 0.5 * Median - Median
4	High	> Median - 75th percentile
5	Very High	> 75th percentile

This classification approach aligns with the methods used by Chen & Nordhaus (2011) in using luminosity data as a proxy for socio-economic factors.

QGIS Implementation:

For implementation in QGIS, use the Raster Calculator with the following expression:

(NTL <= 0.05) * 0 + 
(NTL > 0.05 AND NTL <= [0.25 * median]) * 1 + 
(NTL > [0.25 * median] AND NTL <= [0.5 * median]) * 2 + 
(NTL > [0.5 * median] AND NTL <= [median]) * 3 + 
(NTL > [median] AND NTL <= [75th_percentile]) * 4 + 
(NTL > [75th_percentile]) * 5

Replace [median] and [75th_percentile] with the actual calculated values.

Example Python Script for Calculating Statistics:

Here's a simple Python script to calculate the required statistics for a GeoTIFF raster:

import rasterio
import numpy as np

def calculate_raster_stats(raster_path):
    with rasterio.open(raster_path) as src:
        data = src.read(1)
        data = data[data != src.nodata]  # Remove no data values
        
        min_val = np.min(data)
        max_val = np.max(data)
        mean_val = np.mean(data)
        median_val = np.median(data)
        percentile_75 = np.percentile(data, 75)
        
        return {
            'min': min_val,
            'max': max_val,
            'mean': mean_val,
            'median': median_val,
            '75th_percentile': percentile_75
        }

# Usage
raster_path = 'path/to/your/iso3_ntl.tif'
stats = calculate_raster_stats(raster_path)
print(stats)

This script can be integrated into the GEEST QGIS plugin to automatically calculate the statistics needed for the classification.

Benefits:

Adapts to local lighting conditions
Provides meaningful classification for areas with varying NTL intensities
Maintains consistency with the original GEEST approach while improving granularity

Additional Note: Handling Areas with Low NTL Values:

If the NTL values within the Area of Interest (AOI) are predominantly or entirely below 0.05, which is our baseline for electricity access, we need to adjust our approach. Here's how to proceed:

Check the data range:
After clipping the NTL raster to our AOI, examine the statistics, particularly the maximum value.
If max value < 0.05:
- This indicates that the entire area falls below our standard threshold for electricity access.
- In this case, we need to create a relative scale based on the available data. This approach is informed by Small et al. (2005), who discuss methods for analyzing areas with low light emissions.
Adjusted classification for low-value areas:
Use this modified classification scheme based on the local maximum value:

Low NTL Classification Scheme:

GEEST Class	Description	NTL Value Range
0	No Light	0
1	Very Low	> 0 - 0.2 * max_value
2	Low	> 0.2 * max_value - 0.4 * max_value
3	Moderate	> 0.4 * max_value - 0.6 * max_value
4	High	> 0.6 * max_value - 0.8 * max_value
5	Highest	> 0.8 * max_value - max_value

This adaptive thresholding approach is conceptually similar to methods described by Otsu (1979) for image classification.

QGIS Implementation:

For implementation in QGIS, use the Raster Calculator with the following expression:

with_variable('max_val', maximum("NTL@1"),
    (NTL = 0) * 0 + 
    (NTL > 0 AND NTL <= @max_val * 0.2) * 1 + 
    (NTL > @max_val * 0.2 AND NTL <= @max_val * 0.4) * 2 + 
    (NTL > @max_val * 0.4 AND NTL <= @max_val * 0.6) * 3 + 
    (NTL > @max_val * 0.6 AND NTL <= @max_val * 0.8) * 4 + 
    (NTL > @max_val * 0.8) * 5
)

Note: Replace "NTL@1" with our actual raster layer name.

Additional Considerations:

Temporal Variability:

It's important to note that NTL data can vary seasonally and annually. As Xie et al. (2019) demonstrate, temporal variations in artificial nighttime lights can provide insights into urbanization patterns and socio-economic changes. For GEEST, consider using multi-year averaged NTL data to mitigate short-term fluctuations and capture more stable patterns of lighting infrastructure.

Limitations and Caveats:

Users of this enhanced NTL classification in GEEST should be aware of certain limitations:

Overglow Effect: NTL data can suffer from the "overglow" effect, where light appears to extend beyond its actual source (Small et al., 2005). This may lead to overestimation of lit areas in some regions.
Saturation in Urban Centers: In highly urbanized areas, NTL sensors may saturate, leading to a loss of differentiation in the brightest areas (Elvidge et al., 2013).
Rural Underrepresentation: In sparsely populated or rural areas, the resolution of NTL data may not capture small-scale lighting infrastructure adequately.

References:

Elvidge, C. D., Baugh, K. E., Zhizhin, M., & Hsu, F.-C. (2013). Why VIIRS data are superior to DMSP for mapping nighttime lights. Proceedings of the Asia-Pacific Advanced Network, 35(0), 62. https://doi.org/10.7125/APAN.35.7
Levin, N., & Zhang, Q. (2017). A global analysis of factors controlling VIIRS nighttime light levels from densely populated areas. Remote Sensing of Environment, 190, 366–382. https://doi.org/10.1016/j.rse.2017.01.006
Small, C., Pozzi, F., & Elvidge, C. D. (2005). Spatial analysis of global urban extent from DMSP-OLS night lights. Remote Sensing of Environment, 96(3-4), 277-291. https://doi.org/10.1016/j.rse.2005.02.002
Chen, X., & Nordhaus, W. D. (2011). Using luminosity data as a proxy for economic statistics. Proceedings of the National Academy of Sciences, 108(21), 8589-8594. https://doi.org/10.1073/pnas.1017031108
N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," in IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, Jan. 1979, https://doi.org/10.1109/TSMC.1979.4310076
Xie, Y., Weng, Q., & Fu, P. (2019). Temporal variations of artificial nighttime lights and their implications for urbanization in the conterminous United States, 2013–2017. Remote Sensing of Environment, 225, 160-174. https://doi.org/10.1016/j.rse.2019.03.008

ClaraIV · 2024-08-22T06:48:47Z

@osundwajeff @mvmaltitz please see above, grateful if you could implement the NTL processing accordingly. I will move this to Work in Progress, thank you!

javaftw · 2024-08-22T07:04:12Z

Good morning Benny, Thank you for this information. Just a few quick questions: - Can you confirm the source and resolution of the NTL data to be used? Is it already available, or do I need to obtain it? - Are there any specific guidelines for the statistics (esp. handling outliers, excluding certain values)? - What should be done if the calculated median/90th percentile is very close or equal to the max value? Kind regards, Hennie Kotze

…

On Wed, Aug 21, 2024 at 4:46 PM Benny Istanto ***@***.***> wrote: Enhanced Night Time Light (NTL) Classification for GEEST Approach and Rationale We propose refining the NTL classification in GEEST to better reflect local conditions, particularly for small island countries. Instead of using fixed thresholds or coverage percentages, we'll employ local statistics to create a more contextually relevant classification. This approach is supported by Elvidge et al. (2013), who emphasize the importance of considering local context in night-time light analysis. Standard Classification Procedure: 1. Clip the global NTL raster to the country boundary. 2. Calculate statistics for the clipped raster: min, max, mean, median, and 90th percentile. - The use of percentiles in NTL analysis is supported by Levin & Zhang (2017). 3. Apply the classification scheme based on local statistics. Standard Classification Scheme: GEEST Class Description NTL Value Range 0 No Access 0 - 0.05 1 Very Low > 0.05 - 0.25 * Median 2 Low > 0.25 * Median - 0.5 * Median 3 Moderate > 0.5 * Median - Median 4 High > Median - 90th percentile 5 Very High > 90th percentile This classification approach aligns with the methods used by Chen & Nordhaus (2011) in using luminosity data as a proxy for socio-economic factors. QGIS Implementation: For implementation in QGIS, use the Raster Calculator with the following expression: (NTL <= 0.05) * 0 + (NTL > 0.05 AND NTL <= [0.25 * median]) * 1 + (NTL > [0.25 * median] AND NTL <= [0.5 * median]) * 2 + (NTL > [0.5 * median] AND NTL <= [median]) * 3 + (NTL > [median] AND NTL <= [90th_percentile]) * 4 + (NTL > [90th_percentile]) * 5 Replace [median] and [90th_percentile] with the actual calculated values. Example Python Script for Calculating Statistics: Here's a simple Python script to calculate the required statistics for a GeoTIFF raster: import rasterio import numpy as np def calculate_raster_stats(raster_path): with rasterio.open(raster_path) as src: data = src.read(1) data = data[data != src.nodata] # Remove no data values min_val = np.min(data) max_val = np.max(data) mean_val = np.mean(data) median_val = np.median(data) percentile_90 = np.percentile(data, 90) return { 'min': min_val, 'max': max_val, 'mean': mean_val, 'median': median_val, '90th_percentile': percentile_90 } # Usage raster_path = 'path/to/your/iso3_ntl.tif' stats = calculate_raster_stats(raster_path) print(stats) This script can be integrated into the GEEST QGIS plugin to automatically calculate the statistics needed for the classification. Benefits: - Adapts to local lighting conditions - Provides meaningful classification for areas with varying NTL intensities - Maintains consistency with the original GEEST approach while improving granularity Additional Note: Handling Areas with Low NTL Values: If the NTL values within the Area of Interest (AOI) are predominantly or entirely below 0.05, which is our baseline for electricity access, we need to adjust our approach. Here's how to proceed: 1. *Check the data range:* After clipping the NTL raster to our AOI, examine the statistics, particularly the maximum value. 2. *If max value < 0.05:* - This indicates that the entire area falls below our standard threshold for electricity access. - In this case, we need to create a relative scale based on the available data. This approach is informed by Small et al. (2005), who discuss methods for analyzing areas with low light emissions. 3. *Adjusted classification for low-value areas:* Use this modified classification scheme based on the local maximum value: Low NTL Classification Scheme: GEEST Class Description NTL Value Range 0 No Light 0 1 Very Low > 0 - 0.2 * max_value 2 Low > 0.2 * max_value - 0.4 * max_value 3 Moderate > 0.4 * max_value - 0.6 * max_value 4 High > 0.6 * max_value - 0.8 * max_value 5 Highest > 0.8 * max_value - max_value This adaptive thresholding approach is conceptually similar to methods described by Otsu (1979) for image classification. QGIS Implementation: For implementation in QGIS, use the Raster Calculator with the following expression: with_variable('max_val', ***@***.***"), (NTL = 0) * 0 + (NTL > 0 AND NTL <= @max_val * 0.2) * 1 + (NTL > @max_val * 0.2 AND NTL <= @max_val * 0.4) * 2 + (NTL > @max_val * 0.4 AND NTL <= @max_val * 0.6) * 3 + (NTL > @max_val * 0.6 AND NTL <= @max_val * 0.8) * 4 + (NTL > @max_val * 0.8) * 5 ) Note: Replace ***@***.***" with our actual raster layer name. Additional Considerations: Temporal Variability: It's important to note that NTL data can vary seasonally and annually. As Xie et al. (2019) demonstrate, temporal variations in artificial nighttime lights can provide insights into urbanization patterns and socio-economic changes. For GEEST, consider using multi-year averaged NTL data to mitigate short-term fluctuations and capture more stable patterns of lighting infrastructure. Limitations and Caveats: Users of this enhanced NTL classification in GEEST should be aware of certain limitations: - Overglow Effect: NTL data can suffer from the "overglow" effect, where light appears to extend beyond its actual source (Small et al., 2005). This may lead to overestimation of lit areas in some regions. - Saturation in Urban Centers: In highly urbanized areas, NTL sensors may saturate, leading to a loss of differentiation in the brightest areas (Elvidge et al., 2013). - Rural Underrepresentation: In sparsely populated or rural areas, the resolution of NTL data may not capture small-scale lighting infrastructure adequately. References: 1. Elvidge, C. D., Baugh, K. E., Zhizhin, M., & Hsu, F.-C. (2013). Why VIIRS data are superior to DMSP for mapping nighttime lights. Proceedings of the Asia-Pacific Advanced Network, 35(0), 62. https://doi.org/10.7125/APAN.35.7 2. Levin, N., & Zhang, Q. (2017). A global analysis of factors controlling VIIRS nighttime light levels from densely populated areas. Remote Sensing of Environment, 190, 366–382. https://doi.org/10.1016/j.rse.2017.01.006 3. Small, C., Pozzi, F., & Elvidge, C. D. (2005). Spatial analysis of global urban extent from DMSP-OLS night lights. Remote Sensing of Environment, 96(3-4), 277-291. https://doi.org/10.1016/j.rse.2005.02.002 4. Chen, X., & Nordhaus, W. D. (2011). Using luminosity data as a proxy for economic statistics. Proceedings of the National Academy of Sciences, 108(21), 8589-8594. https://doi.org/10.1073/pnas.1017031108 5. N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," in IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, Jan. 1979, https://doi.org/10.1109/TSMC.1979.4310076 6. Xie, Y., Weng, Q., & Fu, P. (2019). Temporal variations of artificial nighttime lights and their implications for urbanization in the conterminous United States, 2013–2017. Remote Sensing of Environment, 225, 160-174. https://doi.org/10.1016/j.rse.2019.03.008 — Reply to this email directly, view it on GitHub <#53 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ANEBKFENYEJATB2UQINTIG3ZSSR23AVCNFSM6AAAAABJV65GZWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMBSGIZTAOJRGI> . You are receiving this because you were mentioned.Message ID: ***@***.***>

carolinamayh · 2024-08-22T07:10:17Z

@osundwajeff, please note that the score 5 ( Very High) is > 75th percentile

dragosgontariu · 2024-08-22T09:22:30Z

@osundwajeff @mvmaltitz @javaftw

When raster data input (nighttime lights) it still gets:

Traceback (most recent call last):
File "C:\Users/Dragos/AppData/Roaming/QGIS/QGIS3\profiles\default/python/plugins\gender_indicator_tool\gender_indicator_tool.py", line 3266, in SAFRasterizerDelegator
self.SAFnightTimeLights()
File "C:\Users/Dragos/AppData/Roaming/QGIS/QGIS3\profiles\default/python/plugins\gender_indicator_tool\gender_indicator_tool.py", line 3550, in SAFnightTimeLights
with rasterio.open(tempCalc, "r+") as src:
File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio\env.py", line 451, in wrapper
return f(*args, **kwds)
File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio_init_.py", line 334, in open
dataset = get_writer_for_path(path, driver=driver)(
File "C:\Users\Dragos\AppData\Roaming\Python\Python39\site-packages\rasterio\io.py", line 299, in get_writer_for_path
driver = get_dataset_driver(path)
File "rasterio\_base.pyx", line 74, in rasterio._base.get_dataset_driver
TypeError: temp/tempCalc.tif: No such file or directory

Also for the points input it gets this:

where we can see that there are street lights but no data at all

bennyistanto · 2024-08-22T13:02:20Z

Good morning Benny, Thank you for this information. Just a few quick questions: - Can you confirm the source and resolution of the NTL data to be used? Is it already available, or do I need to obtain it? - Are there any specific guidelines for the statistics (esp. handling outliers, excluding certain values)? - What should be done if the calculated median/90th percentile is very close or equal to the max value? Kind regards, Hennie Kotze

Hi Hennie

Answering your question below:

After discussed with @ClaraIV, for now let's use the NTL data from this link https://eogdata.mines.edu/nighttime_light/annual/v22/2023/VNL_npp_2023_global_vcmslcfg_v2_c202402081600.average.dat.tif.gz This is the latest annual global composite for 2023, it's 9GB compressed file, and become 11GB as GeoTIFF

You can use the clipped version below for LCA testing
lca_VNL_npp_2023_global_vcmslcfg_v2_c202402081600.average.dat.zip

As the data is categorized as ready-to-use, I don't think there is any additional process to use the data. From the website, handling outlier and other issue are more focused on daily data.
For the last question on median/75th percentile value are very close to the max. I create below flowchart to provides a clear visual representation of the decision-making process in the classification of NTL data, accounting for different scenarios based on the data characteristics.

graph TD
    A[Download NTL data] --> B[Gunzip data]
    B --> C[Clip using country boundary]
    C --> D[Check max_value]
    D -->|max_value > 0.05| E[Check median and 75th percentile]
    D -->|max_value <= 0.05| F[Use modified classification scheme]
    E -->|Difference > 5% of max_value| G[Use original classification scheme]
    E -->|Difference <= 5% of max_value| F
    G -->|Use median and percentile| H[Classify NTL data]
    F -->|Use max_value| H
    H --> I[Output classified NTL map]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px
    style D fill:#fbb,stroke:#333,stroke-width:2px
    style E fill:#fbb,stroke:#333,stroke-width:2px
    style F fill:#bfb,stroke:#333,stroke-width:2px
    style G fill:#bfb,stroke:#333,stroke-width:2px
    style H fill:#fbf,stroke:#333,stroke-width:2px
    style I fill:#ff9,stroke:#333,stroke-width:2px

This diagram illustrates the process as follows:

Download the NTL data from the provided URL.
Gunzip the downloaded data.
Clip the data using the country boundary.
Check the max_value of the clipped NTL data.
If max_value > 0.05:
- Check if the median or 75th percentile is within 5% of the max_value.
- If the difference is > 5% of max_value, use the original classification scheme (median and percentile based).
- If the difference is <= 5% of max_value, use the modified classification scheme.
If max_value <= 0.05:
- Use the modified classification scheme directly.
Classify the NTL data using the selected scheme.
Output the classified NTL map.

Original Classification Scheme:

GEEST Class	Description	NTL Value Range
0	No Access	0 - 0.05
1	Very Low	> 0.05 - 0.25 * Median
2	Low	> 0.25 * Median - 0.5 * Median
3	Moderate	> 0.5 * Median - Median
4	High	> Median - 75th percentile
5	Very High	> 75th percentile

Modified Classification Scheme:

GEEST Class	Description	NTL Value Range
0	No Light	0
1	Very Low	> 0 - 0.2 * max_value
2	Low	> 0.2 * max_value - 0.4 * max_value
3	Moderate	> 0.4 * max_value - 0.6 * max_value
4	High	> 0.6 * max_value - 0.8 * max_value
5	Highest	> 0.8 * max_value - max_value

mvmaltitz · 2024-08-23T12:07:36Z

@dragosgontariu
Test Results:
Perceived Safety Value = 80

Street lights (points)

Nighttime lights (raster)

bennyistanto · 2024-08-23T13:40:58Z

@mvmaltitz I've obtained different results for the NTL classification for LCA. Here are my findings:

Max Value: 28.618135
Median: 1.060158
75th Percentile: 2.263107

The resulting classification map is shown below:

You can review the code I used to generate this map here.

Could you please compare these results with your findings? I'm particularly interested in understanding any discrepancies in the classification thresholds or the final map output.

ClaraIV · 2024-08-26T08:19:43Z

@mvmaltitz @osundwajeff please ensure that the descriptive text on the tab is updated and explains briefly the processing of each data type. Also, please rename this factor "Safety" as per the indications in the Indicators excel --> remove "safe urban design"

javaftw · 2024-08-26T11:56:44Z

@bennyistanto Hi Benny
I have modified the classification algorithm as per your instructions.

Modified algo + original supplied input data data:

Input:

Output:

Modified algo + downloaded input data data:

(Data download source )
Input:

Output:

Code snippet:

                # Get valid data (non-NaN values)
                valid_data = ntl_data[~np.isnan(ntl_data)]

                max_value = np.max(valid_data)
                median = np.median(valid_data)
                percentile_75 = np.percentile(valid_data, 75)

                print(f"Max Value: {max_value:.6f}")
                print(f"Median: {median:.6f}")
                print(f"75th Percentile: {percentile_75:.6f}")

                # Determine classification scheme
                if max_value <= 0.05 or (max_value - percentile_75) <= 0.05 * max_value:
                    print("Using max_value classification scheme")
                    classification = np.full_like(ntl_data, 0, dtype=np.uint8)
                    classification[(ntl_data > 0) & (ntl_data <= 0.2 * max_value)] = 1
                    classification[(ntl_data > 0.2 * max_value) & (ntl_data <= 0.4 * max_value)] = 2
                    classification[(ntl_data > 0.4 * max_value) & (ntl_data <= 0.6 * max_value)] = 3
                    classification[(ntl_data > 0.6 * max_value) & (ntl_data <= 0.8 * max_value)] = 4
                    classification[ntl_data > 0.8 * max_value] = 5
                else:
                    print("Using original classification scheme")
                    classification = np.full_like(ntl_data, 0, dtype=np.uint8)
                    classification[(ntl_data > 0.05) & (ntl_data <= 0.25 * median)] = 1
                    classification[(ntl_data > 0.25 * median) & (ntl_data <= 0.5 * median)] = 2
                    classification[(ntl_data > 0.5 * median) & (ntl_data <= median)] = 3
                    classification[(ntl_data > median) & (ntl_data <= percentile_75)] = 4
                    classification[ntl_data > percentile_75] = 5

                # Set NaN values in the original data to 255
                classification[np.isnan(ntl_data)] = 255

                # Update metadata for output
                out_meta.update(dtype=rasterio.uint8, nodata=255)

Using the same approach in both cases using different but very similar input data, the results appear to be significantly different.

bennyistanto · 2024-08-27T13:19:58Z

Hey Hennie @javaftw,

I've figured out why our maps look different. The classification patterns we're seeing on the map are different because we're getting different statistical values, especially for the median and 75_percentile.

Here's what I'm guessing you got:

Max Value: 28.618135
Median: 0.458617
75th Percentile: 1.280255

And here's what I'm seeing:

Max Value: 28.618135
Median: 1.060158
75th Percentile: 2.263107

The reason for this difference? It's probably how we're dealing with those pesky nodata values.

When we clip the raster with the country boundary, we might end up with areas outside the boundary that aren't properly marked as nodata. If these areas are included as zeros in our calculations, it could potentially lower our median and 75_percentile values, depending on how many of these zero values there are compared to the valid data points inside the country boundary.

I took a look at the GitHub code (https://github.com/kartoza/GEEST/blob/dev/gender_indicator_tool/gender_indicator_tool.py, lines 3640-3685), and I noticed it's following the Colab script. The thing is, Colab handles NaN values correctly during clipping, but in the gender_indicator_tool.py, especially in the gdal:cliprasterbymasklayer part, nodata is set to 0.

I tried to make the Colab script work in the PyQGIS console by tweaking how we set the NODATA value. The goal was to make sure the areas outside the boundary are properly recognized as NODATA. I've got the full code below if you want to take a look.

https://gist.github.com/bennyistanto/7ec367ba090468ca0e6e888dd13f7f3e

After testing in both Colab and PyQGIS with this updated script, I'm getting the same stats for min, max, median, and 75_percentile.

QGIS test

import numpy as np
from qgis.core import QgsRasterLayer

# File path to your raster
raster_path = r'D:\temp\geest\factor_safety\ntl\lca_VNL_npp_2023_global_vcmslcfg_v2_c202402081600.average.dat.tif'

# Load the raster layer
raster_layer = QgsRasterLayer(raster_path, "NTL")

# Check if the raster layer is valid
if not raster_layer.isValid():
    print("Error: Raster failed to load.")
else:
    provider = raster_layer.dataProvider()
    extent = raster_layer.extent()
    cols = raster_layer.width()
    rows = raster_layer.height()
    block = provider.block(1, extent, cols, rows)
    
    # Convert raster block to numpy array
    data = np.array([[block.value(i, j) for j in range(cols)] for i in range(rows)])
    
    # Handle nodata values (non-NaN)
    nodata = provider.sourceNoDataValue(1)
    valid_data = data[~np.isnan(data) & (data != nodata)]
    
    if valid_data.size > 0:
        # Calculate statistics
        min_value = np.min(valid_data)
        max_value = np.max(valid_data)
        median = np.median(valid_data)
        percentile_75 = np.percentile(valid_data, 75)
        
        print(f"Min Value: {min_value:.6f}")
        print(f"Max Value: {max_value:.6f}")
        print(f"Median: {median:.6f}")
        print(f"75th Percentile: {percentile_75:.6f}")
    else:
        print("No valid data found.")

Colab test

import numpy as np
import rasterio

def analyze_raster(raster_path):
    # Open the raster file
    with rasterio.open(raster_path) as src:
        data = src.read(1)  # Read the first band

        # Retrieve the nodata value from the raster metadata
        nodata = src.nodata

        # Get valid data (non-NaN values and not nodata)
        valid_data = data[~np.isnan(data) & (data != nodata)]

        if valid_data.size > 0:
            # Calculate statistics
            min_value = np.min(valid_data)
            max_value = np.max(valid_data)
            median = np.median(valid_data)
            percentile_75 = np.percentile(valid_data, 75)

            print(f"Min Value: {min_value:.6f}")
            print(f"Max Value: {max_value:.6f}")
            print(f"Median: {median:.6f}")
            print(f"75th Percentile: {percentile_75:.6f}")
        else:
            print("No valid data found.")

# Path to your raster file
raster_path = f'{dir}/ntl/lca_VNL_npp_2023_global_vcmslcfg_v2_c202402081600.average.dat.tif'

# Run the analysis
analyze_raster(raster_path)

Both script returned the same value:

Min Value: 0.329669
Max Value: 28.618135
Median: 1.060158
75th Percentile: 2.263107

What do you think? Does this help explain the differences we're seeing?

mvmaltitz · 2024-09-23T10:31:10Z

@timlinux

Safety trigger a warning on OGR when open the NTL data, although the data is available and loaded into map, dragged from Explorer window.

2024-09-12T21:05:41     WARNING    Cannot open D:\temp\geest\geest_qgis_templates\layers\iso3\fji_input\fji_place_saf_ntl_2023_8859_clipped.tif ().()

After Execute the button, the label next to the button appear with text:

Processing failed: Unable to execute algorithm
Could not load source layer for INPUT: temp\clippedClassified.tif not found

See below the python warning

2024-09-12T21:09:16     WARNING    warning:C:\PROGRA~1\QGIS33~1.4\apps\Python39\lib\site-packages\osgeo\ogr.py:596: FutureWarning: Neither ogr.UseExceptions() nor ogr.DontUseExceptions() has been explicitly called. In GDAL 4.0, exceptions will be enabled by default.
              warnings.warn(
             
             traceback: File "C:\Users/benny/AppData/Roaming/QGIS/QGIS3\profiles\default/python/plugins\gender_indicator_tool\gender_indicator_tool.py", line 3304, in SAFRasterizerDelegator
              self.SAFnightTimeLights_v2()
              File "C:\Users/benny/AppData/Roaming/QGIS/QGIS3\profiles\default/python/plugins\gender_indicator_tool\gender_indicator_tool.py", line 3740, in SAFnightTimeLights_v2
              processing.run(
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\tools\general.py", line 109, in run
              return Processing.runAlgorithm(algOrName, parameters, onFinish, feedback, context)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\core\Processing.py", line 176, in runAlgorithm
              ret, results = execute(alg, parameters, context, feedback, catch_exceptions=False)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\gui\AlgorithmExecutor.py", line 70, in execute
              results, ok = alg.run(parameters, context, feedback, {}, False)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\algs\gdal\GdalAlgorithm.py", line 132, in processAlgorithm
              commands = self.getConsoleCommands(parameters, context, feedback, executing=True)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\algs\gdal\ClipRasterByMask.py", line 170, in getConsoleCommands
              maskLayer, maskLayerName = self.getOgrCompatibleSource(self.MASK, parameters, context, feedback, executing)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\algs\gdal\GdalAlgorithm.py", line 89, in getOgrCompatibleSource
              ogr_layer_name = GdalUtils.ogrLayerName(ogr_data_path)
              File "C:\PROGRA~1/QGIS33~1.4/apps/qgis-ltr/./python/plugins\processing\algs\gdal\GdalUtils.py", line 439, in ogrLayerName
              ds = ogr.Open(basePath)
              File "C:\PROGRA~1\QGIS33~1.4\apps\Python39\lib\site-packages\osgeo\ogr.py", line 7481, in Open
              _WarnIfUserHasNotSpecifiedIfUsingExceptions()
              File "C:\PROGRA~1\QGIS33~1.4\apps\Python39\lib\site-packages\osgeo\ogr.py", line 596, in _WarnIfUserHasNotSpecifiedIfUsingExceptions
              warnings.warn(

And below the Processing log

2024-09-12T21:09:16     INFO    Results: {'OUTPUT': 'output_e54fbcd2_4078_43f6_b086_52d0e9371988'}
2024-09-12T21:09:16     INFO    gdalwarp -overwrite -of GTiff -cutline C:/TEMP/processing_JvEyfI/72efe44926e247ec969f00d98a4ce638/MASK.gpkg -cl MASK -crop_to_cutline -dstnodata -3.4028234663852886e+38 temp\tempClassified.tif temp\clippedClassified.tif
2024-09-12T21:09:16     INFO    GDAL command:
2024-09-12T21:09:16     INFO    gdalwarp -overwrite -of GTiff -cutline C:/TEMP/processing_JvEyfI/72efe44926e247ec969f00d98a4ce638/MASK.gpkg -cl MASK -crop_to_cutline -dstnodata -3.4028234663852886e+38 temp\tempClassified.tif temp\clippedClassified.tif
2024-09-12T21:09:16     INFO    GDAL command output:
2024-09-12T21:09:17     CRITICAL    Process returned error code 1
2024-09-12T21:09:17     INFO    Results: {'OUTPUT': 'temp\\clippedClassified.tif'}
2024-09-12T21:09:20     INFO    Results: {'OUTPUT': 'temp/countryUTMLayerBuf.shp'}
2024-09-12T21:09:22     CRITICAL    Unable to execute algorithm
             Could not load source layer for INPUT: temp\clippedClassified.tif not found
2024-09-12T21:09:22     CRITICAL    Warning 1: Ring Self-intersection at or near point 2617160.3599800956 -2438488.8009277876
2024-09-12T21:09:22     CRITICAL    ERROR 1: Cutline polygon is invalid.

And OGR log

2024-09-12T21:05:41     WARNING    Cannot open D:\temp\geest\geest_qgis_templates\layers\iso3\fji_input\fji_place_saf_ntl_2023_8859_clipped.tif ().()
2024-09-12T21:09:16     WARNING    Cannot open EPSG:8859 ().()

See tickets #134 and #135 for full testing, results and input data

timlinux added the 🔍️ Factor Analysis factors label Jun 21, 2024

amyburness added the 🏙️ Place characterization Dimension - place characterization label Jun 21, 2024

osundwajeff self-assigned this Jul 22, 2024

osundwajeff added the Size 3 It will take me between 2 hours to half a day label Jul 22, 2024

osundwajeff assigned javaftw and unassigned osundwajeff Jul 30, 2024

mvmaltitz added this to the Indicator Enhancements milestone Aug 13, 2024

mvmaltitz removed the Size 3 It will take me between 2 hours to half a day label Aug 15, 2024

mvmaltitz changed the title ~~Factor: Safety- Street lights/Nigthttime ligths~~ Factor: Safety- Street lights/Nighttime lights Aug 22, 2024

osundwajeff mentioned this issue Aug 28, 2024

Test review v4 #115

Merged

mvmaltitz mentioned this issue Sep 4, 2024

Indicator Enhancements #15

Open

25 tasks

hennie-k assigned hennie-k and unassigned javaftw Sep 5, 2024

mvmaltitz assigned timlinux Sep 23, 2024

mvmaltitz mentioned this issue Sep 23, 2024

Country Test Result #134

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Factor: Safety- Street lights/Nighttime lights #53

Factor: Safety- Street lights/Nighttime lights #53

amyburness commented Jun 21, 2024 •

edited

Loading

carolinamayh commented Jul 1, 2024 •

edited

Loading

javaftw commented Jul 24, 2024 •

edited

Loading

javaftw commented Jul 24, 2024

osundwajeff commented Jul 24, 2024

javaftw commented Jul 24, 2024

javaftw commented Jul 31, 2024

carolinamayh commented Aug 1, 2024

javaftw commented Aug 1, 2024

javaftw commented Aug 2, 2024

dragosgontariu commented Aug 7, 2024

javaftw commented Aug 13, 2024

javaftw commented Aug 14, 2024

ClaraIV commented Aug 21, 2024

bennyistanto commented Aug 21, 2024 •

edited

Loading

ClaraIV commented Aug 22, 2024

javaftw commented Aug 22, 2024 via email

carolinamayh commented Aug 22, 2024

dragosgontariu commented Aug 22, 2024 •

edited

Loading

bennyistanto commented Aug 22, 2024

mvmaltitz commented Aug 23, 2024

bennyistanto commented Aug 23, 2024

ClaraIV commented Aug 26, 2024

javaftw commented Aug 26, 2024

bennyistanto commented Aug 27, 2024

mvmaltitz commented Sep 23, 2024

Factor: Safety- Street lights/Nighttime lights #53

Factor: Safety- Street lights/Nighttime lights #53

Comments

amyburness commented Jun 21, 2024 • edited Loading

carolinamayh commented Jul 1, 2024 • edited Loading

javaftw commented Jul 24, 2024 • edited Loading

SAFStreetLight Function Overview:

Substitute Streetlights Data Generation

javaftw commented Jul 24, 2024

osundwajeff commented Jul 24, 2024

javaftw commented Jul 24, 2024

javaftw commented Jul 31, 2024

carolinamayh commented Aug 1, 2024

javaftw commented Aug 1, 2024

javaftw commented Aug 2, 2024

dragosgontariu commented Aug 7, 2024

javaftw commented Aug 13, 2024

javaftw commented Aug 14, 2024

ClaraIV commented Aug 21, 2024

bennyistanto commented Aug 21, 2024 • edited Loading

Enhanced Night Time Light (NTL) Classification for GEEST

Approach and Rationale

Standard Classification Procedure:

Standard Classification Scheme:

QGIS Implementation:

Example Python Script for Calculating Statistics:

Additional Note: Handling Areas with Low NTL Values:

Low NTL Classification Scheme:

QGIS Implementation:

Additional Considerations:

Temporal Variability:

Limitations and Caveats:

References:

ClaraIV commented Aug 22, 2024

javaftw commented Aug 22, 2024 via email

carolinamayh commented Aug 22, 2024

dragosgontariu commented Aug 22, 2024 • edited Loading

bennyistanto commented Aug 22, 2024

mvmaltitz commented Aug 23, 2024

bennyistanto commented Aug 23, 2024

ClaraIV commented Aug 26, 2024

javaftw commented Aug 26, 2024

Modified algo + original supplied input data data:

Modified algo + downloaded input data data:

bennyistanto commented Aug 27, 2024

mvmaltitz commented Sep 23, 2024

amyburness commented Jun 21, 2024 •

edited

Loading

carolinamayh commented Jul 1, 2024 •

edited

Loading

javaftw commented Jul 24, 2024 •

edited

Loading

bennyistanto commented Aug 21, 2024 •

edited

Loading

dragosgontariu commented Aug 22, 2024 •

edited

Loading