Add S3 support #160

Closed
wants to merge 4 commits into from

djhoese
Contributor

@djhoese djhoese commented Dec 22, 2022

Replacement of #125 based on 4.9.0 (main).

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@djhoese djhoese changed the title from "Add S3 flags" to "Add S3 support" Dec 22, 2022
@djhoese djhoese mentioned this pull request Dec 22, 2022
@ocefpaf
Member

ocefpaf commented Dec 22, 2022

Do you need an extra test to check if things are working or the ones we are already running are sufficient?

@djhoese
Contributor Author

djhoese commented Dec 22, 2022

No idea.

@ocefpaf
Member

ocefpaf commented Dec 22, 2022

> No idea.

I'm tempted to add one, but let's wait for @dopplershift, who may have a better idea of the state of the tests. I did see a ZARR S3 cleanup test in the logs. Maybe that is enough, but I did not look into it. (On mobile only.)

@djhoese
Contributor Author

djhoese commented Dec 29, 2022

I tried building this PR locally and then loading an ABI file from AWS S3, but it doesn't seem to work:

>>> from netCDF4 import Dataset
>>> ds = Dataset("s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc")
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <?xml^ version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc.dds</Key><RequestId>BRHYNP37VTMGRDRY</RequestId><HostId>idWfioBaic8Q8QyEnRfYB9yJxwDI1d2SzNLWe0ylBcuuw6eG4TRNQgmMB3jZzbCs0m+MhvPj5Mg=</HostId></Error>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/netCDF4/_netCDF4.pyx", line 2463, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2026, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -90] NetCDF: file not found: b's3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc'

This is, of course, using the Python netCDF4 library, which uses libnetcdf underneath. Before this PR the error said that S3 was not enabled. Now I get a different error: the request cannot complete because it is looking for a .dds file. Any ideas, anyone?

@djhoese
Contributor Author

djhoese commented Dec 29, 2022

Here's the same thing with ncdump and verbose curl output:

$ CURLOPT_VERBOSE=1 ncdump -h "s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc"
*   Trying 52.217.70.174:443...
* Connected to s3.us-east-1.amazonaws.com (52.217.70.174) port 443 (#0)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /home/davidh/miniconda3/envs/libnetcdf-s3-test/ssl/cacert.pem
*  CApath: none
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: CN=s3.amazonaws.com
*  start date: Apr  1 00:00:00 2022 GMT
*  expire date: Mar 30 23:59:59 2023 GMT
*  subjectAltName: host "s3.us-east-1.amazonaws.com" matched cert's "s3.us-east-1.amazonaws.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
> GET /noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc.dds HTTP/1.1
Host: s3.us-east-1.amazonaws.com
User-Agent: oc4.9.0
Accept: */*

* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< x-amz-request-id: EJ5QKD5WKJMJ1WQ1
< x-amz-id-2: JWHbOpjLRcAvFJwT+eUD4O6jiuIXmyPkBYsDhIgHcumdHolD4LU/9r+voShzQD9FFdcGvrOFd1Q=
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Thu, 29 Dec 2022 20:52:19 GMT
< Server: AmazonS3
<
* Connection #0 to host s3.us-east-1.amazonaws.com left intact
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <?xml^ version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc.dds</Key><RequestId>EJ5QKD5WKJMJ1WQ1</RequestId><HostId>JWHbOpjLRcAvFJwT+eUD4O6jiuIXmyPkBYsDhIgHcumdHolD4LU/9r+voShzQD9FFdcGvrOFd1Q=</HostId></Error>
ncdump: s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc: NetCDF: file not found

@dopplershift
Member

Looking for a .dds is a sign it's trying to use the DAP support when it sees http, not S3--which is how it works by default.
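
To make the curl trace above concrete, here is a minimal sketch of how the failing request URL relates to the s3:// URL. It only mirrors what the trace shows (the path-style host s3.us-east-1.amazonaws.com, the bucket as the first path segment, and ".dds" appended to the key). The function name dap_dds_url and the default region are illustrative assumptions, not netcdf-c code.

```python
from urllib.parse import urlsplit


def dap_dds_url(s3_url: str, region: str = "us-east-1") -> str:
    """Rebuild the URL seen in the curl trace for an s3:// URL (illustrative only)."""
    parts = urlsplit(s3_url)
    if parts.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got {s3_url!r}")
    bucket = parts.netloc
    key = parts.path.lstrip("/")
    # Path-style addressing, as in the trace: host first, then /bucket/key.
    # The DAP client then asks for the ".dds" companion of the object,
    # which does not exist in the bucket, hence the 404 / NoSuchKey reply.
    return f"https://s3.{region}.amazonaws.com/{bucket}/{key}.dds"


# Hypothetical short key used only to keep the example readable.
print(dap_dds_url("s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/example.nc"))
```

Seeing a ".dds" suffix in a 404 response is therefore a quick hint that the URL was dispatched to the DAP client rather than to S3 or byte-range handling.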

Also, if all you're doing is accessing a netcdf4 file in S3, not actually doing any zarr stuff, compiling against the S3 SDK is unnecessary. netcdf-c has support for http byte-range requests for doing that which...sigh isn't enabled by default. I enabled it in #107, but apparently only for the now-removed static builds. Oops. 🐑 (And information on setting this up and using it seems to be difficult to find.) To use it:

  1. Add -DENABLE_BYTERANGE=on to build.sh and bld.bat
  2. Append #mode=bytes to your url, like: s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes
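
Step 2 can be sketched from Python as follows. This is a minimal sketch, assuming the netCDF4 package is installed on top of a libnetcdf built with byte-range support. The helper names with_bytes_mode and open_remote are hypothetical, and leaving an existing fragment untouched is a simplifying assumption.

```python
def with_bytes_mode(url: str) -> str:
    """Append '#mode=bytes' unless the URL already carries a fragment."""
    if "#" in url:
        # Assumption: leave an existing fragment alone rather than merge into it.
        return url
    return url + "#mode=bytes"


def open_remote(url: str):
    """Open a remote netCDF file through byte-range requests.

    Requires the netCDF4 package and a libnetcdf built with byte-range support.
    """
    from netCDF4 import Dataset  # lazy import so with_bytes_mode works without it

    return Dataset(with_bytes_mode(url))


# Hypothetical short key used only to keep the example readable.
print(with_bytes_mode("s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/example.nc"))
```

Calling open_remote on the GOES-16 URL from this thread would then be the Python equivalent of the ncdump invocation with the #mode=bytes suffix.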

@dopplershift
Member

cc @DennisHeimbigner @WardF

@DennisHeimbigner

> No idea.

The tests are probably adequate to test that S3 support is working,
but definitely not good enough to detect edge case problems.

@DennisHeimbigner

The fact that it is treating it as a DAP request means that the model inference
code in libdispatch/dinfermodel.c is not working correctly. I will check.

@DennisHeimbigner

In looking at the example above, something is unclear.

s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc

Is this a netcdf-4/HDF5 formatted file?

@dopplershift
Member

@DennisHeimbigner yes:

❯ wget -q https://noaa-goes16.s3.us-east-1.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc
❯ ncdump -h OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc
netcdf OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562 {
dimensions:
	y = 10848 ;
	x = 10848 ;
	number_of_time_bounds = 2 ;
	band = 1 ;
	number_of_image_bounds = 2 ;
	num_star_looks = 24 ;
variables:
	short Rad(y, x) ;
		Rad:_FillValue = 1023s ;
		Rad:long_name = "ABI L1b Radiances" ;
		Rad:standard_name = "toa_outgoing_radiance_per_unit_wavelength" ;
		Rad:_Unsigned = "true" ;
		Rad:sensor_band_bit_depth = 10b ;
		Rad:valid_range = 0s, 1022s ;
		Rad:scale_factor = 0.8121064f ;
		Rad:add_offset = -25.93665f ;
		Rad:units = "W m-2 sr-1 um-1" ;
		Rad:resolution = "y: 0.000028 rad x: 0.000028 rad" ;
		Rad:coordinates = "band_id band_wavelength t y x" ;
		Rad:grid_mapping = "goes_imager_projection" ;
		Rad:cell_methods = "t: point area: point" ;
		Rad:ancillary_variables = "DQF" ;
	byte DQF(y, x) ;
		DQF:_FillValue = -1b ;
		DQF:long_name = "ABI L1b Radiances data quality flags" ;
		DQF:standard_name = "status_flag" ;
		DQF:_Unsigned = "true" ;
		DQF:valid_range = 0b, 4b ;
		DQF:units = "1" ;
		DQF:coordinates = "band_id band_wavelength t y x" ;
		DQF:grid_mapping = "goes_imager_projection" ;
		DQF:cell_methods = "t: point area: point" ;
		DQF:flag_values = 0b, 1b, 2b, 3b, 4b ;
		DQF:flag_meanings = "good_pixel_qf conditionally_usable_pixel_qf out_of_range_pixel_qf no_value_pixel_qf focal_plane_temperature_threshold_exceeded_qf" ;
		DQF:number_of_qf_values = 5b ;
		DQF:percent_good_pixel_qf = 0.998999f ;
		DQF:percent_conditionally_usable_pixel_qf = 5.e-07f ;
		DQF:percent_out_of_range_pixel_qf = 0.0009925f ;
		DQF:percent_no_value_pixel_qf = 8.e-06f ;
		DQF:percent_focal_plane_temperature_threshold_exceeded_qf = 0.f ;
	double t ;
		t:long_name = "J2000 epoch mid-point between the start and end image scan in seconds" ;
		t:standard_name = "time" ;
		t:units = "seconds since 2000-01-01 12:00:00" ;
		t:axis = "T" ;
		t:bounds = "time_bounds" ;
	short y(y) ;
		y:scale_factor = -2.8e-05f ;
		y:add_offset = 0.151858f ;
		y:units = "rad" ;
		y:axis = "Y" ;
		y:long_name = "GOES fixed grid projection y-coordinate" ;
		y:standard_name = "projection_y_coordinate" ;
	short x(x) ;
		x:scale_factor = 2.8e-05f ;
		x:add_offset = -0.151858f ;
		x:units = "rad" ;
		x:axis = "X" ;
		x:long_name = "GOES fixed grid projection x-coordinate" ;
		x:standard_name = "projection_x_coordinate" ;
	double time_bounds(number_of_time_bounds) ;
		time_bounds:long_name = "Scan start and end times in seconds since epoch (2000-01-01 12:00:00)" ;
	int goes_imager_projection ;
		goes_imager_projection:long_name = "GOES-R ABI fixed grid projection" ;
		goes_imager_projection:grid_mapping_name = "geostationary" ;
		goes_imager_projection:perspective_point_height = 35786023. ;
		goes_imager_projection:semi_major_axis = 6378137. ;
		goes_imager_projection:semi_minor_axis = 6356752.31414 ;
		goes_imager_projection:inverse_flattening = 298.2572221 ;
		goes_imager_projection:latitude_of_projection_origin = 0. ;
		goes_imager_projection:longitude_of_projection_origin = -75. ;
		goes_imager_projection:sweep_angle_axis = "x" ;
	float y_image ;
		y_image:long_name = "GOES-R fixed grid projection y-coordinate center of image" ;
		y_image:standard_name = "projection_y_coordinate" ;
		y_image:units = "rad" ;
		y_image:axis = "Y" ;
	float y_image_bounds(number_of_image_bounds) ;
		y_image_bounds:long_name = "GOES-R fixed grid projection y-coordinate north/south extent of image" ;
		y_image_bounds:units = "rad" ;
	float x_image ;
		x_image:long_name = "GOES-R fixed grid projection x-coordinate center of image" ;
		x_image:standard_name = "projection_x_coordinate" ;
		x_image:units = "rad" ;
		x_image:axis = "X" ;
	float x_image_bounds(number_of_image_bounds) ;
		x_image_bounds:long_name = "GOES-R fixed grid projection x-coordinate west/east extent of image" ;
		x_image_bounds:units = "rad" ;
	float nominal_satellite_subpoint_lat ;
		nominal_satellite_subpoint_lat:long_name = "nominal satellite subpoint latitude (platform latitude)" ;
		nominal_satellite_subpoint_lat:standard_name = "latitude" ;
		nominal_satellite_subpoint_lat:_FillValue = -999.f ;
		nominal_satellite_subpoint_lat:units = "degrees_north" ;
	float nominal_satellite_subpoint_lon ;
		nominal_satellite_subpoint_lon:long_name = "nominal satellite subpoint longitude (platform longitude)" ;
		nominal_satellite_subpoint_lon:standard_name = "longitude" ;
		nominal_satellite_subpoint_lon:_FillValue = -999.f ;
		nominal_satellite_subpoint_lon:units = "degrees_east" ;
	float nominal_satellite_height ;
		nominal_satellite_height:long_name = "nominal satellite height above GRS 80 ellipsoid (platform altitude)" ;
		nominal_satellite_height:standard_name = "height_above_reference_ellipsoid" ;
		nominal_satellite_height:_FillValue = -999.f ;
		nominal_satellite_height:units = "km" ;
	float geospatial_lat_lon_extent ;
		geospatial_lat_lon_extent:long_name = "geospatial latitude and longitude references" ;
		geospatial_lat_lon_extent:geospatial_westbound_longitude = -156.2995f ;
		geospatial_lat_lon_extent:geospatial_northbound_latitude = 81.3282f ;
		geospatial_lat_lon_extent:geospatial_eastbound_longitude = 6.2995f ;
		geospatial_lat_lon_extent:geospatial_southbound_latitude = -81.3282f ;
		geospatial_lat_lon_extent:geospatial_lat_center = 0.f ;
		geospatial_lat_lon_extent:geospatial_lon_center = -75.f ;
		geospatial_lat_lon_extent:geospatial_lat_nadir = 0.f ;
		geospatial_lat_lon_extent:geospatial_lon_nadir = -75.f ;
		geospatial_lat_lon_extent:geospatial_lat_units = "degrees_north" ;
		geospatial_lat_lon_extent:geospatial_lon_units = "degrees_east" ;
	byte yaw_flip_flag ;
		yaw_flip_flag:long_name = "Flag indicating the spacecraft is operating in yaw flip configuration" ;
		yaw_flip_flag:_Unsigned = "true" ;
		yaw_flip_flag:_FillValue = -1b ;
		yaw_flip_flag:valid_range = 0b, 1b ;
		yaw_flip_flag:units = "1" ;
		yaw_flip_flag:coordinates = "t" ;
		yaw_flip_flag:flag_values = 0b, 1b ;
		yaw_flip_flag:flag_meanings = "false true" ;
	byte band_id(band) ;
		band_id:long_name = "ABI band number" ;
		band_id:standard_name = "sensor_band_identifier" ;
		band_id:units = "1" ;
	float band_wavelength(band) ;
		band_wavelength:long_name = "ABI band central wavelength" ;
		band_wavelength:standard_name = "sensor_band_central_radiation_wavelength" ;
		band_wavelength:units = "um" ;
	float esun ;
		esun:long_name = "bandpass-weighted solar irradiance at the mean Earth-Sun distance" ;
		esun:standard_name = "toa_shortwave_irradiance_per_unit_wavelength" ;
		esun:_FillValue = -999.f ;
		esun:units = "W m-2 um-1" ;
		esun:coordinates = "band_id band_wavelength t" ;
		esun:cell_methods = "t: mean" ;
	float kappa0 ;
		kappa0:long_name = "Inverse of the incoming top of atmosphere radiance at current earth-sun distance (PI d2 esun-1)-1, where d is the ratio of instantaneous Earth-Sun distance divided by the mean Earth-Sun distance, esun is the bandpass-weighted solar irradiance and PI is a standard constant used to convert ABI L1b radiance to reflectance" ;
		kappa0:_FillValue = -999.f ;
		kappa0:units = "(W m-2 um-1)-1" ;
		kappa0:coordinates = "band_id band_wavelength t" ;
		kappa0:cell_methods = "t: mean" ;
	float planck_fk1 ;
		planck_fk1:long_name = "wavenumber-dependent coefficient (2 h c2/ nu3) used in the ABI emissive band monochromatic brightness temperature computation, where nu =central wavenumber and h and c are standard constants" ;
		planck_fk1:_FillValue = -999.f ;
		planck_fk1:units = "W m-1" ;
		planck_fk1:coordinates = "band_id band_wavelength" ;
	float planck_fk2 ;
		planck_fk2:long_name = "wavenumber-dependent coefficient (h c nu/b) used in the ABI emissive band monochromatic brightness temperature computation, where nu = central wavenumber and h, c, and b are standard constants" ;
		planck_fk2:_FillValue = -999.f ;
		planck_fk2:units = "K" ;
		planck_fk2:coordinates = "band_id band_wavelength" ;
	float planck_bc1 ;
		planck_bc1:long_name = "spectral bandpass correction offset for brightness temperature (B(nu) - bc_1)/bc_2 where B()=planck_function() and nu=wavenumber" ;
		planck_bc1:_FillValue = -999.f ;
		planck_bc1:units = "K" ;
		planck_bc1:coordinates = "band_id band_wavelength" ;
	float planck_bc2 ;
		planck_bc2:long_name = "spectral bandpass correction scale factor for brightness temperature (B(nu) - bc_1)/bc_2 where B()=planck_function() and nu=wavenumber" ;
		planck_bc2:_FillValue = -999.f ;
		planck_bc2:units = "1" ;
		planck_bc2:coordinates = "band_id band_wavelength" ;
	int valid_pixel_count ;
		valid_pixel_count:long_name = "number of good and conditionally usable pixels" ;
		valid_pixel_count:_FillValue = -1 ;
		valid_pixel_count:units = "count" ;
		valid_pixel_count:coordinates = "band_id band_wavelength t y_image x_image" ;
		valid_pixel_count:grid_mapping = "goes_imager_projection" ;
		valid_pixel_count:cell_methods = "t: sum area: sum (interval: 0.000028 rad comment: good and conditionally usable quality pixels only)" ;
	int missing_pixel_count ;
		missing_pixel_count:long_name = "number of missing pixels" ;
		missing_pixel_count:_FillValue = -1 ;
		missing_pixel_count:units = "count" ;
		missing_pixel_count:coordinates = "band_id band_wavelength t y_image x_image" ;
		missing_pixel_count:grid_mapping = "goes_imager_projection" ;
		missing_pixel_count:cell_methods = "t: sum area: sum (interval: 0.000028 rad comment: missing ABI fixed grid pixels only)" ;
	int saturated_pixel_count ;
		saturated_pixel_count:long_name = "number of saturated pixels" ;
		saturated_pixel_count:_FillValue = -1 ;
		saturated_pixel_count:units = "count" ;
		saturated_pixel_count:coordinates = "band_id band_wavelength t y_image x_image" ;
		saturated_pixel_count:grid_mapping = "goes_imager_projection" ;
		saturated_pixel_count:cell_methods = "t: sum area: sum (interval: 0.000028 rad comment: radiometrically saturated geolocated/not missing pixels only)" ;
	int undersaturated_pixel_count ;
		undersaturated_pixel_count:long_name = "number of undersaturated pixels" ;
		undersaturated_pixel_count:_FillValue = -1 ;
		undersaturated_pixel_count:units = "count" ;
		undersaturated_pixel_count:coordinates = "band_id band_wavelength t y_image x_image" ;
		undersaturated_pixel_count:grid_mapping = "goes_imager_projection" ;
		undersaturated_pixel_count:cell_methods = "t: sum area: sum (interval: 0.000028 rad comment: radiometrically undersaturated geolocated/not missing pixels only)" ;
	int focal_plane_temperature_threshold_exceeded_count ;
		focal_plane_temperature_threshold_exceeded_count:long_name = "number of pixels whose temperatures exceeded the threshold" ;
		focal_plane_temperature_threshold_exceeded_count:_FillValue = -1 ;
		focal_plane_temperature_threshold_exceeded_count:units = "count" ;
		focal_plane_temperature_threshold_exceeded_count:coordinates = "band_id band_wavelength t y_image x_image" ;
		focal_plane_temperature_threshold_exceeded_count:grid_mapping = "goes_imager_projection" ;
		focal_plane_temperature_threshold_exceeded_count:cell_methods = "t: sum area: sum (interval: 0.000028 rad comment: temperature exceeded pixels only)" ;
	float min_radiance_value_of_valid_pixels ;
		min_radiance_value_of_valid_pixels:long_name = "minimum radiance value of pixels" ;
		min_radiance_value_of_valid_pixels:standard_name = "toa_outgoing_radiance_per_unit_wavelength" ;
		min_radiance_value_of_valid_pixels:_FillValue = -999.f ;
		min_radiance_value_of_valid_pixels:valid_range = -25.93665f, 804.0361f ;
		min_radiance_value_of_valid_pixels:units = "W m-2 sr-1 um-1" ;
		min_radiance_value_of_valid_pixels:coordinates = "band_id band_wavelength t y_image x_image" ;
		min_radiance_value_of_valid_pixels:grid_mapping = "goes_imager_projection" ;
		min_radiance_value_of_valid_pixels:cell_methods = "t: sum area: minimum (interval: 0.000028 rad comment: good and conditionally usable quality pixels only)" ;
	float max_radiance_value_of_valid_pixels ;
		max_radiance_value_of_valid_pixels:long_name = "maximum radiance value of pixels" ;
		max_radiance_value_of_valid_pixels:standard_name = "toa_outgoing_radiance_per_unit_wavelength" ;
		max_radiance_value_of_valid_pixels:_FillValue = -999.f ;
		max_radiance_value_of_valid_pixels:valid_range = -25.93665f, 804.0361f ;
		max_radiance_value_of_valid_pixels:units = "W m-2 sr-1 um-1" ;
		max_radiance_value_of_valid_pixels:coordinates = "band_id band_wavelength t y_image x_image" ;
		max_radiance_value_of_valid_pixels:grid_mapping = "goes_imager_projection" ;
		max_radiance_value_of_valid_pixels:cell_methods = "t: sum area: maximum (interval: 0.000028 rad comment: good and conditionally usable quality pixels only)" ;
	float mean_radiance_value_of_valid_pixels ;
		mean_radiance_value_of_valid_pixels:long_name = "mean radiance value of pixels" ;
		mean_radiance_value_of_valid_pixels:standard_name = "toa_outgoing_radiance_per_unit_wavelength" ;
		mean_radiance_value_of_valid_pixels:_FillValue = -999.f ;
		mean_radiance_value_of_valid_pixels:valid_range = -25.93665f, 804.0361f ;
		mean_radiance_value_of_valid_pixels:units = "W m-2 sr-1 um-1" ;
		mean_radiance_value_of_valid_pixels:coordinates = "band_id band_wavelength t y_image x_image" ;
		mean_radiance_value_of_valid_pixels:grid_mapping = "goes_imager_projection" ;
		mean_radiance_value_of_valid_pixels:cell_methods = "t: sum area: mean (interval: 0.000028 rad comment: good and conditionally usable quality pixels only)" ;
	float std_dev_radiance_value_of_valid_pixels ;
		std_dev_radiance_value_of_valid_pixels:long_name = "standard deviation of radiance values of pixels" ;
		std_dev_radiance_value_of_valid_pixels:standard_name = "toa_outgoing_radiance_per_unit_wavelength" ;
		std_dev_radiance_value_of_valid_pixels:_FillValue = -999.f ;
		std_dev_radiance_value_of_valid_pixels:units = "W m-2 sr-1 um-1" ;
		std_dev_radiance_value_of_valid_pixels:coordinates = "band_id band_wavelength t y_image x_image" ;
		std_dev_radiance_value_of_valid_pixels:grid_mapping = "goes_imager_projection" ;
		std_dev_radiance_value_of_valid_pixels:cell_methods = "t: sum area: standard_deviation (interval: 0.000028 rad comment: good and conditionally usable quality pixels only)" ;
	float maximum_focal_plane_temperature ;
		maximum_focal_plane_temperature:long_name = "maximum focal plane temperature value" ;
		maximum_focal_plane_temperature:_FillValue = -999.f ;
		maximum_focal_plane_temperature:valid_range = 0.f, 999.f ;
		maximum_focal_plane_temperature:units = "K" ;
	float focal_plane_temperature_threshold_increasing ;
		focal_plane_temperature_threshold_increasing:long_name = "focal plane temperature threshold increasing bounds value" ;
		focal_plane_temperature_threshold_increasing:_FillValue = -999.f ;
		focal_plane_temperature_threshold_increasing:valid_range = 0.f, 999.f ;
		focal_plane_temperature_threshold_increasing:units = "K" ;
	float focal_plane_temperature_threshold_decreasing ;
		focal_plane_temperature_threshold_decreasing:long_name = "focal plane temperature threshold decreasing bounds value" ;
		focal_plane_temperature_threshold_decreasing:_FillValue = -999.f ;
		focal_plane_temperature_threshold_decreasing:valid_range = 0.f, 999.f ;
		focal_plane_temperature_threshold_decreasing:units = "K" ;
	float percent_uncorrectable_L0_errors ;
		percent_uncorrectable_L0_errors:long_name = "percent data lost due to uncorrectable L0 errors" ;
		percent_uncorrectable_L0_errors:_FillValue = -999.f ;
		percent_uncorrectable_L0_errors:valid_range = 0.f, 1.f ;
		percent_uncorrectable_L0_errors:units = "percent" ;
		percent_uncorrectable_L0_errors:coordinates = "t y_image x_image" ;
		percent_uncorrectable_L0_errors:grid_mapping = "goes_imager_projection" ;
		percent_uncorrectable_L0_errors:cell_methods = "t: sum area: sum (uncorrectable L0 errors only)" ;
	float earth_sun_distance_anomaly_in_AU ;
		earth_sun_distance_anomaly_in_AU:long_name = "earth sun distance anomaly in astronomical units" ;
		earth_sun_distance_anomaly_in_AU:_FillValue = -999.f ;
		earth_sun_distance_anomaly_in_AU:units = "ua" ;
		earth_sun_distance_anomaly_in_AU:coordinates = "t" ;
		earth_sun_distance_anomaly_in_AU:cell_methods = "t: mean" ;
	int algorithm_dynamic_input_data_container ;
		algorithm_dynamic_input_data_container:long_name = "container for filenames of dynamic algorithm input data" ;
		algorithm_dynamic_input_data_container:input_ABI_L0_data = "OR_ABI-L0-F-M6_G16_s20220011800205_e20220011809513_c*.nc" ;
	int processing_parm_version_container ;
		processing_parm_version_container:long_name = "container for processing parameter filenames" ;
		processing_parm_version_container:L1b_processing_parm_version = "OR-PARM-RAD_G16_v01r00.zip" ;
	int algorithm_product_version_container ;
		algorithm_product_version_container:long_name = "container for algorithm package filename and product version" ;
		algorithm_product_version_container:algorithm_version = "OR_ABI-L1b-ALG-RAD_v01r00.zip" ;
		algorithm_product_version_container:product_version = "v01r00" ;
	double t_star_look(num_star_looks) ;
		t_star_look:long_name = "J2000 epoch time of star observed in seconds" ;
		t_star_look:standard_name = "time" ;
		t_star_look:units = "seconds since 2000-01-01 12:00:00" ;
		t_star_look:axis = "T" ;
	float band_wavelength_star_look(num_star_looks) ;
		band_wavelength_star_look:long_name = "ABI band central wavelength associated with observed star" ;
		band_wavelength_star_look:standard_name = "sensor_band_central_radiation_wavelength" ;
		band_wavelength_star_look:units = "um" ;
	short star_id(num_star_looks) ;
		star_id:long_name = "ABI star catalog identifier associated with observed star" ;
		star_id:_Unsigned = "true" ;
		star_id:_FillValue = -1s ;
		star_id:coordinates = "band_id band_wavelength_star_look t_star_look" ;
	int channel_integration_time ;
		channel_integration_time:long_name = "Channel-dependent Channel Integration Time, as defined in the VNIR or IR Channel Configuration Table Telemetry" ;
		channel_integration_time:_FillValue = -1 ;
		channel_integration_time:units = "count" ;
	int channel_gain_field ;
		channel_gain_field:long_name = "Channel-dependent Gain Field, as defined in the VNIR or IR Channel Configuration Table Telemetry" ;
		channel_gain_field:_FillValue = -1 ;
		channel_gain_field:units = "1" ;

// global attributes:
		:naming_authority = "gov.nesdis.noaa" ;
		:Conventions = "CF-1.7" ;
		:standard_name_vocabulary = "CF Standard Name Table (v35, 20 July 2016)" ;
		:institution = "DOC/NOAA/NESDIS > U.S. Department of Commerce, National Oceanic and Atmospheric Administration, National Environmental Satellite, Data, and Information Services" ;
		:project = "GOES" ;
		:production_site = "WCDAS" ;
		:production_environment = "OE" ;
		:spatial_resolution = "1km at nadir" ;
		:Metadata_Conventions = "Unidata Dataset Discovery v1.0" ;
		:orbital_slot = "GOES-East" ;
		:platform_ID = "G16" ;
		:instrument_type = "GOES-R Series Advanced Baseline Imager (ABI)" ;
		:scene_id = "Full Disk" ;
		:instrument_ID = "FM1" ;
		:title = "ABI L1b Radiances" ;
		:summary = "Single reflective band ABI L1b Radiance Products are digital maps of outgoing radiance values at the top of the atmosphere for visible and near-IR bands." ;
		:keywords = "SPECTRAL/ENGINEERING > VISIBLE WAVELENGTHS > VISIBLE RADIANCE" ;
		:keywords_vocabulary = "NASA Global Change Master Directory (GCMD) Earth Science Keywords, Version 7.0.0.0.0" ;
		:iso_series_metadata_id = "a70be540-c38b-11e0-962b-0800200c9a66" ;
		:license = "Unclassified data.  Access is restricted to approved users only." ;
		:processing_level = "National Aeronautics and Space Administration (NASA) L1b" ;
		:cdm_data_type = "Image" ;
		:dataset_name = "OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc" ;
		:production_data_source = "Realtime" ;
		:timeline_id = "ABI Mode 6" ;
		:date_created = "2022-01-01T18:09:56.2Z" ;
		:time_coverage_start = "2022-01-01T18:00:20.5Z" ;
		:time_coverage_end = "2022-01-01T18:09:51.3Z" ;
		:LUT_Filenames = "SpaceLookParams(FM1A_CDRL79RevP_PR_09_00_02)-637827000.0.h5 QTableBand01(FM1A_CDRL79RevH_DO_07_00_00)-582860861.0.h5 CalTargetTimeIntervals(FM1A_CDRL79RevP_DO_08_00_01)-611906620.0.h5 BandSaturationLimits(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.h5 SolarSpaceLookParams(FM1A_CDRL79RevH_DO_09_00_00)-600765435.0.h5 DeadRowListParams(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.h5 Mirror_Record(FM1A_CDRL79RevG_DO_07_00_00)-582860861.0.h5 KalmanAstroConsts(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.xml KalmanFilterControls(FM1A_PR_09_08_02)-677650371.0.xml KalmanMeasMaxSensibles(FMAA_INT_ONLY_DO_09_01_00)-652936814.0.xml KalmanPreprocessorControls(FM1A_CDRL79RevJ_PR_09_06_02)-657795700.0.xml KalmanReferenceData(FM1A_CDRL79RevH_DO_08_00_00)-888.0.xml KalmanStarCatalogs(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.xml ABI_NavigationRDP_Band01(FM1A_CDRL79RevJ_DO_07_00_00)-582860861.0.xml ABI_NavigationParameters_Band01(FM1A_CDRL79RevH_DO_07_00_00)-582860861.0.xml ABI_ResamplingImplementation_Band01(FM1A_CDRL79RevH_DO_07_02_00)-602129336.0.xml ABI_ResamplingParameters_Band01(FM1A_CDRL79RevJ_DO_07_00_00)-582860861.0.xml StarLookParams(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.h5 StarDetectionParams(FM1A_CDRL79RevJ_DO_07_00_00)-582860861.0.xml ResamplingScaledConversion(FMAA_INT_ONLY_DO_08_00_00)-1111.0.xml BlockReleaseRegions(FMAA_INT_ONLY_DO_08_00_00)-2222.0.csv VNIR_RetrievalParameters(FM1A_CDRL79RevH_DO_08_00_00)-600000000.0.h5 SCT_Record(FM1A_CDRL79RevM_DO_09_00_00)-600765435.0.h5 ICM_ConversionConsts(FM1A_CDRL43-18_DO_09_01_00)-652936750.0.h5 ICM_SensorCoefficients(FM1A_TMABI_18_159_TMABI_18_533_DO_09_05_00)-676949608.0.h5" ;
		:id = "75de858d-c386-4159-a95e-bce8a0d3d61e" ;
}
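
Besides running ncdump -h on a downloaded copy, the on-disk format can be told apart from the first bytes of the file. The sketch below checks for the 8-byte HDF5 signature (netCDF-4 files are HDF5 files) and for the "CDF" magic of the classic formats. The function names are hypothetical, and only offset 0 is checked, so an HDF5 file with a user block in front of the signature would be reported as unknown.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # netCDF-4 files start with the HDF5 signature
CLASSIC_VERSIONS = {
    1: "netCDF classic",
    2: "netCDF 64-bit offset",
    5: "netCDF CDF-5",
}


def sniff_format(header: bytes) -> str:
    """Classify a file from its first 8 bytes."""
    if header.startswith(HDF5_SIGNATURE):
        return "netCDF-4/HDF5"
    # Classic formats start with b"CDF" followed by a one-byte version number.
    if len(header) > 3 and header[:3] == b"CDF" and header[3] in CLASSIC_VERSIONS:
        return CLASSIC_VERSIONS[header[3]]
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a local file and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(8))


print(sniff_format(b"\x89HDF\r\n\x1a\n"))
```

An XML error body such as the NoSuchKey response earlier in the thread would come back as "unknown", which is another way to spot that the wrong thing was fetched.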

@dopplershift
Member

@DennisHeimbigner the most recent test by @djhoese was on a build without byte-range support, but with S3 support. Would you expect that to work for this case?

@DennisHeimbigner

Yes, I just tested that case. The above URL suffixed with '#mode=bytes'
and using a netcdf-c library built with --enable-byterange
works as expected.

@dopplershift
Member

> Yes, I just tested that case. The above URL suffixed with '#mode=bytes'
> and using a netcdf-c library built with --enable-byterange
> works as expected.

Above I said @djhoese tested a build WITHOUT --enable-byterange.

@DennisHeimbigner

Then I guess you can close this issue.

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

> Then I guess you can close this issue.

This PR? What? No.

I'm so confused at this point. What is the point of enabling S3 support in netcdf-c? What does it do? Why does it require the AWS C++ library? Does it make no difference for reading data from s3? My assumption is that there would be generally better performance and smarter access to S3 resources.

For byte range support, does that have to be enabled for S3 URIs to work at all? Do I have to use #mode=bytes on an S3 URL even when S3 support is enabled?

I'm pretty sure I've used #mode=bytes on a build that didn't have byte ranges enabled. Does that mean that netcdf-c was downloading the entire file into memory?

@dopplershift if byte ranges are not currently enabled then I'd like to include them in this PR.

@dopplershift
Member

@djhoese It is my understanding that the direct S3 support (through the SDK) is only used to implement support for accessing (nc)Zarr data that lives in object storage, due (I think) to the need to access different arbitrary keys. (@DennisHeimbigner @WardF)

If you're accessing a single file, I'm not sure there's anything the direct S3 API offers over the byte-range requests.

So I think the answer to your question, "do I need to enable byte range support?", is yes. As noted above, the way to do that in this PR is to add -DENABLE_BYTERANGE=on to build.sh and bld.bat, and I'd be happy to see that, since it restores things to the way I had already intended it to be built here.
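
For readers unfamiliar with the mechanism, the byte-range requests mentioned above are plain HTTP GETs carrying a Range header, so a client can read part of an object without downloading all of it. The following is a minimal sketch using only the Python standard library. It illustrates the HTTP mechanism, not netcdf-c's implementation, and the function names and the example.com URL are hypothetical.

```python
import urllib.request


def range_request(url: str, start: int, length: int) -> urllib.request.Request:
    """Build a GET request for `length` bytes starting at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def read_range(url: str, start: int, length: int) -> bytes:
    """Fetch the range; a server that honors it replies 206 Partial Content."""
    with urllib.request.urlopen(range_request(url, start, length)) as resp:
        return resp.read()


req = range_request("https://example.com/data.nc", 0, 8)
print(req.get_header("Range"))  # bytes=0-7
```

Requesting the first 8 bytes this way is enough to read a file signature, which is the kind of partial access that makes remote reads of a single large file practical.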

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

> So I think the answer to your question, "do I need to enable byte range support?", is yes. As noted above, the way to do that in this PR is to add -DENABLE_BYTERANGE=on to build.sh and bld.bat, and I'd be happy to see that, since it restores things to the way I had already intended it to be built here.

I and others in the pytroll community have definitely been doing benchmarks assuming this functionality was enabled in this feedstock's build. Any idea @dopplershift if netcdf-c will just download the entire file when #mode=bytes is used and byterange is disabled?

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

@conda-forge-admin please rerender

@dopplershift
Member

Not sure. As noted above, though, this was previously enabled on the static library in this feedstock, so it's possible it somehow worked that way?

@github-actions
Contributor

github-actions bot commented Jan 5, 2023

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/libnetcdf-feedstock/actions/runs/3849084639.

@WardF
Contributor

WardF commented Jan 5, 2023

@dopplershift Thanks for tagging me in, I'm traveling and am on very limited bandwidth, but I'm getting caught up on the convo now.

@WardF
Contributor

WardF commented Jan 5, 2023

To summarize/think out loud, it looks like the initial issue was "Why is this S3 request being treated as a DAP request, despite S3 support being compiled in?"

Enabling ENABLE_BYTERANGE=ON and adjusting the query to add #mode=bytes at the end of it allows this query to work, because it is treating the remote request as a standard DAP request.

The outstanding question is 'What behavior should be expected if S3 support is on, but BYTERANGE support is turned off?'.

Ok, I'm starting to get a better picture.

@DennisHeimbigner, the following command is being interpreted as a DAP request when netcdf is built with S3 support but without BYTERANGE support.

> $ ncdump -h "s3://noaa-goes16/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc"

The question being asked is whether this is the intended behavior and, if not, what they should expect to see. I will dig into this, but do you have any ideas off the top of your head?

Also, if I've misunderstood from my initial read-through, please let me know and I'll delete/revise this comment instead of muddying the water.

To answer the question "What happens when #mode=bytes is used and byterange is disabled?": a "no such file or directory" error is thrown.

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

> To answer the question "What happens when #mode=bytes is used and byterange is disabled?": a "no such file or directory" error is thrown.

If I use a semi-recent conda-forge build of libnetcdf (4.8.1 nompi_h261ec11_106) which I believe does not have byterange enabled (per the above discussion, is there a command line tool to know for sure?) I can use the equivalent HTTPS URL to the above S3 URL and get:

URL + no mode=bytes

$ ncdump -h "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc"
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <?xml^ version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc.dds</Key><RequestId>VBV43GC5MNG93FQ8</RequestId><HostId>tx6ZbhaEr8PuLttI6lMzQgompEBIS56groAXruDpOYglany6puZikinFiHPk4QwStMix4kLrtBs=</HostId></Error>
ncdump: https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc: NetCDF: file not found

URL + mode=bytes

$ ncdump -h "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes"
netcdf OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562 {
dimensions:
        y = 10848 ;
        x = 10848 ;
        number_of_time_bounds = 2 ;
        band = 1 ;
        number_of_image_bounds = 2 ;
        num_star_looks = 24 ;
variables:
        short Rad(y, x) ;
                Rad:_FillValue = 1023s ;
...

@WardF This doesn't seem to match what you got. Thanks for the other info though and you are on the right track with what I'm trying to understand here.

Based on your comment, when is an s3:// URL not treated like a DAP request? I suppose, based on @dopplershift's response, that may only be when netcdf-c detects it is talking to a Zarr archive of some sort.

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

Tests are failing after trying to enable byte-range support:

test 90
        Start  90: nc_test_test_byterange

90: Test command: /usr/bin/bash "-c" "export srcdir=$SRC_DIR/nc_test;export TOPSRCDIR=$SRC_DIR;$SRC_DIR/build-shared/nc_test/test_byterange.sh test_byterange.sh "
90: Working Directory: $SRC_DIR/build-shared/nc_test
90: Test timeout computed to be: 1500
90: 
90: *** Testing reading NetCDF-3 file with http
90: ***Test remote classic file
58: *** compare  with copy_of_tst_solar_2.cdl
58: *** Test nccopy tst_solar_cmp.nc copy_of_tst_solar_cmp.nc ...
90: $SRC_DIR/build-shared/ncdump/ncdump: https://thredds-test.unidata.ucar.edu/thredds/fileServer/pointData/cf_dsg/example/point.nc#mode=bytes&aws.profile=none: NetCDF: Malformed URL
90: test_http: -k flag mismatch: expected=classic have=
 90/231 Test  #90: nc_test_test_byterange ................***Failed    0.63 sec

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

Ah, found it: Unidata/netcdf-c#2500

@WardF
Contributor

WardF commented Jan 5, 2023

@djhoese It's my understanding that byterange was enabled in static conda-forge builds; is it possible to tell if 4.8.1 is static or not? If you can find the libnetcdf.settings file, co-located in the same directory (by default) as the libnetcdf file, you can look and see which options were enabled.

It will be an interesting datapoint; what I observed was generated by testing against the v4.9.1-wellspring.wif branch. If the behavior has changed, that will be another twist to this whole thing.
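
A quick way to check a single option is to read its line out of `libnetcdf.settings`. The helper below is a minimal sketch; the name `settings_value` is hypothetical, and it assumes the file uses the `Label:   value` layout of the netCDF configuration summary.

```shell
# settings_value FILE LABEL
# Print the value recorded for LABEL in a libnetcdf.settings-style file.
# Prints nothing if the label is not present.
settings_value() {
    # Lines look like "Byte-Range Support:     yes"; match on the text
    # before the first colon, then strip the label and its padding.
    awk -F: -v label="$2" '
        $1 == label {
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }' "$1"
}
```

In a conda environment on Linux the file is typically at `$CONDA_PREFIX/lib/libnetcdf.settings`, so `settings_value "$CONDA_PREFIX/lib/libnetcdf.settings" "Byte-Range Support"` would print the recorded value.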

@djhoese
Contributor Author

djhoese commented Jan 5, 2023

Oh very interesting, looks like I have some non-static version that also has Byte-Range Support (libnetcdf.settings):

# NetCDF C Configuration Summary
==============================

# General
-------
NetCDF Version:         4.8.1
Dispatch Version:       3
Configured On:          Mon Oct 31 22:16:41 UTC 2022
Host System:            x86_64-Linux-5.15.0-1022-azure
Build Directory:        /home/conda/feedstock_root/build_artifacts/libnetcdf_1667254369961/work
Install Prefix:         /home/davidh/miniconda3/envs/satpy_py310

# Compiling Options
-----------------
...
Shared Library:         yes
Static Library:         no  <--
Extra libraries:        -lmfhdf -ldf -lhdf5_hl -lhdf5 -lm -lcurl -lzip

# Features
--------
NetCDF-2 API:           yes
HDF4 Support:           yes
HDF5 Support:           yes
NetCDF-4 API:           yes
NC-4 Parallel Support:  no
PnetCDF Support:        no
DAP2 Support:           yes
DAP4 Support:           yes
Byte-Range Support:     yes  <--
Diskless Support:       yes
MMap Support:           yes
JNA Support:            no
CDF5 Support:           yes
ERANGE Fill Support:    yes
Relaxed Boundary Check: yes
SZIP Support:           no
SZIP Write Support:     no
Parallel Filters:       yes
NCZarr Support:         yes
Multi-Filter Support:   yes

@DennisHeimbigner

Some comments on the above discussion:

  • S3 support is, as noted, primarily used for Zarr data access.
    Once we had it for Zarr, we (both netcdf and HDF5) extended it
    to byte-range support.

  • There are servers other than S3 that support HTTP byte-range access.
    Thredds is one of them. So #bytes enables access to certain
    Thredds datasets via protocols other than, say, DAP2 or CDMREMOTE.
    Hyrax may also support byte-range access.

  • Since we enable DAP2/DAP4 by default, we can certainly do the same for #bytes.
    I have a vague memory that it was enabled by default at one time; if so, I
    no longer recall why we would have disabled it.
    The only possible issue is that we rely on certain features of HDF5 for this
    for netcdf-4 files. It is possible that old versions of HDF5 do not provide
    the necessary features. I don't know if we can test the HDF5 build for this.

  • WRT downloading files:

    Any idea @dopplershift if netcdf-c will just download
    the entire file when #mode=bytes is used and byterange
    is disabled

    Assuming we are talking about files stored remotely (i.e. S3 or Thredds), the short answer is no, it does not download the file; it will attempt to access it using the DAP2 protocol (if enabled) and will fail.

  • WRT S3 URIs:

    ...For byte range support, does that have to be enabled
    for S3 URIs to work at all? Do I have to use #mode=bytes
    on an S3 URL even when S3 support is enabled?...

    The S3 scheme is not necessarily sufficient to tell the library what protocol to use to access the data. It could be byte-range or it could be Zarr. So some marker must be provided in the URL to tell the netcdf-c library what protocol to use.

  • It is for historical reasons that an otherwise unadorned URL is treated as a DAP2 URL. DAP2 was the first remote access protocol implemented in the netcdf-c library. So changing this default would probably break a lot of client code.
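
The URL conventions discussed above can be sketched as a small shell helper. It rewrites an `s3://bucket/key` URL into the virtual-hosted HTTPS form used earlier in this thread and appends the `#mode=bytes` marker. The function name `s3_to_https_bytes` is hypothetical, and the `s3.amazonaws.com` endpoint is an assumption that fits public buckets such as `noaa-goes16`; other regions or S3-compatible stores may need a different host.

```shell
# s3_to_https_bytes S3_URL
# Print the HTTPS byte-range form of an s3://bucket/key URL.
s3_to_https_bytes() {
    s3url=$1
    case $s3url in
        s3://*/*) ;;
        *)
            echo "expected s3://bucket/key, got: $s3url" >&2
            return 1
            ;;
    esac
    s3rest=${s3url#s3://}       # "bucket/key"
    s3bucket=${s3rest%%/*}      # text before the first slash
    s3key=${s3rest#*/}          # text after the first slash
    # Assumed endpoint: the global virtual-hosted style host.
    printf 'https://%s.s3.amazonaws.com/%s#mode=bytes\n' "$s3bucket" "$s3key"
}
```

With a build that has byte-range support enabled, the result could be passed straight to ncdump, for example `ncdump -h "$(s3_to_https_bytes "s3://<bucket>/<key>.nc")"`.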

@djhoese
Contributor Author

djhoese commented Jan 6, 2023

This all sounds good. Thank you for all the info @DennisHeimbigner. So I think we have three cases that are semi-expected given the above discussion, but I'll need to do more testing:

  1. The earlier version of this PR with S3-on and byte-range-off. This got the original errors I mentioned in my early comments.
  2. An older build that is published on conda-forge main that is apparently non-static and has byte-range-on. This is what my last ncdump commands with and without #mode=bytes were using.
  3. The new version of this PR with S3-on and byte-range-on. I need to build this locally and test the URLs I was using before. Either way, I'm only using a NetCDF4 file on S3, not Zarr, so the S3 support isn't really being tested here.

The current CI in this PR is failing due to the failing byte range test (see related netcdf-c issue mentioned above). Anybody know of a way to ignore that failed test? Or do we have to wait for upstream fixes and release?
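
One possible way to skip a single failing test, assuming the recipe runs the suite through `ctest` from the build directory, is ctest's `-E` (exclude by regular expression) option. The wrapper name below is hypothetical; the test name is taken from the log above.

```shell
# Hypothetical wrapper: run the suite but leave out the byte-range test.
run_tests_without_byterange() {
    # -E excludes every test whose name matches the regular expression;
    # the anchors keep the match to this one test.
    # "$@" forwards any extra ctest options, for example -j4.
    ctest --output-on-failure -E '^nc_test_test_byterange$' "$@"
}
```

This leaves the upstream sources untouched, so the exclusion can be dropped again once a fixed release is available.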

@djhoese
Contributor Author

djhoese commented Jan 6, 2023

I couldn't test the package locally because of the failing byterange test so I made a patch to comment it out, rebuilt the package, and installed it into a test environment. But I keep getting NetCDF: Malformed URL. Maybe I'll just wait for patches upstream.

@DennisHeimbigner

> But I keep getting NetCDF: Malformed URL.

Can you give some more info about this failure?

@djhoese
Contributor Author

djhoese commented Jan 6, 2023

So I built what is currently in this PR, but with an additional patch to disable byte-range tests (because as you know they fail). When I run it locally I get these results:

(libnetcdf-s3-test) davidh@janet:~/repos/git/libnetcdf-feedstock$ ncdump -h "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes"
ncdump: https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes: NetCDF: Malformed URL
(libnetcdf-s3-test) davidh@janet:~/repos/git/libnetcdf-feedstock$ ncdump -h "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc"
ncdump: https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc: NetCDF: Malformed URL

(libnetcdf-s3-test) davidh@janet:~/repos/git/libnetcdf-feedstock$ grep "Byte" ~/miniconda3/envs/libnetcdf-s3-test/lib/libnetcdf.settings
Byte-Range Support:     yes

@djhoese
Contributor Author

djhoese commented Jul 6, 2023

This work has moved to #180 now.

@djhoese djhoese closed this Jul 6, 2023
@zklaus

zklaus commented Jul 6, 2023

To possibly tie up one loose end here, @djhoese, are you working on Windows? I am asking because byte-range support in this feedstock was enabled in #107 for static builds on Unix and the Windows builds, but not the dynamic builds on Unix, so if you are/were on Windows, that would explain why your dynamic build had byte-range enabled.

In any case, byte-range is now enabled on all builds since #178.

@djhoese
Copy link
Contributor Author

djhoese commented Jul 6, 2023

Nope. Ubuntu/PopOS.
