Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cached filename used after network errors #12

Open
Zaitsev opened this issue Jun 19, 2019 · 2 comments
Open

cached filename used after network errors #12

Zaitsev opened this issue Jun 19, 2019 · 2 comments

Comments

@Zaitsev
Copy link

Zaitsev commented Jun 19, 2019

If network error occurred during querying remote geoTiff (no s3 creds needed), marble-virtual uses cache instead of querying file again

server start: (0.1 timeout for raise Curl timeout errors)
GDAL_HTTP_TIMEOUT=0.1 CPL_CURL_VERBOSE=YES python ./server.py

try to get twice
curl http://localhost:8000/tiles/12/1183/1468?url=https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff&rgb=4,5,6 > /dev/null

on the first query I see Curl requests, and good exception

2019-06-19 14:15:02,778.778 ERROR virtual.web 5412 : CURL error: Connection time-out| web.py:64

on any further queries no Curl requests made, instead marble-virtual tries to open cached file

2019-06-19 14:15:06,050.050 ERROR virtual.web 8684 : '/vsicurl/https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff' does not exist in the file system, and is not recognized as a supported dataset name.| web.py:64

I've traced it to magblecutter/init.py

def get_source(path):
    """Cached source opening."""
    with rasterio.Env():
        return rasterio.open(path)

Is there any method to reset caches?
I've tried
gdal.VSIErrorReset()
gdal.VSICurlClearCache()
with no success, seems rasterio cached something somewhere

log :

 * Serving Flask app "marblecutter-virtual" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
2019-06-19 14:14:57,294.294  INFO     werkzeug   528        :  * Running on http://0.0.0.0:8000/ (Press CTRL+C to quit)| _internal.py:122
2019-06-19 14:14:57,295.295  INFO     werkzeug   529        :  * Restarting with stat| _internal.py:122
2019-06-19 14:14:57,897.897  WARNING  werkzeug   531        :  * Debugger is active!| _internal.py:122
2019-06-19 14:14:57,898.898  INFO     werkzeug   532        :  * Debugger PIN: 888-447-794| _internal.py:122
2019-06-19 14:15:02,539.539  DEBUG    virtual.web 5173       : ------- WEBPY https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff -----------------------------------| web.py:46
2019-06-19 14:15:02,544.544  DEBUG    rasterio._base 5178       : Sharing flag: 32| __init__.py:216
* Couldn't find host f.tiles.utility.opt-net.eu in the .netrc file; using defaults
* Connection time-out
* Closing connection 0
* Couldn't find host f.tiles.utility.opt-net.eu in the .netrc file; using defaults
* Hostname f.tiles.utility.opt-net.eu was found in DNS cache
*   Trying 185.178.87.64...
* TCP_NODELAY set
* Connected to f.tiles.utility.opt-net.eu (185.178.87.64) port 443 (#1)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* Operation timed out after 100 milliseconds with 0 out of 0 bytes received
* stopped the pause stream!
* Closing connection 1
None
2019-06-19 14:15:02,778.778  ERROR    virtual.web 5412       : ============2019-06-19 14:15:02.778293| web.py:60
2019-06-19 14:15:02,778.778  ERROR    virtual.web 5412       : CURL error: Connection time-out| web.py:64
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 64, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_HttpResponseError: CURL error: Connection time-out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/c/VLZ/oDesk/taras/src/IBM/kepler/kepler-tiles/marblecutter-virtual/virtual/web.py", line 55, in make_catalog
    resample=args.get("resample"),
  File "/c/VLZ/oDesk/taras/src/IBM/kepler/kepler-tiles/marblecutter-virtual/virtual/catalogs.py", line 42, in __init__
    with get_source(self._uri) as src:
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/marblecutter/__init__.py", line 149, in get_source
    return rasterio.open(path)
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/rasterio/env.py", line 430, in wrapper
    return f(*args, **kwds)
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/rasterio/__init__.py", line 216, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 218, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: CURL error: Connection time-out
2019-06-19 14:15:02,786.786  INFO     werkzeug   5420       : 127.0.0.1 - - [19/Jun/2019 14:15:02] "GET /tiles/12/1183/1468?url=https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff&rgb=4,5,6& HTTP/1.1" 404 -| _internal.py:122
2019-06-19 14:15:06,047.047  DEBUG    virtual.web 8681       : ------- WEBPY https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff -----------------------------------| web.py:46
2019-06-19 14:15:06,048.048  DEBUG    rasterio._base 8682       : Sharing flag: 32| __init__.py:216
None
2019-06-19 14:15:06,049.049  ERROR    virtual.web 8683       : ============2019-06-19 14:15:06.049726| web.py:60
2019-06-19 14:15:06,050.050  ERROR    virtual.web 8684       : '/vsicurl/https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff' does not exist in the file system, and is not recognized as a supported dataset name.| web.py:64
Traceback (most recent call last):
  File "rasterio/_base.pyx", line 216, in rasterio._base.DatasetBase.__init__
  File "rasterio/_shim.pyx", line 64, in rasterio._shim.open_dataset
  File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
rasterio._err.CPLE_OpenFailedError: '/vsicurl/https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff' does not exist in the file system, and is not recognized as a supported dataset name.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/c/VLZ/oDesk/taras/src/IBM/kepler/kepler-tiles/marblecutter-virtual/virtual/web.py", line 55, in make_catalog
    resample=args.get("resample"),
  File "/c/VLZ/oDesk/taras/src/IBM/kepler/kepler-tiles/marblecutter-virtual/virtual/catalogs.py", line 42, in __init__
    with get_source(self._uri) as src:
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/marblecutter/__init__.py", line 149, in get_source
    return rasterio.open(path)
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/rasterio/env.py", line 430, in wrapper
    return f(*args, **kwds)
  File "/home/vlz/miniconda3/envs/marble/lib/python3.6/site-packages/rasterio/__init__.py", line 216, in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
  File "rasterio/_base.pyx", line 218, in rasterio._base.DatasetBase.__init__
rasterio.errors.RasterioIOError: '/vsicurl/https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff' does not exist in the file system, and is not recognized as a supported dataset name.
2019-06-19 14:15:06,059.059  INFO     werkzeug   8693       : 127.0.0.1 - - [19/Jun/2019 14:15:06] "GET /tiles/12/1183/1468?url=https://f.tiles.utility.opt-net.eu/tiles/canada-26May2019-15May2019-1/canada-26May2019-15May2019-1-LZW.tiff&rgb=4,5,6& HTTP/1.1" 404 -| _internal.py:122
@mojodna
Copy link
Owner

mojodna commented Jun 21, 2019

This may be related to https://rasterio.groups.io/g/main/topic/31734579, but I think your intuition is correct on this being incorrectly cached within marblecutter (it should be invalidated if any exceptions are encountered while reading, but I'm not sure how yet).

@Zaitsev
Copy link
Author

Zaitsev commented Jun 22, 2019

following links I finished with OSGeo/gdal#1244 (comment) solution
Good news: setting CPL_VSIL_CURL_NON_CACHED="/vsicurl/" fixes issue
Bad news: geoTiff parsed on each tile request so each next tile request time it the same as the first tile request, with cache it at least 2 times faster
Searched rasterio codes for cache to find a way to reset cache with no success....
I've tried to change query-string and add &time= but it doesn't help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants