Requester Pays blocks typical auth mechanism #470

Open
nbren12 opened this issue Apr 7, 2022 · 1 comment

Comments

@nbren12
Contributor

nbren12 commented Apr 7, 2022

I'm not sure if this is a "bug" or intentional, but I've had some difficulty using requester pays mode. Typical Google auth doesn't require a default project to be configured in order to access some files, but gcsfs+requester_pays does. See this code:

Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import google.auth
>>> credentials, project = google.auth.default()
>>> credentials, project
(<google.oauth2.credentials.Credentials object at 0x153b971454c0>, None)
>>> import fsspec
>>> len(fsspec.filesystem('gs').ls('vcm-fv3config'))
2
>>> len(fsspec.filesystem('gs', requester_pays=True).ls('vcm-fv3config'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 91, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 71, in sync
    raise return_result
  File "/usr/local/lib/python3.8/dist-packages/fsspec/asyn.py", line 25, in _runner
    result[0] = await coro
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/core.py", line 714, in _ls
    out = await self._list_objects(path, prefix=prefix)
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/core.py", line 484, in _list_objects
    items, prefixes = await self._do_list_objects(path, prefix=prefix)
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/core.py", line 515, in _do_list_objects
    page = await self._call(
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/core.py", line 386, in _call
    status, headers, info, contents = await self._request(
  File "<decorator-gen-2>", line 2, in _request
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/retry.py", line 115, in retry_request
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/gcsfs/core.py", line 363, in _request
    async with self.session.request(
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/client.py", line 1117, in __aenter__
    self._resp = await self._coro
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/client.py", line 492, in _request
    req = self._request_class(
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/client_reqrep.py", line 283, in __init__
    url2 = url.with_query(params)
  File "/usr/local/lib/python3.8/dist-packages/yarl/_url.py", line 983, in with_query
    new_query = self._get_str_query(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/yarl/_url.py", line 944, in _get_str_query
    query = "&".join(self._query_seq_pairs(quoter, query.items()))
  File "/usr/local/lib/python3.8/dist-packages/yarl/_url.py", line 907, in _query_seq_pairs
    yield quoter(key) + "=" + quoter(cls._query_var(val))
  File "/usr/local/lib/python3.8/dist-packages/yarl/_url.py", line 922, in _query_var
    raise TypeError(
TypeError: Invalid variable type: value should be str, int or float, got None of type <class 'NoneType'>

That error message could be more informative :). It would be great if gcsfs raised an error when building the filesystem if the necessary credentials or project name cannot be inferred.
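To make the request concrete, here is a minimal sketch of the kind of fail-fast check I have in mind. The function name `resolve_user_project` is hypothetical and is not part of gcsfs; it assumes `requester_pays` may be either a bool or a project ID string, and that `project` is whatever was inferred from the credentials (possibly `None`, as in the session above).

```python
from typing import Optional, Union


def resolve_user_project(
    requester_pays: Union[bool, str], project: Optional[str]
) -> Optional[str]:
    """Return the project to bill, or None if requester pays is off.

    Hypothetical helper: raises a readable error at construction time
    instead of letting None reach the URL query parameters.
    """
    if not requester_pays:
        # Requester pays disabled: no billing project is needed.
        return None
    if isinstance(requester_pays, str):
        # A string is treated as an explicit billing project ID.
        return requester_pays
    if project is None:
        raise ValueError(
            "requester_pays=True needs a project to bill, but no default "
            "project could be inferred from the credentials. Pass a "
            "project explicitly."
        )
    return project
```

With a check like this, the `requester_pays=True` call above would fail with a message naming the missing project rather than a `TypeError` from deep inside yarl.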

Related ai2cm/fv3config#143

@zaneselvans

As far as I can tell, this is the way it's supposed to work (but it surprised me too!). Being authorized to access an object is a different kind of permission from being authorized to spend your organization's money accessing that object, and in GCP the "spending" is tracked at the project level. So just knowing who you are and whether you're allowed to access the object isn't enough information. In addition to providing an explicit project to bill to (one that you have permission to spend money on), I think you can set up your GCP configuration to have a default billing project that's used when nothing is specified (just like you can have a default zone or region).
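On the user side, a rough sketch of supplying the billing project explicitly might look like the following. The helper `requester_pays_options` is made up for illustration, and falling back to the `GOOGLE_CLOUD_PROJECT` environment variable is my own assumption about a convenient default, not something gcsfs does for you here.

```python
import os
from typing import Dict, Optional, Union


def requester_pays_options(
    billing_project: Optional[str] = None,
) -> Dict[str, Union[bool, str]]:
    """Build keyword arguments for fsspec.filesystem('gs', **options).

    Hypothetical helper: uses the explicit project if given, otherwise
    falls back to the GOOGLE_CLOUD_PROJECT environment variable.
    """
    project = billing_project or os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError(
            "No billing project given and GOOGLE_CLOUD_PROJECT is not set."
        )
    # The project named here is the one charged for the requests.
    return {"requester_pays": True, "project": project}
```

The result would then be passed along as `fsspec.filesystem('gs', **requester_pays_options('some-project'))`, where `some-project` stands in for a project you are allowed to bill.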
