@asmacdo showed interest in participating in the ongoing handbook hackathon, and I thought it would be great to have a use case showcasing dandisets (superdataset at https://github.com/dandi/dandisets, individual ones at https://github.com/dandisets, asyncio code to update those from the archive within tools/ of dandisets) and the https://github.com/datalad/datalad-fuse/ extension. Dandisets are "special" in that their files are typically large, but access to metadata etc. requires reading only a small portion of the data. In datalad-fuse we use https://github.com/fsspec/filesystem_spec/ with local caching to provide efficient sparse access to remote annexed files that have an http* URL associated with them.
In datalad core we had a request for streaming (datalad/datalad#4003), so it might be useful to highlight how streaming could be implemented, either via the fsspec interface within datalad-fuse or directly via its FUSE filesystem.
WDYT, datalad-handbook folks, about such a section? (attn @adswa @mih)
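To make the "sparse access with local caching" idea concrete, here is a minimal stdlib-only sketch. It is not how datalad-fuse or fsspec implement it; `SparseCachedReader`, `fetch_range`, and the block size are hypothetical names and values chosen for illustration. The point is only that reading a small header touches one block, and repeated reads are served from the cache.

```python
from typing import Callable, Dict


class SparseCachedReader:
    """Read byte ranges of a remote file, fetching fixed-size blocks on demand.

    Illustration only: `fetch_range(start, end)` is a hypothetical callable
    that returns bytes [start, end) of the remote file (think of an HTTP
    Range request). Real tools such as fsspec do considerably more.
    """

    def __init__(self, fetch_range: Callable[[int, int], bytes],
                 size: int, block_size: int = 4096) -> None:
        self._fetch_range = fetch_range
        self._size = size
        self._block_size = block_size
        self._blocks: Dict[int, bytes] = {}  # block index -> cached bytes
        self.bytes_fetched = 0               # how much was actually transferred

    def _block(self, index: int) -> bytes:
        # Fetch a block only the first time it is needed.
        if index not in self._blocks:
            start = index * self._block_size
            end = min(start + self._block_size, self._size)
            data = self._fetch_range(start, end)
            self.bytes_fetched += len(data)
            self._blocks[index] = data
        return self._blocks[index]

    def read(self, offset: int, length: int) -> bytes:
        end = min(offset + length, self._size)
        if offset >= end:
            return b""
        first = offset // self._block_size
        last = (end - 1) // self._block_size
        data = b"".join(self._block(i) for i in range(first, last + 1))
        skip = offset - first * self._block_size
        return data[skip:skip + (end - offset)]


# Stand-in for a large remote file: 1 MiB of bytes held in memory.
remote = bytes(range(256)) * 4096
reader = SparseCachedReader(lambda s, e: remote[s:e], size=len(remote))

header = reader.read(0, 16)     # fetches only the first block
again = reader.read(8, 100)     # same block, served from the cache
fetched_after_header = reader.bytes_fetched
```

In this sketch only one block is transferred for both reads, while the "remote" file is 1 MiB. Streaming, as requested in datalad/datalad#4003, would presumably build on the same on-demand range reads.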
One example could be running dandi validate on one of the dandisets, then showing how much was downloaded (du -scm .git/datalad/cache or whatever that path is) compared to the total size of the files that were validated.
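The comparison could also be scripted. The following is a rough sketch, assuming the cache is a plain directory tree (the actual cache path is not confirmed above, so it is passed in as an argument); `tree_size` and `fetched_fraction` are hypothetical helper names. With `allocated=True` it counts allocated blocks, as `du` does, which matters if cached files are sparse.

```python
import os


def tree_size(path: str, allocated: bool = False) -> int:
    """Sum file sizes under `path` in bytes.

    allocated=False: apparent sizes (st_size).
    allocated=True: allocated disk space, assuming st_blocks is counted in
    512-byte units (true on Linux; falls back to st_size where unavailable).
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if allocated and hasattr(st, "st_blocks"):
                total += st.st_blocks * 512
            else:
                total += st.st_size
    return total


def fetched_fraction(cache_dir: str, total_bytes: int,
                     allocated: bool = True) -> float:
    """Fraction of `total_bytes` that ended up in the local cache."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return tree_size(cache_dir, allocated=allocated) / total_bytes
```

Here `total_bytes` would be the summed size of the validated files, obtained separately from the dataset's own metadata.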
Just ping me if you need any info. You should add a new file in docs/usecases and place it somewhere in the docs/usecases/intro.rst toctree. Use cases do not need to have code that is executed and captured, so you can go with .. code-block::s instead of .. runrecords::. Looking forward to it!