anystore on PyPI | Python test and package | pre-commit | Coverage Status | GPL-3.0 License

anystore

Store anything anywhere. anystore provides a high-level storage and retrieval interface for a range of store backends, such as Redis, SQL, file, HTTP, cloud storage and anything else supported by fsspec.

Think of it as a key -> value store: used this way, anystore acts as a cache backend. When keys are filenames and values are byte blobs, anystore acts as a file-like storage backend. In both cases the interface is the same, so backends are interchangeable.
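To illustrate the idea of one interface over different backends, here is a minimal standard-library sketch. It is not anystore's implementation: `MemoryStore`, `DirectoryStore` and `roundtrip` are hypothetical names made up for this example, and only the `put`/`get` naming mirrors the CLI commands shown in the examples.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class Store(Protocol):
    """The shared interface: keys are strings, values are bytes."""

    def put(self, key: str, value: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class MemoryStore:
    """Hypothetical cache-like backend that keeps values in a dict."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]


class DirectoryStore:
    """Hypothetical file-like backend: keys are paths below a base directory."""

    def __init__(self, base: str) -> None:
        self._base = Path(base)

    def put(self, key: str, value: bytes) -> None:
        path = self._base / key
        # Create intermediate directories so nested keys work.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(value)

    def get(self, key: str) -> bytes:
        return (self._base / key).read_bytes()


def roundtrip(store: Store, key: str, value: bytes) -> bytes:
    """Write a value and read it back, without knowing the backend."""
    store.put(key, value)
    return store.get(key)
```

Because `roundtrip` depends only on `put` and `get`, the calling code does not change when the in-memory backend is swapped for the directory backend.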

Why?

Across several data engineering projects we kept writing boilerplate code that covered the feature set of anystore, but never in a reusable way. This library is intended to be a stable foundation for data-wrangling Python projects.

Examples

Basic CLI usage:

anystore -i ./local/foo.txt -o s3://mybucket/other.txt

echo "hello" | anystore -o sftp://user:password@host:/tmp/world.txt

anystore -i https://investigativedata.io > index.html

anystore --store sqlite:///db keys <prefix>

anystore --store redis://localhost put foo "bar"

anystore --store redis://localhost get foo  # -> "bar"

Use in your applications:

from anystore import smart_read, smart_write

data = smart_read("s3://mybucket/data.txt")
smart_write(".local/data", data)

Simple cache example via decorator:

Use case: ftmq-api uses @anycache to cache its API views.

from anystore import get_store, anycache

cache = get_store("redis://localhost")

@anycache(store=cache, key_func=lambda q: f"api/list/{q.make_key()}", ttl=60)
def get_list_view(q: Query) -> Response:
    result = ...  # expensive computation; its result is cached
    return result
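For readers unfamiliar with keyed, expiring caches, the following standard-library sketch shows the general pattern such a decorator follows. It is not the code behind `@anycache`: the `cached` decorator is a hypothetical stand-in, a plain dict replaces the store, and the query is simplified to a string.

```python
import time
from functools import wraps
from typing import Callable


def cached(store: dict, key_func: Callable[..., str], ttl: float) -> Callable:
    """Hypothetical stand-in for a keyed cache decorator with expiry."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Derive the cache key from the call arguments.
            key = key_func(*args, **kwargs)
            now = time.monotonic()
            hit = store.get(key)
            # Entries are (expiry time, value); reuse them until they expire.
            if hit is not None and hit[0] > now:
                return hit[1]
            value = func(*args, **kwargs)
            store[key] = (now + ttl, value)
            return value

        return wrapper

    return decorator


calls = []  # records every real (uncached) invocation
cache = {}  # assumption: a dict stands in for a real store backend


@cached(store=cache, key_func=lambda q: f"api/list/{q}", ttl=60)
def get_list_view(q: str) -> str:
    calls.append(q)
    return q.upper()
```

Calling `get_list_view("a")` twice within the 60-second window runs the function body once; the second call is served from `cache` under the key `api/list/a`.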

Mirror file collections:

from anystore import get_store

source = get_store("https://example.org/documents/archive1")  # directory listing
target = get_store("s3://mybucket/files", backend_config={"client_kwargs": {
    "aws_access_key_id": "my-key",
    "aws_secret_access_key": "***",
    "endpoint_url": "https://s3.local"
}})  # can be configured via ENV as well

for path in source.iterate_keys():
    # streaming copy: read from the source handle, write to the target handle
    with source.open(path) as i:
        with target.open(path, "wb") as o:
            o.write(i.read())
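A single `read()` call loads the whole file into memory. For large files, a chunked copy may be preferable. The sketch below assumes the handles behave like ordinary binary file objects; `stream_copy` is a hypothetical helper, not part of anystore, and `io.BytesIO` stands in for the real source and target handles.

```python
import io
import shutil


def stream_copy(src, dst, chunk_size: int = 64 * 1024) -> None:
    """Copy between binary file-like objects in fixed-size chunks."""
    # copyfileobj reads at most chunk_size bytes at a time until EOF.
    shutil.copyfileobj(src, dst, chunk_size)


# In-memory buffers stand in for handles returned by a store's open().
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
stream_copy(source, target)
```

With this approach, memory use is bounded by `chunk_size` rather than by the size of the file being mirrored.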

Documentation

Find the docs at docs.investigraph.dev/lib/anystore

Used by

Development

This package uses poetry for packaging and dependency management, so install poetry first.

Clone this repository to a local destination.

Within the repo directory, run

poetry install --with dev

This installs a few development dependencies, including pre-commit, which needs to be registered:

poetry run pre-commit install

Before each commit, pre-commit checks code formatting (isort, black) and runs a few other checks (see .pre-commit-config.yaml).

Testing

anystore uses pytest as the testing framework.

make test
