Add base class for caches in _cache.py #82
Comments
Yes, the cache classes need to be refactored.
Joining in. Is it a good idea to enable the caching classes (current, new, or refactored) to have the optional behavior of caching everything to files? Or do they already support that? I spent some time in the docs and source but couldn't locate that option.

In case not, let me explain the rationale: for functions with long computation times, one often wants to reuse their results in new runs, at least in research workflows. Abstracting away the caching of their output to files is a major workflow enabler for continuity in dynamically evolving experiment workflows. Currently I'm using my own caching function, which either computes or loads from a file; its default behavior is best described as "load from file if already available".

However, klepto does a good job of matching function calls to the set of bound argument values, enabling caching per bound set of function arguments. It even allows specifying arguments to ignore in the keying, so that arguments that aren't supposed to affect the function's return value don't generate a different caching key. "Load from file if already available" would just be very enabling, if it isn't already supported. Otherwise I'll just use https://klepto.readthedocs.io/en/latest/#klepto.keygen in my own function, if my workflow isn't as generically useful to other people.

Maybe this is a little like the "purge cache to archive at maxsize" behavior mentioned in the documentation of e.g. inf_cache, but I'm not sure how caches and archives interrelate in klepto's user-facing API.
I am mentioning it here as it may interrelate with refactoring the various classes, beyond the angle of conciseness or simplification of the facade that this thread started with.
@matanster: you can select from any of the archive backends (a file, or files, included), and they can either be coupled to an in-memory cache that can be synchronized with the backend, or you can use the archive as the cache directly.
Thanks, and sorry for the long message before. Currently I think that, for the rest of us having only my amount of intelligence, code examples are the main shortage for getting started with a library like klepto. The docstring of inf_cache is very clear and coherent, and I gather that all the high-level cache objects (inf_cache, lru_cache, etc.) use the class cache, which inherits Python's dict, as a lean proxy to a storage backend (a.k.a. a klepto archiver) if provided one upon initialization. For a non-dummy archiver that persists the data somewhere, we need to pass one when initializing the high-level cache object, for example:

my_cache = inf_cache(cache=file_archive)

That's all; that's how we choose a cache strategy. The cascade providing the function results, from either (1) a cached value in memory, (2) an archive backend such as a file archive or directory archive, or (3) a recomputation, is easily gleaned in the high-level classes' code, for example in Lines 308 to 324 in e8fc4f6.
User code then just calls my_cache(my_fn, ...). If you like, I can update the readme with these two lines of code and a very concise form of the above explanation, to give a usage example there.
Maybe you could rename … As the tests are passing for me (Python 3.10), I can probably make the readme more readily convey usage, towards making klepto slightly more popular as its own package with just a handful of small changes, and take a step on improving the internal structure. If either of the two makes a difference for you, let me know and I'll allocate some time, to the extent that you trust the test suite to provide fair safety for a small refactor, having only a Linux machine to test on. If however pathos is directly calling or referencing the internals underneath the level of the high-level …
Feel free to submit a PR... Unfortunately, I believe most of the examples in this tutorial are pulled from …
There's a lot of similarity between the various cache classes. Is there any reason not to create a base class and derive the various cache classes from it? Some methods that appear to be copy/paste include:
- `archive`
- `key`
- `lookup`
- `__get_cache`
- `__get_mask`
- `__get__`
- `__reduce__`
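As a rough sketch of what the proposed base class could look like: the names below follow the method list above, but the behavior is guessed for illustration, not taken from klepto's actual `_cache.py`. The idea is that the shared `archive`/`lookup` plumbing lives in one place, and each concrete cache only contributes its sizing/eviction policy.

```python
class _BaseCache(dict):
    """Hypothetical shared base for the cache classes; names follow
    the issue's method list, behavior is a guess, not klepto code."""

    def __init__(self, maxsize=None, cache=None):
        super().__init__()
        self.maxsize = maxsize
        # The archive backend; a plain dict stands in for a null archive.
        self.__archive = cache if cache is not None else {}

    def archive(self, *archive):
        """Get, or if an argument is given set, the archive backend."""
        if archive:
            self.__archive = archive[0]
        return self.__archive

    def lookup(self, key):
        """Cascade: try the in-memory entry first, then the archive."""
        if key in self:
            return self[key]
        return self.__archive[key]

class InfCache(_BaseCache):
    """An unbounded cache would only pin down the sizing policy,
    inheriting all the shared plumbing from the base class."""

    def __init__(self, cache=None):
        super().__init__(maxsize=None, cache=cache)
```

Methods that genuinely differ per class (e.g. the eviction logic behind a bounded `maxsize`) would be overridden in each subclass, while the copy/pasted ones listed above collapse into the base.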