dir_archive implementation issues #54

aathan · 2017-06-17T00:04:08Z

For example, if dir_archive is used with flattened keys, the str() encoding of those tuples yields directory names containing parentheses. These do not load correctly if the archive is written non-pickled (i.e., as Python objects), because the import statement that is exec()ed then contains invalid characters. There also seem to be some hacks related to reloading the archive from disk, in particular:

_getkey slices with a hard-coded [2:] instead of being based on the value of PREFIX
the interactions between _getdir, _getkey and _lookup make various assumptions which, I believe, make it hard for _fname to meaningfully turn the text representation of keys into "good" filenames
_lookup in particular does not distinguish between calls where the key parameter comes from a directory name and calls where it really is a key (the sequence is _keydict() --> _getkey() --> _lookup() --> _getdir()), which implies an assumed equivalence of dir and key encodings.
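
The parenthesis problem can be reproduced without klepto. This is only a minimal sketch: the "K_" prefix, the example key, and the imported name `memo` are assumptions made for illustration, not klepto's actual internals. It shows that a directory name built from str() of a tuple is not a valid module name, so an import statement built from it fails to compile.

```python
# Hypothetical values for illustration; not taken from klepto's source.
ASSUMED_PREFIX = "K_"
key = ("f", 1, 2)                      # a flattened key (a tuple)
dirname = ASSUMED_PREFIX + str(key)    # directory name derived from str(key)


def importable(name):
    """Return True if `name` can appear as the module in an import statement."""
    stmt = "from %s import memo" % name   # `memo` is an assumed attribute name
    try:
        # compile() only parses the statement; nothing is actually imported
        compile(stmt, "<archive>", "exec")
    except SyntaxError:
        return False
    return True


print(dirname, "->", importable(dirname))
print("K_f_1_2", "->", importable("K_f_1_2"))
```

Any encoding whose output is a plain identifier would avoid the syntax error, which is why the mapping from key to directory name matters here.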

I've fixed some problems in a branch, e.g., by adding a parameter to _lookup(..., isdir=False), which allows me to implement a filename encoding in _fname that eliminates problematic characters. This yields relatively clear-text directory names and Python object storage that works, i.e., a disk cache that is easily understood by human eyes. This makes the cache useful as a backing store for, for example, function values used to replay behavior in testing frameworks: run the program once with nothing in the dir_archive, then run it again from a full dir_archive to regression test the parts that rely on the functions that got cached.
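
To make the idea concrete, here is one possible reversible encoding. It is a sketch only and is not the code in the branch: `encode_key`, `decode_dirname`, and the "K_" prefix are hypothetical names, and it assumes keys are tuples of Python literals. Every byte that is not an ASCII letter or digit is written as an underscore followed by two hex digits, so the result is always a valid identifier and can be decoded without assuming that dir and key encodings are the same.

```python
import ast
import re

PREFIX = "K_"  # assumed prefix, for illustration only


def encode_key(key):
    """Encode repr(key) as an identifier-safe directory name."""
    out = []
    for byte in repr(key).encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and ch.isalnum():
            out.append(ch)               # ASCII letters and digits stay readable
        else:
            out.append("_%02x" % byte)   # everything else, including "_", is escaped
    return PREFIX + "".join(out)


def decode_dirname(dirname):
    """Invert encode_key; the prefix is stripped by its length, not a fixed [2:]."""
    body = dirname[len(PREFIX):]
    raw = re.sub(r"_([0-9a-f]{2})", lambda m: chr(int(m.group(1), 16)), body)
    # the escapes were UTF-8 bytes, so map the characters back to bytes first
    text = raw.encode("latin-1").decode("utf-8")
    return ast.literal_eval(text)        # safe for literals only


name = encode_key(("f", 1, 2))
print(name, name.isidentifier(), decode_dirname(name))
```

This particular scheme favors reversibility over readability. An encoding closer to clear text would need its own rule for telling separators apart from characters that belong to the key.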

I can submit a pull request, but see 0 pull requests here, so I'm wondering if you're accepting community input here.

... I'm also wondering why _hasinput() doesn't use os.path.isfile().
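
For context on that question, here is a small sketch of how the two standard-library checks differ. It is independent of klepto, the entry name "input" is made up, and what _hasinput() currently calls is not shown in this thread. os.path.exists() is true for any existing path, while os.path.isfile() is true only for regular files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # hypothetical entry name; a directory sits where a file might be expected
    entry = os.path.join(root, "input")
    os.mkdir(entry)

    exists_result = os.path.exists(entry)   # any existing path counts
    isfile_result = os.path.isfile(entry)   # only regular files count

print(exists_result, isfile_result)
```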

mmckerns · 2017-07-05T22:52:14Z

@aathan: I absolutely do welcome PRs. I just have not had any on klepto as of yet. Please feel free to be the first. I tend to like to break big PRs up into smaller multiple PRs, with one idea per PR... that way they are easier to review and understand the impact of. Anyway, it sounds like you've made some good potential changes. I admittedly have some hacks in klepto, and some things that I am unsatisfied with. I feel the package is a good start, but needs some TLC to fix some of the little issues, such as those you mention.

mmckerns added the refactor label Jul 5, 2017
