Use tree-seq to make iterate-dir lazy #105

Open · jvoegele wants to merge 1 commit into master

@jvoegele jvoegele commented Dec 24, 2016

The previous implementation of iterate-dir was eager: it traversed the entire directory hierarchy and loaded it into memory before returning. On very large directory trees this could be very slow and could also produce a java.lang.OutOfMemoryError (see issue #38).
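
To illustrate the problem, here is a minimal sketch of an eager walk. It is not the code this PR replaces; `eager-walk` is a hypothetical name, and the only point is that the whole result vector has to exist before the function returns.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical eager walk (NOT the original iterate-dir code).
;; Every directory under `path` is visited and accumulated in `seen`
;; before anything is handed back to the caller.
(defn eager-walk
  [path]
  (loop [pending [(io/file path)]   ; directories still to visit
         seen    []]                ; directories visited so far
    (if-let [d (peek pending)]
      (recur (into (pop pending)
                   ;; .listFiles returns nil for unreadable entries;
                   ;; filter treats nil as an empty collection.
                   (filter #(.isDirectory %) (.listFiles d)))
             (conj seen d))
      seen)))
```

Both the running time and the memory used by a call like `(eager-walk "/some/big/tree")` grow with the size of the tree, even if the caller only wants the first few entries.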

With tree-seq traversing the directory tree lazily, iterate-dir now returns a lazy sequence immediately, even on very large directory trees, because it no longer has to traverse and load the tree first. This also means that iterate-dir will not itself produce an OutOfMemoryError on large directory trees. Unfortunately, Clojure's lazy sequences still seem to be susceptible to OutOfMemoryErrors: even with the lazy tree-seq approach, an OutOfMemoryError can be produced when processing very large directory trees (e.g. with dorun or doseq, or even count):

OutOfMemoryError GC overhead limit exceeded

Nevertheless, this is still an improvement over the previous implementation: the OutOfMemoryError is no longer produced immediately upon calling iterate-dir, but only after a very large portion of the results has been processed. In any case, this seems to be a limitation of Clojure itself.
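
For readers who want to see the shape of the approach, here is a rough sketch of a tree-seq based walk. It is not the exact diff in this PR: `lazy-iterate-dir` is a hypothetical name, and the `[dir subdir-names file-names]` result shape is an assumption made for the example. Symlink cycles are not handled.

```clojure
(require '[clojure.java.io :as io])
(import 'java.io.File)

(defn- dir? [^File f] (.isDirectory f))

;; Hypothetical sketch, not the PR's actual code.
;; Returns a lazy seq of [dir subdir-names file-names], one entry per
;; directory, in depth-first order. Nothing is read from disk until
;; the sequence is consumed.
(defn lazy-iterate-dir
  [path]
  (let [children (fn [^File d] (seq (.listFiles d)))] ; nil if unreadable
    (->> (tree-seq dir? #(filter dir? (children %)) (io/file path))
         (map (fn [^File d]
                (let [kids (children d)]
                  [d
                   (set (map #(.getName ^File %) (filter dir? kids)))
                   (set (map #(.getName ^File %) (remove dir? kids)))]))))))
```

A caller can then stream over the result, for example `(doseq [[dir _ files] (lazy-iterate-dir "/some/big/tree")] (println dir (count files)))`, without binding the whole sequence to a name first. Whether that avoids the OutOfMemoryError reported above is not established here.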

@jvoegele
Author

@Raynes Any thoughts on this?
