-
Notifications
You must be signed in to change notification settings - Fork 365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable listings cache for HTTP filesystem #560
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding! Just a small question.
|
||
|
||
def test_list_cache_with_expiry_time(server): | ||
h = fsspec.filesystem("http", use_listings_cache=True, listings_expiry_time=30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could test these three parametrised cases to actually trigger the condition: have a short listing expiry time and sleep past it; list more than 5 paths; make a new instance with/without the instance cache (one should already have its dircache populated, the other not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Martin,
thank you very much for your suggestions, I tried to implement some ideas from them as good as possible. However, maybe I am still missing some details. If you see the tests should be improved once more, I am humbly asking for your guidance and to elaborate a bit more about the details I might have missed.
have a short listing expiry time and sleep past it
I just added b1f4cad in order to bring in your idea of verifying the actual contents of the directory cache beyond the expiration time.
make a new instance with/without the instance cache (one should already have its dircache populated, the other not)
Do you mean something along the lines of ca0a61e?
list more than 5 paths
I am not sure I am following here. The test is currently using the some response mock as the other tests are also using. Is there some mock which would yield more than 5 paths? Would it actually make any difference?
With kind regards,
Andreas.
(this PR should be easy to finish off, if you have a little time) |
|
61c14eb
to
2cd6946
Compare
2cd6946
to
ca0a61e
Compare
ca0a61e
to
5c7377e
Compare
Hi Martin,
Thank you. Finally, I have been able to catch up on this. Please let me know about further suggestions for this patch. On a side: On my workstation, I observed different hiccups where the test suite would stall completely when invoking With kind regards, [1] https://github.com/intake/filesystem_spec/actions/runs/780684958 |
You seem to have managed to hit an async deadlock in this code. I don't see why, but this may be good news - since deadlocks have occasionally occurred elsewhere, and this will give us the ability to investigate. |
I cannot push to your branch, but earthobservations#1 appears to fix the problem. |
Dear Martin,
we recognized your activity on the code base again and feared we would have missed some merge window already. So, we tried to hurry to bring in this patch I had slumbering on my workstation for some time already.
The patch will enable listings caching for the HTTP filesystem implementation. It is accompanied by appropriate tests and I will be happy if you could take the time to look into it and consider it for the next release if you like it. Please let us know about any amendments you would like to see.
With kind regards,
Andreas.
P.S.: After @gutzbenj submitted #507 the other day, this patch would be the last thing we would need for finally getting rid of the dbmfile-based cache we still use within Wetterdienst, see earthobservations/wetterdienst#243. Because of this, we haven't migrated the code base to
filesystem_spec
yet, but would be very keen doing so.On the CI side, as the old cache implementation is apparently not thread-safe (earthobservations/wetterdienst#254 (comment)), this would hopefully enable us to run all tests in parallel using pytest-xdist, see earthobservations/wetterdienst#253. At the same time, it would be a tremendous improvement for our users and also for ourselves, as we would be able to remove some code from the code base and stand on the shoulders of
filesystem_spec
.