index_waveforms: should have option to skip certain files by regex or fnmatch #89

Open
megies opened this issue Aug 23, 2018 · 0 comments

megies commented Aug 23, 2018

Some digitizers store loads of state-of-health (SOH) data alongside the waveforms, and in data archives these files tend to end up in the same directory trees as the waveforms. But (at least so far) we have never been interested in querying SOH data from jane (e.g. via FDSNWS), so in practice it just bloats the DB and slows down the processes we are actually interested in.

I do not see any option to skip certain files when scanning a directory tree, but there should definitely be one, whether it is based on regex or on fnmatch patterns.
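
To make the idea concrete, here is a minimal sketch of what such a skip filter could look like, assuming the indexer walks the tree with `os.walk`. This is not jane's actual indexer code; the names `should_skip` and `iter_waveform_files` are hypothetical and only illustrate supporting both fnmatch and regex patterns.

```python
import fnmatch
import os
import re


def should_skip(path, fnmatch_patterns=(), regex_patterns=()):
    """Return True if path matches any ignore pattern (hypothetical helper)."""
    name = os.path.basename(path)
    # fnmatch-style patterns are checked against the bare file name
    if any(fnmatch.fnmatch(name, pat) for pat in fnmatch_patterns):
        return True
    # regular expressions are searched anywhere in the full path
    return any(re.search(pat, path) for pat in regex_patterns)


def iter_waveform_files(root, fnmatch_patterns=(), regex_patterns=()):
    """Yield every file path below root that is not skipped."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not should_skip(path, fnmatch_patterns, regex_patterns):
                yield path
```

Matching fnmatch patterns against the file name and regexes against the full path is just one possible design; it would allow skipping both by extension and by directory name.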

Ideally, I think this should be handled via a new DB table that can be modified in the admin panel (like the waveform SEED ID mappings). Another option would be to add CLI options to manage.py index_waveforms. That would be more flexible, as the patterns could be set per indexer process, but also more verbose, as the ignore patterns would have to be added to every shell script that spawns an indexer.
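
For the CLI variant, a rough sketch of repeatable ignore options using plain `argparse`. The flags `--ignore-fnmatch` and `--ignore-regex` are hypothetical and do not exist in jane today; in a Django management command the same `add_argument()` calls would live in the command's `add_arguments(self, parser)` method.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical ignore options for index_waveforms."""
    parser = argparse.ArgumentParser(prog="index_waveforms")
    # action="append" lets each option be given several times
    parser.add_argument(
        "--ignore-fnmatch", action="append", default=[], metavar="PATTERN",
        help="skip files whose name matches this fnmatch pattern")
    parser.add_argument(
        "--ignore-regex", action="append", default=[], metavar="REGEX",
        help="skip files whose path matches this regular expression")
    return parser
```

The collected lists would then be handed to whatever code decides which files get indexed.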
