index_waveforms: should have option to skip certain files by regex or fnmatch #89

megies · 2018-08-23T11:13:46Z

Some digitizers store large amounts of state-of-health (SOH) data alongside the waveforms, and in data archives these files tend to end up in the same directory trees as the waveforms. So far, however, we have never been interested in querying SOH data from jane (e.g. via FDSNWS), so in practice they just bloat the DB and slow down the processes we actually care about.

I do not see any option to skip certain files when scanning a directory tree, but there should definitely be some way to do so.

Ideally, I think this should be handled via a new DB table that can be modified in the admin panel (like the waveform SEED ID mappings). Another option would be to add CLI options to manage.py index_waveforms. That would be more flexible, since the patterns could differ per indexer process, but also more verbose, since the ignore patterns would have to be added to every shell script that spawns an indexer.
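To make the request more concrete, here is a minimal sketch of the filtering step itself, independent of where the patterns end up being stored. The names `should_skip` and `iter_indexable_files` are hypothetical and not part of jane, and the example file names only assume an SDS-like naming scheme.

```python
import fnmatch
import os
import re


def should_skip(path, fnmatch_patterns=(), regex_patterns=()):
    """Return True if `path` matches any of the ignore patterns.

    Hypothetical helper, not part of jane. fnmatch patterns are tested
    against the file name only, regex patterns against the full path.
    """
    name = os.path.basename(path)
    # fnmatchcase() is case-sensitive on every platform
    if any(fnmatch.fnmatchcase(name, pat) for pat in fnmatch_patterns):
        return True
    return any(re.search(pat, path) for pat in regex_patterns)


def iter_indexable_files(root, fnmatch_patterns=(), regex_patterns=()):
    """Walk `root` and yield only the files that are not ignored."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            if not should_skip(path, fnmatch_patterns, regex_patterns):
                yield path
```

The pattern lists could be filled from either source discussed above (rows of a DB table or repeated CLI options); the check itself stays the same.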

megies mentioned this issue Aug 23, 2018

Bug in processing traces with sampling_rate 0 (e.g. log files) #90

Open
