Inconsistency in handling data read errors in _read_data_from_dfile #549
Comments
Good catch. How did you detect that? I see you checked in revised notebooks but is there a fix to database?
… On Jun 17, 2024, at 11:34 PM, Ian Wang ***@***.***> wrote:
Read the code below first.
https://github.com/mspass-team/mspass/blob/bf7e2f05609888b56a238095ff2aa3527e304694/python/mspasspy/db/database.py#L4288-L4324
Basically, the datum is killed when the endtimes don't match, and the elog message says that it was killed. However, it is later set alive again because npts is greater than 0, which is almost always true, since otherwise the endtime could not have been calculated. In the database, the corresponding elog entry looks like this:
{
  "_id": {
    "$oid": "6670faebf7b0b388f4b5bcd7"
  },
  "logdata": [
    {
      "job_id": {
        "$numberInt": "0"
      },
      "algorithm": "Database._read_data_from_dfile: ",
      "badness": "ErrorSeverity.Invalid",
      "error_message": "Inconsistent endtimes detected\nEndtime expected from MongoDB document = 2011-04-07T15:17:43.975000Z\nEndtime set by obspy reader = 2011-04-07T15:17:43.975000Z\nEndtime is derived in mspass and should have been repaired - cannot recover this datum so it was killed",
      "process_id": {
        "$numberInt": "3266"
      }
    }
  ],
  "data_tag": "serial_preprocessed",
  "wf_TimeSeries_id": {
    "$oid": "6670f5c8131a86997a8e1d1e"
  }
}
Note that this record is from the earthscope2024 notebook here <https://github.com/mspass-team/mspass_tutorial/blob/master/Earthscope2024/Session1.ipynb>.
I think all we probably need is to change the wording of the elog message to reflect that the error is potentially harmless. Alternatively, we could refine the check so that mismatches within a certain threshold are accepted (which is the case above, where the two endtimes print identically to the microsecond).
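To make the threshold idea concrete, here is a minimal sketch of a tolerance-based comparison. It is not the code in database.py; the function name `endtimes_consistent`, the `tolerance_fraction` parameter, and the sample values are all hypothetical and chosen only for illustration. It assumes both endtimes are epoch times in seconds and that the sample interval `dt` is known.

```python
def endtimes_consistent(expected_endtime, reader_endtime, dt, tolerance_fraction=0.5):
    """Return True when two endtimes (epoch seconds) agree to within
    tolerance_fraction of one sample interval dt (seconds).

    Hypothetical helper for illustration; not part of mspass.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return abs(expected_endtime - reader_endtime) <= tolerance_fraction * dt


# Arbitrary illustrative values (assumptions, not taken from the dataset):
# two endtimes that differ only by floating-point roundoff.
t_doc = 1302189463.975      # endtime computed from the MongoDB document
t_reader = t_doc + 2.0e-7   # endtime reported by the reader
dt = 0.025                  # assumed sample interval (40 samples per second)

print(endtimes_consistent(t_doc, t_reader, dt))    # roundoff-level mismatch
print(endtimes_consistent(t_doc, t_doc + dt, dt))  # off by a full sample
```

Scaling the tolerance by the sample interval is one possible design choice: a roundoff-level difference passes, while a difference of a whole sample still counts as a real mismatch.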
I found it by looking at the records in the elog collection in the database. This mismatch error actually occurred quite frequently in that small dataset. Anyway, I don't think this needs immediate attention, but we do need to fix it. I am just opening the issue here as a bookmark.
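For anyone who wants to check how often this occurs in their own database, the following is a rough sketch of one way to count such records. It is not necessarily how the records were found here. The field names (`logdata`, `error_message`) come from the elog record quoted in this issue; the function name, the filter constant, and the sample documents are hypothetical. The function works on plain dictionaries so it runs without a MongoDB server; the filter dictionary shows the kind of query one might pass to a `find` call on the elog collection.

```python
import re

# Mongo-style filter (an assumption about how one might query the elog
# collection); dot notation matches any element of the logdata array.
ENDTIME_MISMATCH_FILTER = {
    "logdata.error_message": {"$regex": "^Inconsistent endtimes detected"}
}


def count_endtime_mismatches(elog_docs):
    """Count elog documents that have at least one logdata entry whose
    error_message starts with the endtime-mismatch text."""
    pattern = re.compile(r"^Inconsistent endtimes detected")
    count = 0
    for doc in elog_docs:
        entries = doc.get("logdata", [])
        if any(pattern.search(e.get("error_message", "")) for e in entries):
            count += 1
    return count


# Hypothetical sample documents shaped like the record in this issue.
sample_docs = [
    {"logdata": [{"error_message": "Inconsistent endtimes detected\n..."}]},
    {"logdata": [{"error_message": "Some other error"}]},
    {"data_tag": "serial_preprocessed"},  # document with no logdata at all
]

print(count_endtime_mismatches(sample_docs))
```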
I think on the test data set it is related to an error thrown but handled on one of the files. This problem is fundamentally created by the fact we are using obspy's reader to crack miniseed files when running the |