Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 poll file_time_limit doesn't work properly #1180

Open
andreleblanc11 opened this issue Aug 22, 2024 · 4 comments
Open

v2 poll file_time_limit doesn't work properly #1180

andreleblanc11 opened this issue Aug 22, 2024 · 4 comments
Assignees
Labels
bug Something isn't working likely-fixed likely fix is in the repository, success not confirmed yet. Priority 5 - Defect Missing and/or broken functionality - do not forget v2only only affects v2 branches.

Comments

@andreleblanc11
Copy link
Member

andreleblanc11 commented Aug 22, 2024

We've found a problem when we're starting/stopping polls between sarracenia versions (2.21.04 and 2.24.08). The ls file in the poll cache doesn't recognize the new entries, before it gets the VIP back, so it will repost all the old files in the ls entry -> #1184.

To mitigate this problem, we've found that we can specify a 'file age limit' within the poll, with file_time_limit.

sarracenia/sarra/sr_poll.py

Lines 177 to 189 in bd095d8

try:
line_split = ls[f].split()
date = line_split[5] + " " + line_split[6]
file_date = parse(date)
current_date = datetime.datetime.now()
file_within_date_limit = abs((file_date - current_date).seconds) < self.file_time_limit
except:
self.logger.error("Assuming ok, incorrect date format for line: %s" % ls[f])
if not dateparser_available:
self.logger.error("need to install dateparser module, perhspa via: pip install 'sarracenia[ftppoll]'" )
if file_within_date_limit:
self.logger.debug("File should be processed")

When testing this config option on dev, I found that the .seconds method ONLY returns the seconds from the CURRENT DAY and not the total number of seconds. This means that

  • If you have file_time_limit Xd (default is 60d) it should always return True and never filter out old files.
  • If you have file_time_limit Xh or Xm or Xs the files will be ignored for some parts of the day, but then unignored once the timestamp is reached again on the following day.
    • We currently don't have any configuration specifying the file_time_limit so this is the reason why it has never caused problems until now
>>> current_date
datetime.datetime(2024, 8, 22, 14, 51, 47, 640140)
>>> file_date
datetime.datetime(2024, 8, 15, 12, 40)
>>> abs((file_date - current_date)).days
7                                                                    <- The days field is separated
>>> abs((file_date - current_date)).seconds
7907                                                                 <- The seconds field is separated

The fix for this should be to pass .total_seconds() instead of .seconds to get the actual total number of seconds.

>>> abs((file_date - current_date)).days
7
>>> abs((file_date - current_date)).seconds
7907
>>> abs((file_date - current_date)).total_seconds()
612707.64014
@andreleblanc11 andreleblanc11 added bug Something isn't working Priority 5 - Defect Missing and/or broken functionality - do not forget v2only only affects v2 branches. labels Aug 22, 2024
@andreleblanc11 andreleblanc11 self-assigned this Aug 22, 2024
@andreleblanc11
Copy link
Member Author

There are currently no operational v2 polls that have file_time_limit specified in a configuration, so there should be no impact.

@andreleblanc11
Copy link
Member Author

andreleblanc11 commented Aug 22, 2024

Another note.. the logic should be current_date - file_date and we should remove the abs(). There's no reason why the file_date should be older then the current_date. I guess it doesn't hurt to keep the abs() though.

@reidsunderland
Copy link
Member

reidsunderland commented Aug 22, 2024

There might be weird timezone cases where the file_date is newer than the current_date. In that case I think we'd want to accept the file and the abs still isn't needed, because a negative number is < any file_time_limit

>>> file_time_limit = 600
>>> later = datetime.datetime.now()
>>> later
datetime.datetime(2024, 8, 22, 15, 46, 41, 868412)
>>> earlier
datetime.datetime(2024, 8, 22, 15, 45, 11, 608764)
>>> (earlier - later).total_seconds()
-90.259648
>>> (earlier - later).total_seconds() < file_time_limit
True
>>>

Now I wonder if we could have a fileAgeMin in v2 by setting a negative file_age_limit? (Once the bug is fixed) :D

>>> current_time = later
>>> file_time = earlier
>>> file_time_limit = -60 # files that are less than 60 seconds old are too new, ignore
>>> (current_time - file_time).total_seconds() < file_time_limit
False
>>>

@petersilva
Copy link
Contributor

I'm running tests on a PR.

@petersilva petersilva added the likely-fixed likely fix is in the repository, success not confirmed yet. label Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working likely-fixed likely fix is in the repository, success not confirmed yet. Priority 5 - Defect Missing and/or broken functionality - do not forget v2only only affects v2 branches.
Projects
None yet
Development

No branches or pull requests

3 participants