location_report_builder getting stuck on get_filedir_count() #417
Comments
Doing a bit more digging, it appears that the generate_status_report task does eventually complete, but this section takes about an hour to complete on our filesystem.
So in this case it's being slow rather than dying, but it's still puzzling why Moodle doesn't seem to think that this task is still running, and continues to run additional instances along with other tasks.
@marxjohnson did you ever get to the bottom of this? It sounds more like a lock factory problem than an objectfs issue?
@brendanheywood I didn't get to the bottom of it before I left the OU, and I haven't looked into it since. @sammarshallou Are you still having problems with this?
@marxjohnson @brendanheywood We still have this task disabled on the live server; I guess nobody has needed the status report... I made it run on acct yesterday - it appeared normally in the 'running tasks' page, and although I had to go home before it finished, it did complete after just over 2 hours, and when I look now it's not still showing on the running tasks page or anything like that. The log from task logs is like this:
Execute scheduled task: Object status report generator task (tool_objectfs\task\generate_status_report)
One strange thing is that we have custom cron logs for each cron runner, which I still use because you can just reload them to monitor progress during a run, not after. That log should have a duplicate of this, but it is cut off after the start:
Execute scheduled task: Object status report generator task (tool_objectfs\task\generate_status_report)
I can't really understand why this log file would get cut off given that the process obviously didn't crash, but anyway, it's presumably something to do with our infrastructure and not indicating any problem with the task.
So in summary, it would be nice if the task didn't take 2 hours to run, obviously, but other than that it looks OK.
OK, sounds like there are a few things here and this issue should be split up. The issue with generate_status_report due to the SQL already has a few issues elsewhere, like #596. get_filedir_count should be small, certainly not millions of files. This can depend on what the settings are, for example if a large threshold is set for the size of files to be moved to object storage. Is this set high?
The size is the default, 10240. I checked a couple of random filedir directories (/xx/yy) and they both had approx. 20 files in them, almost all of which were smaller than that size, so I think it's working. So 20 × the 64K directories would give about 1.3 million files in total in filedir. So it's not 'millions', but it is a million. When I mentioned this task in a standup, the developer who knows about infrastructure said it was also expensive to run frequently due to AWS storage costs or something (I'm not sure if he's right; he might be thinking of a different task, this was just a quick chat). Anyway, we are cool with leaving it disabled and running it manually only if required, so it's really OK for us that it takes 2 hours.
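For anyone wanting to sanity-check the estimate above, here is a quick sketch in shell arithmetic. It assumes the two directory levels (/xx/yy) each use two hex digits, which gives 256 × 256 = 64K leaf directories; the figure of 20 files per directory is only the approximate number from the spot check, not a measured average.

```shell
# Rough estimate of the total number of files in filedir.
# Assumption: /xx/yy are two hex-digit levels, so 256 * 256 leaf directories.
dirs=$((256 * 256))
# Approximate figure from checking a couple of random directories.
files_per_dir=20
total=$((dirs * files_per_dir))
echo "$dirs directories, about $total files"
```

That comes out at roughly 1.3 million files, which is the number of paths the count has to walk.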
We have noticed that our cron is running lots of instances of the generate_status_report scheduled task, which never seem to complete.
Doing some digging, I have found that location_report_builder reaches the stage where it runs $filesystem->get_filedir_count(), which runs the following shell command:
find /srv/learn2syst.open.ac.uk/www/moodledata/filedir -type f | grep -c /
For some reason, the task hangs at this point and never completes. There is no error output. Our container running the cron script remains active, so it obviously doesn't think the cron script is complete.
Stranger still, Moodle does seem to think the scheduled task is complete: it continues to run additional scheduled tasks, including further instances of generate_status_report. Watching runningtasks.php shows the task run for about 3-4 minutes, then disappear.
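To see what the quoted find/grep pipeline actually counts, it can be tried on a small scratch directory first. This is only a sketch: the count_files helper and the scratch layout are made up for illustration and are not part of objectfs.

```shell
#!/bin/sh
# Count regular files under a directory the same way as the quoted command:
# find prints one path per file, and grep -c / counts the lines containing a slash.
count_files() {
    find "$1" -type f | grep -c /
}

# Hypothetical scratch directory loosely mimicking the filedir/xx/yy layout.
scratch=$(mktemp -d)
mkdir -p "$scratch/aa/bb" "$scratch/aa/cc"
touch "$scratch/aa/bb/file1" "$scratch/aa/bb/file2" "$scratch/aa/cc/file3"

count=$(count_files "$scratch")
echo "files: $count"

# Clean up the scratch directory.
rm -rf "$scratch"
```

Because find has to visit every directory entry, the run time grows with the number of files, so running the same pipeline by hand against the real filedir (for example prefixed with time) should show whether the walk itself is what takes so long.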