Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for interactive problems #261

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

Conversation

Holmes98
Copy link
Member

@Holmes98 Holmes98 commented Jan 28, 2024

Depends on #260. Based on the Communication task type from CMS.

This adds an interactive_processes field to the Evaluator class.

If interactive_processes=0 (default), functionality is unchanged (evaluator runs after the user/submission program).

If interactive_processes>0, we concurrently spawn that many user processes, each connected using pipes to the manager process. After all processes exit, execution stats are merged (time/memory is added, wall time is maximised, first failing status is used).

  • Unlike normal evaluators, interactive evaluators (which I'm referring to as the "manager" to match CMS's terminology) must always terminate gracefully, otherwise it is judged as an evaluator error (see here for relevant discussion).
    • Need to handle SIGPIPE when writing to the user pipes, in case they crashed/exited early, etc.
    • It is possible for the processes to deadlock if both attempt to read input at the same time (e.g. on forgetting to flush output, or just user error, example); in this case the user process will be terminated first by isolate on reaching the wall time limit, then the manager should read EOF.
  • There is currently some extra handling to make MLE/TLE report correctly after merging stats; it feels a bit hacky but it seemed like the easiest approach (happy to take suggestions though).
  • If interactive_processes>1, the process number is sent as a command line argument to each user program, but it can also be sent by the manager if preferred.
  • I'm using file descriptors 5+ for the user pipes, fd 4 for "expected output", and stdin for the case input. fd 3 is currently unused; it can be considered "reserved" in case we want to add more functionality later (e.g. for "actual output" that can be shown to the user for sample cases, or for staff to debug). I'm also considering swapping file descriptors stdin/3 for non-interactive evaluators (currently 3 => case input, stdin => actual output) so they match, but have left it for now (will require manually updating existing evaluators).
  • interactive_processes is currently limited to 2 to avoid exhausting server resources.

Notes on resource limits:

  • According to CMS, the manager's wall time limit should be greater than the sum of wall time limits of all user processes, and the manager's time limit should be greater than the sum of time limits of all user processes.
  • Each user process is allowed to execute for the total time limit, as the constraint on total time can only be enforced after all processes have terminated.
  • The wall time limit for user processes is lowered to match CMS. This reduces the time taken to judge submissions in the deadlock scenario mentioned above.

@Holmes98
Copy link
Member Author

I've marked this as a draft but am looking for feedback if anyone is able to review.

@coveralls
Copy link

coveralls commented Jan 28, 2024

Coverage Status

coverage: 38.75% (+1.2%) from 37.523%
when pulling 5894a2b on interactive-problems
into 51571c5 on master.

@Holmes98 Holmes98 force-pushed the interactive-problems branch 2 times, most recently from 49f169f to 3fdecfc Compare January 28, 2024 05:48
Base automatically changed from compiled-evaluators to master May 19, 2024 06:44
@BelgianSalamander
Copy link
Member

I have tested this on IOI 2018 Combo and IOI 2020 Stations. No glaring issues so far. I will try to review soon.

Copy link
Member

@BelgianSalamander BelgianSalamander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems mostly correct overall.

I like the idea of fd 3 being used for "actual output" as for some problems it definitely makes sense to show some extra feedback to users on sample cases (as currently the only feedback is the testcase outcome).

While testing this with the IOI problems, I found it difficult to debug evaluators and submissions as it can usually be unclear if the problem lies in the evaluator or the submission. I feel like it could be useful to have some way of showing the stderr of the user processes to staff to make it easier when making problems as otherwise there is no way to get any info directly from the user processes.

Comment on lines +363 to +364
r['meta']['cg-mem'] = [r['meta']['cg-mem'],memory_limit*1024].min if r['meta'].has_key?('cg-mem')
r['meta']['max-rss'] = [r['meta']['max-rss'],memory_limit*1024].min
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small issue: The stored memory usage should probably get capped like this regardless of whether the previous status is OK. (r['meta']['status'] == 'OK' on line 358).

Otherwise, in some cases the memory usage displayed on a submission's page may be much higher than the memory limit, since this code won't be run if one of the program instances causes some other error (RE, TLE), on top of exceeding the combined memory limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants