-
Notifications
You must be signed in to change notification settings - Fork 332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Greedy String Tiling out of bounds #2179
base: develop
Are you sure you want to change the base?
Conversation
… racing conditions
|
* @param options determines the parameterization. | ||
* @deprecated in favor of static {@link #run(JPlagOptions)}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did this move down due to spotless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I'm pretty sure I didn't touch the Javadoc. But I think this is the canonical order.
* @return the results of the comparison, specifically the submissions whose similarity exceeds a set threshold. | ||
* @throws ExitException if JPlag exits preemptively. | ||
* @deprecated in favor of static {@link #run(JPlagOptions)}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
return this.tokenValueMap.get(submission); | ||
} | ||
|
||
public static TokenValueMapper generateTokenValueMapper(SubmissionSet submissionSet) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a special reason to have this generator method and not use the constrcutor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of. Usually constructors should not contain application logic, at least the way I learned it. So having a constructor create the entire map automatically feels like it's bad code.
Moved the token value mapping outside greedy string tiling to prevent racing conditions
The out of bounds exception that occasionally occurs is presumably caused by calculation the token values in multiple threads and using non-thread-safe methods. This PR solves that problem by calculating the lists before starting greedy string tiling.
I took some measurements and the total time for JPlag seems to be unaffected
On the progpedia data I got The following measurement:
old: 2.053 s ± 0.014 s
new: 2.080 s ± 0.016 s
Using 100 copies of JPlag core code as input:
old: 13.006 s ± 0.201 s
new: 12.997 s ± 0.150 s