Improve performance of loading used names from persisted Analysis file #995
And as a side-effect of working in the area: make it more efficient.
Thanks!
This is easier to reason about for Hydra which runs this phase in parallel. Now that we're not creating a full Relation we have some performance budget to spend.
@@ -626,7 +626,7 @@ private final class AnalysisCallback(
   private[this] val objectApis = new TrieMap[String, ApiInfo]
   private[this] val classPublicNameHashes = new TrieMap[String, Array[NameHash]]
   private[this] val objectPublicNameHashes = new TrieMap[String, Array[NameHash]]
-  private[this] val usedNames = new RelationBuilder[String, UsedName]
+  private[this] val usedNames = new TrieMap[String, ConcurrentSet[UsedName]]
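To illustrate why a `TrieMap` of concurrent sets is easy to reason about when the phase runs in parallel, here is a minimal sketch. It is not the code from this PR: `UsedNamesSketch`, `recordUsedName`, and `usesName` are hypothetical names, `UsedName` is reduced to a one-field stand-in, and `ConcurrentSet` is assumed to be an alias for a concurrent `java.util.Set`.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.concurrent.TrieMap

object UsedNamesSketch {
  // Hypothetical stand-in for zinc's UsedName; only the name is modelled.
  final case class UsedName(name: String)

  // Assumption: ConcurrentSet is an alias for a thread-safe java.util.Set.
  type ConcurrentSet[A] = java.util.Set[A]

  private val usedNames = new TrieMap[String, ConcurrentSet[UsedName]]

  // May be called from several threads at once. TrieMap overrides
  // getOrElseUpdate so that every caller ends up with the one stored set,
  // and the ConcurrentHashMap key set supports concurrent adds.
  def recordUsedName(className: String, name: UsedName): Unit = {
    val names =
      usedNames.getOrElseUpdate(className, ConcurrentHashMap.newKeySet[UsedName]())
    names.add(name)
    ()
  }

  // Membership check that does not build any intermediate relation.
  def usesName(className: String, name: UsedName): Boolean =
    usedNames.get(className).exists(_.contains(name))
}
```

Each class gets its own set, so writers for different classes never contend on a shared structure, and no full `Relation` has to be assembled before lookups can happen.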
Running this benchmark prior:

package sbt.internal.inc

import org.openjdk.jmh.annotations.{
  Benchmark,
  BenchmarkMode,
  Fork,
  Measurement,
  Mode,
  Param,
  Scope,
  Setup,
  State,
  Warmup
}
import xsbti.UseScope
import java.io.File
import java.util.concurrent.TimeUnit

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.AverageTime))
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(value = 3)
class AnalysisSerializationBenchmark {
  @Param(Array("/Users/jz/code/scala/target/compiler/zinc/inc_compile.zip"))
  var analysisFile: String = _
  var firstClassName: String = _

  @Setup
  def setup(): Unit = {
    val store = FileAnalysisStore.binary(new File(analysisFile))
    val analysis = store.get().get().getAnalysis.asInstanceOf[Analysis]
    firstClassName = analysis.relations.classes._2s.head
  }

  @Benchmark def deserialize() = {
    val store = FileAnalysisStore.binary(new File(analysisFile))
    val analysis = store.get().get().getAnalysis.asInstanceOf[Analysis]
    val usedNames = analysis.relations.names
    val mod = ModifiedNames(
      Set(
        UsedName.make(
          "name_that_does_not_exist_QWERTY",
          java.util.EnumSet.allOf[UseScope](classOf[UseScope])
        )
      )
    )
    usedNames.forward(firstClassName).iterator.exists(mod.isModified)
  }
}

And the slightly modified version post:

  @Benchmark def deserialize() = {
    val store = FileAnalysisStore.binary(new File(analysisFile))
    val analysis = store.get().get().getAnalysis.asInstanceOf[Analysis]
    val usedNames = analysis.relations.names
    usedNames.hasAffectedNames(
      ModifiedNames(
        Set(
          UsedName.make(
            "name_that_does_not_exist_QWERTY",
            java.util.EnumSet.allOf[UseScope](classOf[UseScope])
          )
        )
      ),
      firstClassName
    )
  }

Shows: [benchmark results: Baseline | Post | Post "only intern used names locally ..."]
Test failure, I think unrelated:
I did a little more analysis of variations of used-names interning to convince myself that the new, faster version is "good enough" for footprint reduction.
private class StringTable {
  private val strings = new JHashMap[String, String]()
  def lookupOrEnter(string: String): String = {
    string.intern()                             // STRING INTERN, or
    string                                      // NO INTERN, or
    strings.putIfAbsent(string, string) match { // LOCAL INTERN
      case null => string
      case v    => v
    }
  }
}
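For reference, the LOCAL INTERN variant can be tried in isolation with the sketch below. `LocalStringTable` and `LocalStringTableDemo` are hypothetical names used only for this illustration, and `JHashMap` is assumed to be an alias for `java.util.HashMap`.

```scala
import java.util.{ HashMap => JHashMap }

// Local interning only: equal strings share one instance within this table,
// without going through the JVM-wide String.intern() pool.
final class LocalStringTable {
  private val strings = new JHashMap[String, String]()

  def lookupOrEnter(string: String): String =
    strings.putIfAbsent(string, string) match {
      case null => string // first occurrence becomes the canonical instance
      case v    => v      // reuse the instance entered earlier
    }
}

object LocalStringTableDemo {
  def main(args: Array[String]): Unit = {
    val table = new LocalStringTable
    // Two distinct String objects with equal contents.
    val a = table.lookupOrEnter(new String("scala.Predef"))
    val b = table.lookupOrEnter(new String("scala.Predef"))
    println(a eq b) // prints "true": both references point at the first instance
  }
}
```

The table lives only as long as its owner, so duplicates are shared during that lifetime while the table itself can be garbage-collected afterwards.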
In summary, I think this PR makes the right tradeoff. The compressed protobuf files amount to 9.5M on disk, so we have a 25x higher footprint as Java objects. This is worth some more investigation to see if we can be more compact in memory.
We can reduce that footprint by 20% by avoiding the

message UsedName {
  string name = 1;
  repeated UseScope scopes = 2;
}
enum UseScope {
  DEFAULT = 0;
  IMPLICIT = 1;
  PATMAT = 2;
}

It would be preferable if the bindings automatically represented this with an

We can manually do the bitmasking ourselves though:

The overhead would also be eliminated if we went ahead with another idea we considered:

That would also get rid of a further 8% overhead of the
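The manual bitmasking mentioned above could look roughly like the sketch below, which packs a set of scopes into a single `Int`. This is an illustration under assumptions, not the code from the PR: `UseScopeBits`, `Scope`, `encode`, and `decode` are hypothetical names, and the three case objects only mirror the values of the protobuf `UseScope` enum (the real `xsbti.UseScope` is a Java enum).

```scala
object UseScopeBits {
  // Hypothetical mirror of the protobuf enum values DEFAULT, IMPLICIT, PATMAT.
  sealed abstract class Scope(val bit: Int)
  case object Default  extends Scope(0)
  case object Implicit extends Scope(1)
  case object PatMat   extends Scope(2)

  private val all: List[Scope] = List(Default, Implicit, PatMat)

  // Set one bit per scope present in the set.
  def encode(scopes: Set[Scope]): Int =
    scopes.foldLeft(0)((mask, s) => mask | (1 << s.bit))

  // Recover the set of scopes whose bit is set in the mask.
  def decode(mask: Int): Set[Scope] =
    all.filter(s => (mask & (1 << s.bit)) != 0).toSet
}
```

With this encoding, `encode(Set(Default, PatMat))` is `5` and `decode(5)` gives back the same two scopes, so a single integer field could stand in for the repeated enum field.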
Refs #984