With the current setup, if a workset collection is accessed more than once, it will still be cached by the cache call insertion optimization.
Example
for (_ <- 0 until 5) {
  val cands = for {
    x <- comps
    e <- edges
    y <- comps
    if x.id == e.src
    if y.id == e.dst
  } yield LVertex(y.id, Math.min(x.label, y.label))
  comps = for {
    Group(id, cs) <- cands.groupBy(_.id)
  } yield LVertex(id, cs.map(_.label).min)
}
This is translated to (approximately)
FlinkNtv.iterate(comps)(comps => {
  val cands = for {
    x <- comps
    e <- edges
    y <- comps
    if x.id == e.src
    if y.id == e.dst
  } yield LVertex(y.id, Math.min(x.label, y.label))
  for {
    Group(id, cs) <- cands.groupBy(_.id)
  } yield LVertex(id, cs.map(_.label).min)
})
In the rewritten version, the lambda passed to FlinkNtv.iterate has a parameter comps which is accessed twice in the lambda body. Because of that, a subsequent addCacheCalls transformation will insert a FlinkOps.cache(comps) call at the beginning of the lambda body.
Suggested Solution
As a first approximation, we should exclude parameter caching for lambdas passed to a FlinkNtv.iterate operator.
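To make the proposal concrete, here is a minimal sketch on a toy expression tree. It is not Emma's actual compiler IR or its real addCacheCalls implementation: the types Expr, Ref, Call and Lambda, the operator names "join", "map" and "seq", and the idea of matching the excluded operator by name are all assumptions made for illustration. Only FlinkNtv.iterate, FlinkOps.cache and addCacheCalls are names taken from the description above.

```scala
// Toy expression tree (hypothetical, for illustration only).
sealed trait Expr
case class Ref(name: String) extends Expr
case class Call(fn: String, args: List[Expr]) extends Expr
case class Lambda(param: String, body: Expr) extends Expr

object CacheInsertion {
  // Operators whose lambda parameters are excluded from caching.
  // Assumption: the operator can be recognised by its name.
  val excluded: Set[String] = Set("FlinkNtv.iterate")

  // Count how often `name` is referenced in `e`, respecting shadowing.
  def refs(name: String, e: Expr): Int = e match {
    case Ref(n)          => if (n == name) 1 else 0
    case Call(_, args)   => args.map(refs(name, _)).sum
    case Lambda(p, body) => if (p == name) 0 else refs(name, body)
  }

  // Insert a FlinkOps.cache(p) call at the start of every lambda body
  // that references its parameter p more than once, unless the lambda
  // is a direct argument of an excluded operator.
  def addCacheCalls(e: Expr, underExcluded: Boolean = false): Expr = e match {
    case r: Ref => r
    case Call(fn, args) =>
      val skip = excluded(fn)
      Call(fn, args.map {
        case l: Lambda => addCacheCalls(l, skip)
        case a         => addCacheCalls(a)
      })
    case Lambda(p, body) =>
      val newBody = addCacheCalls(body)
      if (!underExcluded && refs(p, newBody) > 1)
        // "seq" is a hypothetical node meaning "evaluate in order".
        Lambda(p, Call("seq", List(Call("FlinkOps.cache", List(Ref(p))), newBody)))
      else
        Lambda(p, newBody)
  }
}

object Demo {
  // A body that reads `comps` twice, like the join in the example.
  val body: Expr = Call("join", List(Ref("comps"), Ref("edges"), Ref("comps")))
  // The same lambda, once under FlinkNtv.iterate and once under another operator.
  val iterated: Expr = Call("FlinkNtv.iterate", List(Ref("comps0"), Lambda("comps", body)))
  val plain: Expr    = Call("map", List(Ref("xs"), Lambda("comps", body)))
}
```

In this sketch, CacheInsertion.addCacheCalls(Demo.iterated) returns the tree unchanged, while CacheInsertion.addCacheCalls(Demo.plain) wraps the lambda body with a FlinkOps.cache(comps) call. The exclusion applies only to lambdas passed directly to the excluded operator; lambdas nested deeper inside the iteration body are still processed as usual.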