-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make inverting the call tree fast, by computing inverted call nodes lazily #4900
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4900 +/- ##
==========================================
+ Coverage 85.93% 86.04% +0.11%
==========================================
Files 312 312
Lines 29773 30271 +498
Branches 8197 8275 +78
==========================================
+ Hits 25585 26047 +462
- Misses 3597 3631 +34
- Partials 591 593 +2 ☔ View full report in Codecov by Sentry. |
8fd4e74
to
533a704
Compare
22b07b4
to
b730acb
Compare
This is now ready for review! A few more comments are probably needed in the "Create inverted call nodes lazily" commit, but I think it's reviewable in its current state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
submitted my current comments! Hopefully they still make sense, but I think they do :-)
src/types/profile-derived.js
Outdated
* Tree Left aligned Right aligned Reordered by suffix | ||
* - [cn0] A = A = A [so0] [so0] [cn0] A | ||
* - [cn1] B = A -> B = A -> B [so3] [so1] [cn4] A <- A | ||
* - [cn2] A = A -> B -> A = A -> B -> A [so2] ↘↗ [so2] [cn2] A <- B <- A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A difference with the previous situation, is that previously, in the inverted call tree, we'd have both [A, B] and [A, B, A], despite that [B, A] doesn't exist in the uninverted call tree.
Now several inverted call nodes (in this case [A, B] and [A, B, A]) could map to the same non-inverted call node. I guess the non-inverted call node maps to its deeper version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A difference with the previous situation, is that previously, in the inverted call tree, we'd have both [A, B] and [A, B, A], despite that [B, A] doesn't exist in the uninverted call tree. Now several inverted call nodes (in this case [A, B] and [A, B, A]) could map to the same non-inverted call node.
That's right. Later on in this comment, it says that both in2
and in3
map to cn2
; that's exactly your example.
I guess the non-inverted call node maps to its deeper version?
There is not really any mapping of non-inverted call nodes to inverted call nodes. You usually don't ever need to go in that direction. The only time when we ask "which inverted call node corresponds to this non-inverted call node?" is when we the user toggles inverted mode and we try to move the selection to a place that makes sense. That's done in findHeavyPathToSameFunctionAfterInversion
. And for the non-inverted node [A, B, A], this function will pick the inverted node [A]. I think that makes sense - you clicked the deepest node (an actual "leaf" node in this case), and, after inverting, leaf nodes become roots.
5cc7d34
to
21d53c7
Compare
I've now split the incremental order refinement out into a separate commit at the end. I've also polished both split commits to be easier to read. Both the "Create inverted call nodes lazily" and the "Refine the suffix order incrementally, as new inverted nodes are created" commit should be much easier to review now. |
21d53c7
to
0e7998b
Compare
This is the main commit of this PR. Now that nothing is relying on having an inverted call node for each sample, or on having a fully-computed inverted call node table, we can make it so that we only add entries to the inverted call node table when we actually need a node, for example because it was revealed in the call tree. This makes it a lot faster to click the "Invert call stack" checkbox - before this commit, we were computing a lot of inverted call nodes that were never shown to the user. After this commit, CallNodeInfoInvertedImpl no longer inherits from CallNodeInfoImpl - it is now a fully separate implementation. This commit reduces the time spent in `getInvertedCallNodeInfo` on an example profile (https://share.firefox.dev/411Vg2T) from 11 seconds to 718 ms. Before: https://share.firefox.dev/3CTNApp After: https://share.firefox.dev/492F7wl (15x faster)
The new structure gives us a nice guarantee about roots of the inverted tree: There is an inverted root for every func, and their indexes are identical. This makes it really cheap to translate between the call node index and the func index (no conversion or lookup is necessary) and also makes it cheap to check if a node is a root. This commit also replaces a few maps and sets with typed arrays for performance. This is easier now that the root indexes are all contiguous.
This avoids a CompareIC when comparing to null in _createInvertedRootCallNodeTable, because we'll now only be comparing integers. This speeds up _createInvertedRootCallNodeTable by almost 2x.
…ted. This saves a lot of work upfront that's not needed. At any given time, we just need the suffix order to be accurate enough so that the "suffix order index range" for every existing inverted call node is correct. This commit reduces the time spent in `getInvertedCallNodeInfo` + `getChildren` on an example profile (https://share.firefox.dev/411Vg2T) from 721 ms to 20 ms. Before: https://share.firefox.dev/40Wdi6S After: https://share.firefox.dev/3AZjbpg (35x faster)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to "Remove now-unused getCallNodeTable()." included. (all good so far)
I'm pushing my comments so far, I'm not sure I'll be able to do more today (but who knows)
src/types/profile-derived.js
Outdated
* | ||
* ## Definition | ||
* | ||
* We define the suffix order as the lexicographic order of the inverted call path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth precising that it's also depth-first like the non-inverted table (this is also explained below in call paths become grouped in such a way that call paths which belong to the same *inverted* tree node
but because we write "depth-first" right above, I think it's good to mention it here as well, so that a reader doesn't wonder "oh, I guess the suffix order is different" but it's not).
(if I'm not mistaken :-) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, "depth-first" is something that describes a tree traversal, i.e. an ordering of nodes. I don't think we can use the term to describe an ordering of paths. If you want to say every path corresponds to one inverted node, then the list would be "skipping" some nodes. For example, here's a suffix-ordered list of call paths: A, A-A, C-B-A, B, C-B, C, B-D. If you make an inverted tree for this list, then that tree needs two nodes for which there aren't any exact call path matches in the list, specifically B-A and D. So it's not really a depth-first traversal of the inverted tree because it's missing some ancestor nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm that's true.
I think I wanted to mean we have the ancestors before their children (when present)
const tracedTimingNonInverted = | ||
tracedTiming !== null && tracedTiming.type === 'NON_INVERTED' | ||
? tracedTiming.timings | ||
: null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent too much time trying to understand why we have null
for the INVERTED
case... until I realized that this can't happen here. Can you please be more specific at the start of the comment, saying something like that (suggestion) "When the flame graph panel is displayed, all tree computations are non inverted -- see UrlState.getInvertCallstack"
@@ -504,6 +500,9 @@ class CallNodeContextMenuImpl extends React.PureComponent<Props> { | |||
const fileName = | |||
filePath && | |||
parseFileNameFromSymbolication(filePath).path.match(/[^\\/]+$/)?.[0]; | |||
|
|||
const callNodeTable = callNodeInfo.getNonInvertedCallNodeTable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self, this change could have been in the previous commit (use the non inverted table for the recursive function transform)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm only missing the commits " Create inverted call nodes lazily" and "Refine the suffix order incrementally, as new inverted nodes are created", I reviewed the others.
Pushing the comments I have so far!
type InvertedNonRootCallNodeTable = {| | ||
prefix: InvertedCallNodeHandle[], | ||
func: IndexIntoFuncTable[], // IndexIntoInvertedNonRootCallNodeTable -> IndexIntoFuncTable | ||
pathHash: string[], // IndexIntoInvertedNonRootCallNodeTable -> string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah interesting and good idea that we store it here, we might as well store it for the other tables too? The only drawback is that the strings might be large with a lot of duplication if the tree is deep.
(not in this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only drawback is that the strings might be large with a lot of duplication if the tree is deep.
Although that maybe by constructing them iteratively (parent + new_function), maybe the engine shares the string parts (as dependent strings?) between the strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, hard to say. There's a good chance that the string starts out as a rope when it's created, like you say. But then we use this string as a key in a map, and the lookup will flatten the rope and atomize the flattened string. So in the end I'm not sure what will end up getting stored.
src/profile-logic/call-node-info.js
Outdated
|
||
// Non-null for non-root nodes whose children haven't been created yet. | ||
// For a non-root node x of the inverted tree, let k = depth[x] its depth in the inverted tree, | ||
// and deepNodes = deepNodes[x] be its non-null deep nodes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's confusing that you use the same name deepNodes
in left and right. Then below (line 327) it's not clear to me what deepNodes[x]
refers to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh indeed, that's confusing. I wonder if I had different names here in the past and this got the way it is by search & replace.
src/profile-logic/call-node-info.js
Outdated
// For a non-root node x of the inverted tree, let k = depth[x] its depth in the inverted tree, | ||
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need an example, it's not clear to me what these deepNodes
are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an example in the long comment above export class CallNodeInvertedImpl
, search for | self node | corresponding deep node |
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I've just read it! It's a lot clearer now.
Maybe this comment here would be better as a reference to the comment above then?
Something such as:
The array at index k caches the children for the inverted node k, up to some depth.
This is useful to quickly compute the children for this inverted node.
Afterwards it's set to null for that index, but passed on to its children.
Please refer to the comment on CallNodeInvertedImpl below for a more detailed explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The array at index k caches the children for the inverted node k,
Although this sentence here is wrong.
Is it more:
Let's take <arr>
the array at index <in>
for inverted node <in>
.
<arr>
contains the <depth>
th ancestor for all paths that "ends with" this inverted node <in>
, that is all its children (no matter their respective depth).
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe add something such as "as a consequence, the deepnodes are ordered by call node suffix order"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it to the following:
// Non-null for non-root nodes whose children haven't been created yet.
// The array at index x caches ancestors of the non-inverted nodes belonging
// to the inverted node x, specifically the ancestor "k steps up" from each
// non-inverted node, with k being the depth of the inverted node.
// This is useful to quickly compute the children for this inverted node.
// Afterwards it's set to null for that index, and the next level is passed on
// to the newly-created children.
// Please refer to 'Why do we keep a "deepNodes" property in the inverted table?'
// in the comment on CallNodeInvertedImpl below for a more detailed explanation.
//
// For every inverted non-root call node x with deepNodes[x] !== null:
// For every suffix order index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x],
// the k'th parent node of suffixOrderedCallNodes[i] is stored at
// deepNodes[x][i - suffixOrderIndexRangeStart[x]], with k = depth[x].
deepNodes: Array<Uint32Array | null>, // IndexIntoInvertedNonRootCallNodeTable -> (Uint32Array | null)
Also maybe add something such as "as a consequence, the deepnodes are ordered by call node suffix order"?
Hmm, I haven't thought about whether that's true. It might well be true!
return this._invertedNonRootCallNodeTable.pathHash[nonRootIndex]; | ||
} | ||
|
||
prefixForNode( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefixForNode doesn't seem covered by tests, but I think this shows a lack of prior tests for inverted trees rather than a problem in this patch. This is called by CallTree.getParent, which is called when the user presses ArrowLeft in the tree.
I think this is out of scope for this patch to add coverage for this.
} else if (suffixOrderIndex >= selectedSubtreeRangeEnd) { | ||
sampleSelectedState = 'UNSELECTED_ORDERED_AFTER_SELECTED'; | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we also have no test covering inverted trees with samples filtered out...
(out of scope for the PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I couldn't go as far as I wanted, but at least I've read the long comment now, which is pretty good, thanks a lot for that! I had a few comments on it to maybe improve it, tell me what you think.
src/types/profile-derived.js
Outdated
* - [in3] A (so:1..2) = A <- A = ... A -> A (cn4) | ||
* - [in4] B (so:2..3) = A <- B = ... B -> A (cn2) | ||
* - [in6] A (so:2..3) = A <- B <- A = ... A -> B -> A (cn2) | ||
* - [in1] B (so:3..5) = B = ... B (cn1, cn5) | ||
* - [in5] A (so:3..5) = B <- A = ... A -> B (cn1, cn5) | ||
* - [in6] A (so:4..5) = B <- A <- A = ... A -> A -> B (cn5) | ||
* - [in7] C (so:5..7) = C = ... C (cn6, cn3) | ||
* - [in8] A (so:5..6) = C <- A = ... A -> C (cn6) | ||
* - [in9] B (so:6..7) = C <- B = ... B -> C (cn3) | ||
* - [in10] A (so:6..7) = C <- B <- A = ... A -> B -> C (cn3) | ||
* - [in10] A (so:4..5) = B <- A <- A = ... A -> A -> B (cn5) | ||
* - [in2] C (so:5..7) = C = ... C (cn6, cn3) | ||
* - [in7] A (so:5..6) = C <- A = ... A -> C (cn6) | ||
* - [in8] B (so:6..7) = C <- B = ... B -> C (cn3) | ||
* - [in9] A (so:6..7) = C <- B <- A = ... A -> B -> C (cn3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the counter are a little bit "random" (except for the roots), depending on when they've been accessed for the first time, aren't they?
It would be good to mention that in line 474 above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes they are. I'll add a comment to mention that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to:
* inX: Inverted call node index X (this index is somewhat arbitrary because
* it's based on the order in which callNodeInfoInverted.getChildren is
* called)
src/profile-logic/call-node-info.js
Outdated
* What this example shows is that we need to look not at a self node's immediate | ||
* parent, but rather at its (k + 1)'th parent, where k is the depth of the | ||
* inverted node whose children we're creating. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced :D why do we need to look at this? Is that to know if the kth inverted node has children?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it just that, when we need to look at the children of in4, in4 being depth 1, its children will be depth 2, so we need to look 2 parents up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's exactly it. Ok I'll try to change it to say 2 first, and then do the generic k
depth in the next paragraph. Anyway, what I'm trying to get across is that, if you only keep the self nodes around, then you would need to go really far up the non-inverted parents to know the functions for the inverted children. I'm trying to make the point that we also need to keep the deep nodes around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok sounds good, maybe you can write something like that as an introduction: In the next few paragraphs, we'll explain why we need to constantly iterate over the non-inverted parents, and that the "deepNodes" property is a cache to make it faster. Keep reading!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
src/profile-logic/call-node-info.js
Outdated
* | ||
* in4 has one self node: cn2. This is the only non-inverted node whose call path | ||
* ends in "... -> B -> A". | ||
* | ||
* in4 has depth 1. | ||
* | ||
* in4's func is B. | ||
* cn2's func is A. (!) | ||
* | ||
* cn2's func still corresponds to the inverted root, i.e. in0's func. | ||
* But cn2's parent, cn1, has func B. | ||
* | ||
* And cn1's parent, cn0, has func A. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why we need to enter into so much details here.
Isn't it quite clear that in4 has one child, with func A, because cn2 has one parent, which is A? I'm sure I'm missing some complexity...
Is that to detail how many steps we need to go up from cn2's self func to find the children?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that's a good point, it really looks rather obvious if you put it like that. Yes, the purpose was to show that you need to go up one level from the self node if you want to know the function of the inverted child. I can try to shorten this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one key is to talk about iterations too.
Iteration 0: compute children of root
Iteration 1: compute children of one of these children
Iteration 2: compute children of one of the children of iteration 1.
And mention that these iterations are not done right away but only on demand from a user action (for example the user expands a row in the inverted call tree, or clicks somewhere in the activity graph). This way this links the algorithm with what happens in the app and makes things more concrete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea with the iterations! Added.
src/profile-logic/call-node-info.js
Outdated
* cn2's 0th parent is cn2. | ||
* cn2's 1st parent is cn1. | ||
* cn2's 2nd parent (i.e. its grandparent) is cn0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* cn2's 0th parent is cn2. | |
* cn2's 1st parent is cn1. | |
* cn2's 2nd parent (i.e. its grandparent) is cn0. | |
* cn2's 0th parent is cn2 (func: A). | |
* cn2's 1st parent is cn1 (func: B). | |
* cn2's 2nd parent (i.e. its grandparent) is cn0 (func: A). |
Note: I find this more concise and clearer than the write-up above! Maybe the write-up above could just be that too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I've changed it to consistently have (func: X)
behind most nodes.
* So in5 has no children. | ||
* | ||
* --- | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a title such as Why do we keep a "deepNodes" property in the inverted table
so that we can grep it easily (and follow where you're going)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great idea!
src/profile-logic/call-node-info.js
Outdated
// For a non-root node x of the inverted tree, let k = depth[x] its depth in the inverted tree, | ||
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I've just read it! It's a lot clearer now.
Maybe this comment here would be better as a reference to the comment above then?
Something such as:
The array at index k caches the children for the inverted node k, up to some depth.
This is useful to quickly compute the children for this inverted node.
Afterwards it's set to null for that index, but passed on to its children.
Please refer to the comment on CallNodeInvertedImpl below for a more detailed explanation.
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. | ||
deepNodes: Array<Uint32Array | null>, // IndexIntoInvertedNonRootCallNodeTable -> (Uint32Array | null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this could be deepNodesCache
, to better refer that it's used as an optimizing trick... but up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not asking to change the name now, this can be done in a follow-up if we feel it's better)
Another idea is that instead of the terminology of "deep node", it could be "ancestor node". Indeed what I struggled with is that the values in the Uint32Array are call node indexes in the non inverted call tree, they do not represent call nodes in the inverted call tree. But the term "deep" implies to me that they are children of this inverted call node (they're not!).
But I'm not sure of that so it might be good to wait (also I'd prefer to see this landed rather nitpicking on the exact terms)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the name deepNodesCache
, happy to rename it to this in a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about the term "ancestor node" too but I was worried that it would be to easy to think it was referring to an ancestor in the inverted tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still haven't finished but I think I'm more and more confident about understanding this fully. I'm now looking at the meat of this commit (_computeChildrenInfo and _createChildrenForInfo).
I'm sorry this is so long.
I'm pushing the current comments so that you can see them early.
while (currentHandle >= rootCount) { | ||
const nonRootIndex = currentHandle - rootCount; | ||
callNodePath.push(this._invertedNonRootCallNodeTable.func[nonRootIndex]); | ||
currentHandle = this._invertedNonRootCallNodeTable.prefix[nonRootIndex]; | ||
} | ||
const rootFunc = currentHandle; | ||
callNodePath.push(rootFunc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional: I would have used a do/while loop here, so that the rootFunc would be pushed inside the loop
(also there are too few occasions to use do/while so I try to use it when I can :D poor little underused loop control instruction)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what that would look like to be honest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was confused, I forgot about the existence of the 2 separate tables for root vs non root nodes, sorry.
src/profile-logic/call-node-info.js
Outdated
// Returns a CallNodeIndex from a CallNodePath. | ||
getCallNodeIndexFromPath( | ||
callNodePath: CallNodePath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please precise if this is an inverted call node path? (I believe it is?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, whenever the caller has a CallNodeInfoInverted
in their hands, the caller will be dealing with inverted call node handles and inverted call node paths. Clarified the comment.
src/profile-logic/call-node-info.js
Outdated
return childNodeHandle; | ||
} | ||
|
||
_findDeepestKnownAncestor(callPath: CallNodePath): InvertedCallNodeHandle { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function seems to be packed with small surprising things.
My understanding is that it is essentially bisecting, right?
It would be good to mention it in a comment for the function, so that we can follow along more easily.
Also what does "known" means in this context? Is it one that's already created it the call node table (and therefore one that's in the cache)?
The function seems to be used only in getCallNodePathFromIndex
it could be better that it's located closer to that function below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: "deepest anscestor" is a term that's difficult to understand quickly IMO, it's kind of an oxymoron... Rationally I understand of course, it's "deepest" in the inverted table, but "ancestor" in the non-inverted table. I don't have a better idea at the moment though. But it could be good to mention that in the comment to the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh haha you're right, it is packed with surprises!
I've renamed _findDeepestKnownAncestor
to _findDeepestExistingInvertedAncestorNode
and moved it to just below its caller.
I'm also adding the following comment:
// Find the depth of the deepest existing ancestor node using bisection.
// For each tested depth `currentDepth`, we create a partial inverted call
// path `callPath.slice(0, currentDepth + 1)` and check whether we have an
// inverted node for that partial call path. We find the largest value of
// `currentDepth` for which the partial call path refers to an existing node,
// and set `bestNode` to that node.
And no, it finds an ancestor node in the inverted tree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And no, it finds an ancestor node in the inverted tree!
Ah yeah, your comments further down show that you understood this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I'm not so sure anymore. I'd need to think about it again to ask a good question, but I don't have the time atm :) let's discuss this at some point later.
src/profile-logic/call-node-info.js
Outdated
const parentDeepNodeCount = parentDeepNodes.length; | ||
const [parentIndexRangeStart, parentIndexRangeEnd] = | ||
this.getSuffixOrderIndexRangeForCallNode(parentNodeHandle); | ||
if (parentIndexRangeStart + parentDeepNodeCount !== parentIndexRangeEnd) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was gonna write: I'm not fully sure I understand this invariant, can you please explain?
but now I think I understand that by construction, the amount of deepNodes should be equal to the amount of children.
Maybe it would be good to write it as a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW do we want to keep this condition explicitely, or would you rather have it as a comment assert
like in other places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure what to do about this. I left the explicit checks in. I don't think they affect performance much.
src/profile-logic/call-node-info.js
Outdated
* | ||
* in4 has one self node: cn2. This is the only non-inverted node whose call path | ||
* ends in "... -> B -> A". | ||
* | ||
* in4 has depth 1. | ||
* | ||
* in4's func is B. | ||
* cn2's func is A. (!) | ||
* | ||
* cn2's func still corresponds to the inverted root, i.e. in0's func. | ||
* But cn2's parent, cn1, has func B. | ||
* | ||
* And cn1's parent, cn0, has func A. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one key is to talk about iterations too.
Iteration 0: compute children of root
Iteration 1: compute children of one of these children
Iteration 2: compute children of one of the children of iteration 1.
And mention that these iterations are not done right away but only on demand from a user action (for example the user expands a row in the inverted call tree, or clicks somewhere in the activity graph). This way this links the algorithm with what happens in the app and makes things more concrete.
src/profile-logic/call-node-info.js
Outdated
// For a non-root node x of the inverted tree, let k = depth[x] its depth in the inverted tree, | ||
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The array at index k caches the children for the inverted node k,
Although this sentence here is wrong.
Is it more:
Let's take <arr>
the array at index <in>
for inverted node <in>
.
<arr>
contains the <depth>
th ancestor for all paths that "ends with" this inverted node <in>
, that is all its children (no matter their respective depth).
?
src/profile-logic/call-node-info.js
Outdated
// For a non-root node x of the inverted tree, let k = depth[x] its depth in the inverted tree, | ||
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe add something such as "as a consequence, the deepnodes are ordered by call node suffix order"?
src/profile-logic/call-node-info.js
Outdated
let nodesWhichEndHereCount = 0; | ||
while (nodesWhichEndHereCount < parentDeepNodeCount) { | ||
const deepNode = parentDeepNodes[nodesWhichEndHereCount]; | ||
if (callNodeTable.prefix[deepNode] !== -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a comment such as "This is the first deepNode that has a parent => all the following ones will have a parent as well."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review finished for Create inverted call nodes lazily.
next up is Refine the suffix order incrementally, as new inverted nodes are crea…
(the last one)
* - [cn4] A = A -> A = A -> A [so1] | ||
* - [cn5] B = A -> A -> B = A -> A -> B [so4] | ||
* - [cn6] C = A -> C = A -> C [so5] | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When reviewing, it helped me looking at an actual inverted tree in the UI and poking at the data structures such as the deepNodes.
I think it would be useful to show full inverted tree here. It would be the introduction to the algorithm that constructs it ("this is the our expected result, now let's look at how we build it" in a way)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done!
src/profile-logic/call-node-info.js
Outdated
const parentDeepNodeCount = parentDeepNodes.length; | ||
const [parentIndexRangeStart, parentIndexRangeEnd] = | ||
this.getSuffixOrderIndexRangeForCallNode(parentNodeHandle); | ||
if (parentIndexRangeStart + parentDeepNodeCount !== parentIndexRangeEnd) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW do we want to keep this condition explicitely, or would you rather have it as a comment assert
like in other places?
src/profile-logic/call-node-info.js
Outdated
const firstChildFirstParentDeepNode = | ||
parentDeepNodes[nodesWhichEndHereCount]; | ||
const firstChildFirstDeepNode = | ||
callNodeTable.prefix[firstChildFirstParentDeepNode]; | ||
childrenDeepNodes[0] = firstChildFirstDeepNode; | ||
const firstChildFunc = callNodeTable.func[firstChildFirstDeepNode]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without looking too close, I wonder why this can't be part of the loop below?
(but I see this code is removed later anyway so let's skip this comment)
nextChildSuffixOrderIndexRangeStart += deepNodeCount; | ||
nextChildDeepNodeIndex += deepNodeCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(optional) if you move these additions at the end of the loop, I think this could be somewhat clearer (it's common that the end of a loop contains the operations for the next iteration), also I think you could rename them currentChildXXX
instead of next
, and use them directly (so you could remove suffixOrderIndexRangeStart
too).
src/profile-logic/call-node-info.js
Outdated
* For all i in 0..deepNodes.length, deepNodes[i] is the k'th parent node | ||
* of suffixOrderedCallNodes[suffixOrderIndexRangeStart + i], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* For all i in 0..deepNodes.length, deepNodes[i] is the k'th parent node | |
* of suffixOrderedCallNodes[suffixOrderIndexRangeStart + i], | |
* For all i in 0..deepNodes.length, deepNodes[i] is the k'th parent node | |
* of suffixOrderedCallNodes[suffixOrderIndexRangeStart + i] in the non-inverted tree, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit I haven't done as much scrutiny for the "refine order" commit than the others, but it looks very reasonable still.
Let's land this after the small comment changes are done!
Thanks a ton for the algorithm update, I could test with some of my old profiles that were also pretty slow (deploy preview / production) and they improve immensely.
if ( | ||
nodesWhichEndHereCount + childrenDeepNodeCount !== | ||
parentDeepNodeCount | ||
) { | ||
throw new Error('indexes out of sync'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to keep this assertion in the final code, or would you rather have it as a comment?
src/profile-logic/call-node-info.js
Outdated
return null; | ||
} | ||
|
||
const childrenDeepNodeCount = parentDeepNodeCount - nodesWhichEndHereCount; | ||
// We create one child for each distinct func we found. The children need to | ||
// be ordered by func. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why exactly does it need to be ordered by func? (we easily see it is ordered by func with the sort
operation, but not clearly why). Is that because it needs to be suffix-ordered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to "We order the children by func so that _getChildWithFunc can use bisection."
src/profile-logic/call-node-info.js
Outdated
// We have the parent's self nodes and their corresponding deep nodes. | ||
// These nodes are currently only sorted up to the parent's depth: | ||
// we know that every parentDeepNode has the parent's func. | ||
// But if we look at the prefix of each parentDoopNode, we'll encounter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
// But if we look at the prefix of each parentDoopNode, we'll encounter | |
// But if we look at the prefix of each parentDeepNode, we'll encounter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doop node LOL
// and deepNodes = deepNodes[x] be its non-null deep nodes. | ||
// Then, for every index i in suffixOrderIndexRangeStart[x]..suffixOrderIndexRangeEnd[x], | ||
// the k'th prefix node of suffixOrderedCallNodes[i] is stored at deepNodes[x][i - suffixOrderIndexRangeStart[x]]. | ||
deepNodes: Array<Uint32Array | null>, // IndexIntoInvertedNonRootCallNodeTable -> (Uint32Array | null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not asking to change the name now, this can be done in a follow-up if we feel it's better)
Another idea is that instead of the terminology of "deep node", it could be "ancestor node". Indeed what I struggled with is that the values in the Uint32Array are call node indexes in the non inverted call tree, they do not represent call nodes in the inverted call tree. But the term "deep" implies to me that they are children of this inverted call node (they're not!).
But I'm not sure of that so it might be good to wait (also I'd prefer to see this landed rather nitpicking on the exact terms)
… in the flame graph, it'll be non-inverted traced timing.
…er is somewhat arbitrary) as requested in julien's review
…erted call node paths.
…torNode and improve its comments.
2d573dd
to
c80f34a
Compare
## Updates: [Nicolas Chevobbe] Make timeline ruler notches visible in High Contrast Mode (#5346) [Nazım Can Altınova] Add the ability to mark marker fields as hidden (#5354) [Maxx Crawford] Update guide-startup-shutdown.md (#5357) [Nazım Can Altınova] Enable prettier on the docs-user markdown files (#5358) [Paul Adenot] Allow searching by Content-Type in the network marker view (#5351) [Florian Quèze] Hide the pid in global tracks if it is 0. (#5361) [Markus Stange] Make inverting the call tree fast, by computing inverted call nodes lazily (#4900) [Markus Stange] Use 64-bit floats for call tree timings. (#5371) [Markus Stange] Extend the workaround in the v53 upgrader to generate missing subcategory columns. (#5369) ## Also thanks to our localizers: de: Michael Köhler el: Jim Spentzos en-GB: Paul es-CL: ravmn fr: Théo Chevalier fur: Fabio Tomat fy-NL: Fjoerfoks ia: Melo46 it: Francesco Lodolo nl: Mark Heijl pt-BR: Marcelo Ghelman ru: Valery Ledovskoy sv-SE: Luna Jernberg, Andreas Pettersson tr: Grk uk: Іhor Hordiichuk zh-CN: Olvcpr423 zh-TW: Pin-guang Chen
Deploy preview
Fixes #467, fixes #337, fixes #3313.
┆Issue is synchronized with this Jira Task