Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dangling children from Screen #341

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chpatton013
Copy link
Contributor

@chpatton013 chpatton013 commented May 23, 2018

I am upstreaming some of the changes my team has made to nanovg and nanogui.

Note: I think this is a more thorough implementation of #332

We found that removing widgets during an event callback sometimes raised segfaults either later on during event processing or on the subsequent draw frame. A little bit of digging showed that Screen was holding onto references to widgets that may have been de-allocated.

Our solution was to invoke a slightly altered form of Screen::disposeWindow from removeChild to ensure mDragActive, mDragWidget, and mFocusPath do not contain references to deallocated pointers.

A quick summary of changes:

  • Assert that index in Widget::removeChild(int) is within range of [0, Widget::childCount - 1]
  • Make Widget::removeChildHelper: a common helper function for Widget::removeChild that operates on iterators.
  • Rename Screen::disposeWindow to Screen::disposeWidget, change the argument type to Widget*, and recurse to dispose of children as well.
  • Update the only reference to Screen::disposeWindow to call Screen::removeChild (which will eventually call Screen::disposeWidget now).

This may break a few users because:

  • We've added an assert that will fail at runtime for bad input that would previously go unchecked.
  • We've renamed a public function that some users may have been calling. Fortunately, they'll see this one at compile-time.

TODO

  • rebase to current master
  • move changes in py_doc.h to be a docstring comment in screen.h: /// Remove any internal references to a widget and its children.
    • run make mkdoc target to regenerate py_doc.h
  • decision needed about removeChildHelper vs removeChild(Widget *) Remove dangling children from Screen #341 (comment)
    • implement the decision if changes needed
  • disposeWidget(Widget *) => disposeWidget(const Widget *)

@chpatton013
Copy link
Contributor Author

@svenevs I'm fairly confident that this resolves the issue you were guarding against with #332 (thus, fixing #331).

@chpatton013
Copy link
Contributor Author

@wjakob I'd appreciate your feedback on this.

@carmel4a
Copy link

carmel4a commented Jun 25, 2018

Ok, I can confirm that in "my" project. I have had the same issue ( segmentation fault ) in Screen::mDragWidget and this commit fixes it for me. Thanks a lot @chpatton013 .

@chpatton013
Copy link
Contributor Author

@carmel4a Glad I could help!

@svenevs
Copy link
Collaborator

svenevs commented Jul 7, 2018

Hi @chpatton013 sorry for never responding to this. Here are my thoughts

  1. removeChildHelper(const std::vector<Widget *>::iterator& child_it) is a little awkward in my opinion. Why is it necessary? Currently there are already two Widget::removeChild, adding another makes things more convoluted. What I was trying to do in focus path on removed widgets gets set to parent #332 was have exactly one implementation, and the second method calls the first.

    I think there should be a way to still do that here.

  2. I think assert statements on index is a little too forceful, it would be better to (like in focus path on removed widgets gets set to parent #332) just not do anything. I view this as protecting users from themselves, but equally valid arguments about the assert can be made. (AKA this is an opinion-based point, I would rather not force the program to crash).

  3. In a nutshell, the whole purpose of disposeWindow / disposeWidget is simply to make sure that mFocusPath and mDrag{Widget,Active} are updated. So maybe there is a way that we can clean this up a little, making Screen the only class that overrides a (now) virtual void removeChild(const Widget *widget)?

    Since the Widget::~Widget will loop through and child->decRef() for all children, and this will eventually lead to a delete this, I think we don't even need disposeX. We just need to removeChild, which for something like a Window will ripple through to all of its kids, so I'm not sure the loop at the end of disposeWidget is even necessary. You changed it from removeChild though, why is that?

Basically, I agree this is more thorough than #332, but I think it could be cleaner if (1) is removed / (3) is incorporated. What are your thoughts?

I'm closing that PR, can you edit the text of the top-comment at the beginning to say Fixes #331. so that if / when merged, that issue will automatically be closed?

Edit: see also #198 and #262, all circling around this same issue.

@chpatton013
Copy link
Contributor Author

@svenevs I should have time this weekend to properly respond to your comment. I'm posting now to hold myself accountable to that :P

@chpatton013
Copy link
Contributor Author

1

removeChildHelper(const std::vector<Widget *>::iterator& child_it) is a little awkward in my opinion. Why is it necessary? Currently there are already two Widget::removeChild, adding another makes things more convoluted. What I was trying to do in #332 was have exactly one implementation, and the second method calls the first.

Not "necessary", but I found it to be the cleanest way to get code-reuse and performance across two different signatures. I won't dispute the convenience of just calling the pointer impl from the index impl, however I would point out the unnecessary use of std::remove in the context of the index impl. std::remove is a full-search through the container. std::vector::erase, on the other hand, starts at the designated position and memcpy's the rest of the container down to cover the removed elements. With that in mind, the simplest implementations these functions could have are what are there in master as of now:

void Widget::removeChild(const Widget *widget) {
    mChildren.erase(std::remove(mChildren.begin(), mChildren.end(), widget), mChildren.end());
    widget->decRef();
}

void Widget::removeChild(int index) {
    Widget *widget = mChildren[index];
    mChildren.erase(mChildren.begin() + index);
    widget->decRef();
}

Both of these implementations erase based on a calculated iterator (either std::remove() or begin() + index), so that seems like the lowest common denominator to start with shared code.

However, the pragmatist in me recognizes that there is a lot of extra code introduced to get a little bit of re-use. Since all I've needed to do is add a call to a single function in each of these implementations, it would be more succinct just to do that and keep the implementations separate. All that said, I personally (and without much conviction, mind you) do not think it is appropriate to coerce one of these implementations to call the other. If we were to stick with the iterator-based helper function there would naturally be a logical intersection point, but without it I think this just makes the relationship between them complicated.


2

I think assert statements on index is a little too forceful, it would be better to (like in #332) just not do anything. I view this as protecting users from themselves, but equally valid arguments about the assert can be made. (AKA this is an opinion-based point, I would rather not force the program to crash).

I prescribe to the "fail fast and hard" methodology of defensive coding, but I won't enforce that preference on someone else's code. I'm happy to replace those asserts with an early return.


3

In a nutshell, the whole purpose of disposeWindow / disposeWidget is simply to make sure that mFocusPath and mDrag{Widget,Active} are updated. So maybe there is a way that we can clean this up a little, making Screen the only class that overrides a (now) virtual void removeChild(const Widget *widget)?

Since the Widget::~Widget will loop through and child->decRef() for all children, and this will eventually lead to a delete this, I think we don't even need disposeX. We just need to removeChild, which for something like a Window will ripple through to all of its kids, so I'm not sure the loop at the end of disposeWidget is even necessary. You changed it from removeChild though, why is that?

This is the actual meat and potatoes of my PR. Everything else is just an implementation detail.

When a Widget is focused, it and its whole ancestry are included in Screen::mFocusPath and potentially Screen::mDragWidget. When a focused widget is removed from its parent (as of master), the widget may be deleted. However, nothing removes this same widget from Screen::mFocusPath or Screen::mDragWidget prior to deletion. This yields dangling pointers which are dereferenced by Screen during many of its member functions, leading to application crashes.

As you said:

In a nutshell, the whole purpose of disposeWindow / disposeWidget is simply to make sure that mFocusPath and mDrag{Widget,Active} are updated

With that in mind, it seemed like the appropriate function to call to clean up those member variables. Unfortunately, disposeWindow only operates on Window*, which were not at risk of being deleted out from under the Screen. The risk was a Widget* becoming a dangling pointer within Screen, so repurposing the existing function seemed appropriate to me.

So maybe there is a way that we can clean this up a little, making Screen the only class that overrides a (now) virtual void removeChild(const Widget *widget)? Since the Widget::~Widget will loop through and child->decRef() for all children, and this will eventually lead to a delete this, I think we don't even need disposeX. We just need to removeChild, which for something like a Window will ripple through to all of its kids, ...

I don't think that would have any effect, since this problem can occur when removing a widget from any parent. We need every call to Widget::removeChild to clean up removed widgets from Screen::mFocusPath and Screen::mDragWidget.

... so I'm not sure the loop at the end of disposeWidget is even necessary.

I think you're right about this. I had a whole big scenario typed out where this could still break, but I talked myself out of it. The loop is not necessary.

You changed it from removeChild though, why is that?

This is kind of a repurposing of the dispose{Window,Widget} function. Instead of being the special case for cleaning up the window, we can just use the "normal" removeChild function to do this for us (see updates to Window::dispose in src/window.cpp). So we change from dispose* calling removeChild to removeChild calling dispose*.


As I mentioned earlier, I don't have very strong convictions about the finer details here. The important result is that Screen no longer holds dangling references to deleted widgets. I thought the best way (both by ease of implementation and discoverability) to accomplish this was to hook into removeChild and make widgets responsible for notifying the screen that they needed to be forgotten about.

We currently have a couple of small changes that I think we're both in agreement on:

  • Change asserts in removeChild to an early return.
  • Get rid of the recursive loop in disposeWidget
    • This will also remove the ordering constraint of dispose, erase, and decRef in removeChildHelper.

And we still have a couple of open questions:

  • Is removeChildHelper a good use of the lines it occupies?
    • I could be swayed in either direction because we only need to make a single extra function call in removeChild. Any more than that and I would find greater value in the helper function.
  • Make Screen responsible for cleaning up mFocusPath and mDragWidget instead of `Widget.
    • I'm not sure how to accomplish that. Maybe you know something about this that I don't.

Let me know what you'd like to see on each of these points.

@chpatton013
Copy link
Contributor Author

@svenevs poke

@wardw
Copy link

wardw commented Apr 3, 2019

I've also just run into this. Thanks @chpatton013 for your fix - this looks good. I've made a further change to this (#389) that may resolve also some of the other points. Essentially it just enforces a Widget to remove itself from a Screen as a part of its destruction. Alas as it stands it's has it's own flaw, so for now i've left it as a draft.

@wardw
Copy link

wardw commented Apr 4, 2019

As an alternative to #389 I’ve also implemented the changes as suggested in this PR. This is probably the simplest fix with the least impact on the existing code. On the discussion points I went with the changes that make the fewest changes.

It makes the same point shared by @chpatton013 that it puts the responsibility on Widget to remove itself from the screen. I think the destructor version is probably still worth considering, but this has the least impact (it's only one scope further out) and is ready to go. Feel free to merge in wardw@53a5159 if that suits.

@svenevs
Copy link
Collaborator

svenevs commented Apr 5, 2019

@svenevs poke

@chpatton013 I'm sorry :( I'm re-investigating right now and comparing with the destructor approach by @wardw. I'm inclined to say that the remove* approach (this PR) is probably the right approach in terms of delegating responsibility to the right location. Will post back by end of Sunday (it's in my calendar :))

@svenevs
Copy link
Collaborator

svenevs commented Apr 8, 2019

  • Change asserts in removeChild to an early return.

I think it's ok to leave the asserts, there's plenty of other assert in the framework.

  • Get rid of the recursive loop in disposeWidget
    • This will also remove the ordering constraint of dispose, erase, and decRef in removeChildHelper.

I think we do need it. Say we have Screen -> A -> B -> C, meaning Screen has one child A, A has one child B, B has one child C. In this scenario, suppose C is the current mDragWidget. In the event that a non-mouse-related (e.g., keyboard event) triggers A->removeChild(B), C would get removed from the focus path, but still remain the active drag widget. For focus path I'm fairly certain that in order for C to appear, it's parent B must appear meaning that in A->removeChild(B) it will get cleared. But mDragWidget will not.

Do you agree?

  • Is removeChildHelper a good use of the lines it occupies?
    • I could be swayed in either direction because we only need to make a single extra function call in removeChild. Any more than that and I would find greater value in the helper function.

Personally I prefer having removeChild(Widget *widget) do the important work, and making removeChild(int) as a proxy to just lookup the index and call removeChild(mChildren[index]). But I'm biased because that's what I did in #332, it just feels more direct. It's basically null check on one hand, or end() check on the other hand. I 100% agree that we need a "single source of truth" here.

Edit: the approach in #332 is a little more wasteful than the iterator, it's just easier to look at / understand. Emphasis: I just want to get this issue fixed :)

@wjakob what would you prefer? Summary:

  1. removeChild(int) and removeChild(Widget *) both call removeChildHelper(const iterator ref), which handles the dispose details.
  2. removeChild(int index) calls removeChild(Widget *) after asserting a valid index, and removeChild(Widget *) handles the dispose details.
  • Make Screen responsible for cleaning up mFocusPath and mDragWidget instead of Widget.
    • I'm not sure how to accomplish that. Maybe you know something about this that I don't

Disregard this. I was very wrong...


@chpatton013 I'm adding a task list to the message at the top to indicate what needs to happen before asking @wjakob for final review so that you can edit it. I am hesitant to rebase / make changes myself by committing directly to the branch because I was not sure if you are actively using it (aka if you green light me I can rebase / do the pydoc stuff for you if you like).

2048khz-gachi-rmx added a commit to 2048khz-gachi-rmx/nanogui that referenced this pull request Dec 17, 2022
github is being fucking stupid (or i am)
2048khz-gachi-rmx added a commit to 2048khz-gachi-rmx/nanogui that referenced this pull request Dec 17, 2022
yea, i force pushed. sue me
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants