Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid generating the same chunk more than once with multiple emerge threads #10637

Merged
merged 1 commit into from
Nov 27, 2020

Conversation

lhofhansl
Copy link
Contributor

@lhofhansl lhofhansl commented Nov 16, 2020

Current MT.

I noticed that with multiple threads the same map chunk is generated multiple times.
Instead if the same chunk is already in progress the current emerge request should be canceled (and will be retried later).

Please review this carefully, this is tricky stuff.

(There are many more problems: the Server::m_env_mutex is locked in many place, slowing down emerging, for example while some blocks are saved to disk. But that's for another PR.)

How to test:
Start with a new world (or delete map.sqlite or change the backend to "dummy"), so that blocks need to be generated.
Set num_emerge_threads to 0 or some explicit number. Then fly around (and observe how the world is now loaded faster than without this patch).

Or try this:

minetest.register_on_generated(function(minp, maxp, seed)
minetest.chat_send_all("Gen: " .. minp.x .. ',' .. minp.y .. ',' .. minp.z)
end)

Before this PR you will see many duplicate calls for the same map chunk. With this PR you wont.

src/map.cpp Show resolved Hide resolved
@paramat paramat added @ Mapgen Bugfix 🐛 PRs that fix a bug labels Nov 16, 2020
@lhofhansl
Copy link
Contributor Author

The part of review carefully is the fact the emerge of a block is canceled when the chunk containing the block is already being processed - in the assumption that emerge of the block would be retried later.

Now all Lua hooks, etc, still need to be run. Please review it with this in mind.

@lhofhansl
Copy link
Contributor Author

As for perf... I started looking at this when I noticed that generation is almost 10x slower than reading a block from disk. 10x! I was not expecting that. And more emerge threads hardly made a dent in the time.

I tried with a dummy-backend, reloading the same scene over and over. For the large viewing_range I tried with 14 emerge threads (hyperthreads - 2) this reduced the generate time from about 90s to about 60s. (For comparison loading the scene from disk takes less than 10s.)

Mapgen is V7.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 16, 2020

Where I think it might go wrong is this:

  1. block A is scheduled to be emerged.
  2. block A's chunk is loaded.
  3. now block B - in the same chunk is scheduled.
  4. block B's emerge is canceled since its chunk is being created.
  5. as part of the chunk creation block B is now created as well.
  6. block A is activated
    7. is block B ever activated? Next time we look for it, it will already have been created as part of the block A's emerge

Edit: Actually it's fine. The block is activated if/when it is later loaded as an active block... Just like it would be in the single threaded case.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 16, 2020

I think a better solution is to queue the chunk loading, then record all blocks in that chunk requested before it is done, and then activate these blocks once the chunk has been loaded. Bit more involved, though.

Edit: This should be good.

@lhofhansl lhofhansl removed the WIP label Nov 16, 2020
@lhofhansl
Copy link
Contributor Author

In case it's not clear, if the problem above existed it would already be a problem.
If any block is loaded as part of the chunk of another block emerge request it is not activated as part of the process. With or without this change.

As explained, though, that block is then activated by the ABM handler if/when needed.

@lhofhansl
Copy link
Contributor Author

It turns out that the actual BlockMakeData::blockpos_requested is not used anywhere.
(Other than to assert to be inside the chunk and the chunk is created from the blockpos anyway).
It's probably a historical left-over.

With that it is clear, though, that chunk should only be created once, on_generated be only once in Lua, etc.
This patch makes it so. For clarity and to avoid future confusion I removed blockpos_requested.

@lhofhansl
Copy link
Contributor Author

Tested in all kinds of scenarios. The behavior is the same as now, but it stops wasting CPU cycles and also avoid calling the Lua on_generated hook multiple times. Without this I question the usefulness of multiple emerge threads since nearly blocks are often generated together and thus a high chance of duplicate work.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 24, 2020

I found a difference in the light/shadow calculation after all. Sigh.

Instead of cancelling the request it should be pushed back on the queue to be picked up later when the other request is finished.

Update: Even when I re-schedule the same request again later, the light is not the same as in the single threaded case. If I let all the duplicate requests go through at the same time - with multiple loads of the same chunks the light looks like in the single-threaded case.

But if I let the requesting thread wait for the other thread to create the chunk, then it is correct. That, obviously would still reduce the usefulness of multiple emerge threads. There is something really strange going on here. Meh.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 24, 2020

Arrgghhh... The never-ending sage... I can produce the same difference if I just position my camera the right way (without this PR and with a single emerge thread). I ran into a general map gen bug with lighting.

I have to position the camera so that the block above is just not visible. In that case it is not used for the shadow spread.

I do not know how to fix it.

Now we have a choice to make. The sunlight/shadow code seems to be general fickle and bad. Sometimes the order in which the blocks are loaded is significant - even though it is nowhere guaranteed. This change changes the order depending on which thread makes it first. Sometime this can be wrong, something it can be right. I happen to run into a case where it's wrong.

With difference I am seeing I cannot tell which one is right and which is wrong. And I can cause the same by just looking at the scene in a different way.

Calculating the same chunk multiple time and triggering the Lua code multiple times for the same code is bad.

I would still suggest merging this.

Edit: So it's not a bug. (I keep saying this. Someone else please play around with this.)

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 24, 2020

Here's the issue. This is WITHOUT this change. Notice how the only difference is my y position.
With this change is more likely to render the second option. fixlight will change it to the first option - so that seems to be correct one.

screenshot_20201124_103214
screenshot_20201124_103238

@lhofhansl lhofhansl added the WIP label Nov 24, 2020
@lhofhansl
Copy link
Contributor Author

More issues.... See the tree that is not generated (this is with this PR and multiple emerge threads)
screenshot_20201124_104013

There is something happening on block or chunk boundaries that I do not, yet, understand.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Nov 25, 2020

@paramat I see that each emerge thread gets its own copy of the MapGen. Is it guaranteed that as long as all MapGens are setup with the same parameters that they will always produce exactly the same result for the same MapChunk/block.

For example, could a call to makeChunk setup some cache that a future call of makeChunk would use?

I did try ensuring that a particular block is always handled by the same thread, but that did not change anything.

Edit: Even without this change I see flickering in how the map generated when using multiple emerge threads. You (very) briefly see one version, then as mapgen is called again (since this is without my change) the block is generated slightly differently. My change simply picks the first version, and there seems to be nothing right or wrong about that.

@paramat
Copy link
Contributor

paramat commented Nov 25, 2020

Here's the issue. This is WITHOUT this change. Notice how the only difference is my y position.

You probably already know this, but ...

You are near the border between mapchunks, y = 48. Small changes in your vertical position are probably altering the order of generation of the lower and upper mapchunks.
Shadow not being propagated downwards in the lower mapchunk is expected and something that happens 50% of the time. It happens when the lower mapchunk is generated before the upper mapchunk, because when the lower mapchunk is generated there is not yet any shadow above it to propagate downwards, it is unknown what the terrain will be like in the upper mapchunk.

So anyway, i am just pointing out that this shadow glitch is expected and already very common and so not a problem with this PR as long as it does not happen far more often.

With this change is more likely to render the second option.

More likely In what situation? Only when the player is very close to y = 48? Or in all situations? And how much more likely?

fixlight will change it to the first option - so that seems to be correct one.

The first screenshot is visually preferable of course, but it is not actually always 'correct' mapgen behaviour because this shadow glitch is the expected behaviour if the lower mapchunk is generated before the upper one, which happens 50% of the time.
So it is important to realise that fixlight does not always create the expected and 'correct' behaviour of mapgen lighting.

See the tree that is not generated (this is with this PR and multiple emerge threads)

The tree is part of the lower mapchunk generation but extends over the mapchunk border into the upper mapchunk.
Many aspects of mapgen extend beyond the mapchunk borders by up to 16 nodes: caves, dungeons, decorations, and this is essential.
Trees getting cut off at mapchunk borders when using multiple emerge threads is a known issue, see #5618 , it has always happened, but rarely, when using multiple emerge threads, so not a problem as long as it is still rare.

Is it guaranteed that as long as all MapGens are setup with the same parameters that they will always produce exactly the same result for the same MapChunk/block?

Possibly not, as mapgen differs according to which neighbour mapchunks are already generated, due to many structures extending beyond the mapchunk borders by up to 16 nodes (caves, dungeons, decorations).

@lhofhansl
Copy link
Contributor Author

Thanks @paramat . It seems then that mapgen is fundamentally incompatible with generating blocks in parallel.

Can one know which other chunks need to be generated first in order for the "current" chunk to be correct? If so, one do a topological sort to get the order right.

@lhofhansl lhofhansl removed the WIP label Nov 26, 2020
@lhofhansl
Copy link
Contributor Author

Not planning more work in this. The fix itself is just 3 lines of code. I spent a lot of time seeing where the map glitches are coming from, and I do not see a way to improve this. Multi-threaded emerging causes some glitches. This makes some worse and some less - depending on whether the first rendering attempt of a chunk is the right one or not.

@paramat would love if you could test this a bit as well.

@paramat
Copy link
Contributor

paramat commented Nov 26, 2020

See #5618 (comment)
I think the warning in emerge threads documentation needs improvement, perhaps something to do in this PR?

@paramat
Copy link
Contributor

paramat commented Nov 26, 2020

It seems then that mapgen is fundamentally incompatible with generating blocks in parallel.

I suspect this too (but would state that the other way around =), this is hinted at in past documentation of num_emerge_threads, see #5618 (comment) for more details.
It seems that multiple emerge threads causes occasional bugs with overgenerating structures: decorations, randomwalk caves, dungeons, 1-node terrain slices at mapchunk borders.

Can one know which other chunks need to be generated first in order for the "current" chunk to be correct?

I am not sure, even if so i get a feeling this would be a nightmare.
Overgeneration happens in all directions so i suspect any generation order would cause bugs.

This makes some worse and some less - depending on whether the first rendering attempt of a chunk is the right one or not.

Seems acceptable.

You (very) briefly see one version, then as mapgen is called again (since this is without my change) the block is generated slightly differently. My change simply picks the first version, and there seems to be nothing right or wrong about that.

I agree, as far as i understand this.

Copy link
Collaborator

@sfan5 sfan5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked through the code, didn't seem to violate any assumptions.
Tested with 12 emerge threads, no apparent issues.

@lhofhansl lhofhansl merged commit f1d72d2 into luanti-org:master Nov 27, 2020
BuckarooBanzay added a commit to pandorabox-io/pandorabox.io that referenced this pull request Nov 28, 2020
Includes lot of changes in regards to map loading, sending and saving
Engine commit: luanti-org/luanti@f1d72d2

Important changes:
* luanti-org/luanti#10670
* luanti-org/luanti#10637
* luanti-org/luanti#10616
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants