Refactor/More proper use of DI for Synchronizer #7500

asdacap · 2024-09-27T08:55:02Z

Introduce DI

Why?

Reproducable service construction.
- This mean test and production use the same code to construct class.
- This reduces mistake and boilerplate.
- I'm sure many of you have experience the impressive amount of code needed to create integration tests. And those impressive code need to be updated everytime we change a constructor. And those code are not necessarily the same code used in prod.
Easier code change.
- This include refactor, new functionality or removing functionality.
- We have many boilerplate code that just construct classes. These code create friction everytime we add or remove functionality.
- I'm sure many of you have experience that changing a class means changing 6 other files that just wire the main class that you are changing. And don't forget integration tests. One or two class is fine, but the codebase is getting ridiculous, and more importantly, this add friction.
Lazy initialization and automatic dispose.
- Instead of figuring out which component we need to disable whenever we just want to do one thing, for example verify trie, or era import/export, with dependency injection we just need to resolve the component that we need and only the required dependencies is created.
- This saves a lot of time double checking what component is accidentally run unnecessarily, not just now, but in the future also.
- Ow, did I mention that it automatically dispose resolved service in the right order?
- Yes! It does! So you also, don't have to worry about how to dispose the service (and its dependencies!).
Pluggability
- We added hooks within the initialization for plugins to use.
- But these hooks are fixed. When a plugin can't achieve what it need to do, it mutates code in unexpected places, like the merge plugin modifying header validator in InitializeNetwork.
  - What if header validator is used before InitializeNetwork? Maybe not now, but in the future maybe. Maybe only some of the code uses the original header validator.
  - Automatic DI automatically resolve which class depends on which class so this is automatic and more importantly, reduces mistake.
  - And remember, what about integration tests?
- We want to move more code into plugins, but what would the interface of the plugins look like? The initialization order may not be the same with one combination of plugins versus the other.
- With dependency injection, its relatively easy to just extract TrieStore or Paprika or Portal or DiscV4/5 into an Autofac Module. The next question is then which part do you want to make plugin (and to be clear, you dont have to), not which hook to add.

Q&A

Do all code need to support dependency injection.
- No, no. You can always wrap it around a big class like CompositeDiscoveryApp, and that big class can be resolved with DI.
- But then whats the point?
  - Look, if it helps you, great. If it does not, then just don't. We don't have to do everything in one way.
  - However, it is in my opinion, that the application's subsystem's integration with each other should use dependency injection due to the benefit outline above and due to the sheer size of the application.
  - Then said, most part of the code can benefit using automatic dependency injection, and from what I can gather, most of the team member support dependency injection. Its impressive that we still don't have any sort of dependency injection in the code.
The configuration for dependency injection can be complicated maybe it will make the codebase more complicated.
- Perhaps, in fact I was originally against adding it.
- But IMO the codebase is already complicated and too large to handle without it, this move the complication to other part.
- However, even IF it is more complicated, you still get:
  - Reproducability in test.
  - Pluggability.
  - Easier code change.
  - Lazy initialization.
This PR only modify some of part of the codebase while kinda hackily inject the rest of the dependency into the container. What about the construction of the dependencies?
- The goal is to be able to support dependency injection on different initialization code slowly, building Module along the way.
- There will be multiple IContainer and this IS an antipattern.
- But we are starting to create the configuration which can be easily plugged in towards a more complete solution and that is the point.
  - On my last attempt, most of the initialization code is replaced, with Module, which is the point, but we have a lot of them. Putting one (or really, a few) huge pr is gonna conflict with other code. The way I see it, it must be done slowly, or we won't have it at all.
- Even now, it can remove some unnecessary boilerplate, see (or no longer see) IBlockDownloaderFactory.cs.
A lot of code is not compatible with DI because of mismatched constructor or mutating config/dependency at runtime or nullable dependency. We should modify them first.
- Firstly, knowing if it is compatible with DI or not is hard to notice at first without actually migrating it to DI.
  - Constructor conflict can be mitigated, by simply not changing it and passing its dependency one by one, which sounds pointless, but remember, you still got the rest of the benefit of DI like reproducable build and such.
  - The bigger problem is plugin integration which mutate INethermindApi, cyclic dependency, and lifetime.
  - And really, you won't realize them UNTIL you try to migrate.
- So really, either try to move it to DI first without refactoring them, or refactor them at the same time, or don't refactor them at all.
- Also, its easier to refactor code with dependency injection in the first place as you don't have to update 6 different place when you change the code.
- I'd say, just integrate it first imperfectly, THEN clean it up.
This constructor clearly can use ISyncFeed[] and we can loop them all and wait for them. And we can just don't declare some of the feed instead of passing in a NoopSyncFeed.
- Its going to be glaringly obvous that certain thing can be much easily or cleanly done some other way.
- But its glarily obvious because this PR introduce DI.
- Help me out here, let this PR land first, then that can be clean up and more importantly verify and test if it still works.
- DI make it easier to refactor. But it still need to work! Thats why I don't want to change too many things at once.
This <insert class name here> class can also be dependency injected.
- Look, help me out over here.
- Like, really, once it touch a class which can be dependency injected, its going to touch another class, then that class can be dependency injected, then the whole InitializeNetwork can be dependency injected, then the MergePlugin can be injected and so on.
- Which is great because you now have the drive to migrate things. Help me by doing exactly that... on a different PR, because trust me, moving everything is a lot code and a lot of work, (the integration test! think of the integration tests!), and if we make this PR large, its going to start to conflict with other PR.
- Plus someone else can start adding DI on another part of the code in parallel.
- If you feel itchy with InitializeNetwork specifically you can look at this branch which migrate the whole InitializeNetwork.
Test code shows that now Synchronizer is completely opaque: there is no way to know that we have to build a IServiceCollection with those specific services without looking into the implementation..
- If its dependencies are also declared nicely then we DONT have to build them. Currently, this is an imperfect implementation, thats why there are list of them. This will be the case, until we migrate enough of the code.
- That said, it still can know which service is required because it will fail at runtime, which IS a downside.
- But think of the upside too, the wiring of the test can be made automatic AND use the same code as prod initilizer code, see this, this, this.
- Keep in mind this test is an integration tests, it must construct those other classes.
Give me a vision.
- There are no INethermindApi.
- Some IStep is removed as it only construct but not run.
- A new IStepMeta is used to declare steps, its dependency and dependents.
  - Can and should be declared by plugins.
  - The IStep for the IStepMeta is only resolved on its turn, meaning that if the step does not need to be constructed if it does not need to run.
  - There is now a final FullAppStep that starts other needed stops. Maybe you just need VerifyTrieStop, or EraImportStep.
  - You are no longer anxious that the RPC or the network or the special consensus client is started, or if the state need a special network bootup for some reason when you just need to provide simple import export.
- Plugins are essentially Autofac's Module, registered after all core module, lets call it NethermindModule, which essentially have all components needed to start nethermind.
- Integration tests, so high level, its like hive test, just faster, can be created with a container that have a final module, TestEnvironmentModule which replaces db and network.
- Same set of tests can be used with a different plugin by simply registering the plugin's module after NethermindModule, before TestEnvironmentModule.
- Paprika is a Module, pruning trie store is a Module, portal state store is a Module.

Changes

This time it resolve ISynchronizer completely from container.
Also INethermind*Api would fetch from container if implemented. Since ISynchronizer also forward ISyncModeSelector and ISyncProgressResolver, might as well directly get them from container also.
This provider a pretty straightforward way of adding DI little by little without moving everything, however, having a full DI can be hard to comprehend. See how full/fast sync and fast/beacon header is handled here. But at the same time it allows plugin to have a single point of configuration see MergeSynchronizer.ConfigureMergeComponent.
Although more complicated, AutoFac also brings automatic niceness like automatic ILogger, and IConfig, and IDb and friends, but not implemented here.

Types of changes

What types of changes does your code introduce?

Refactoring

Testing

Requires testing

Yes
No

If yes, did you write tests?

Yes
No

Notes on testing

…operer

LukaszRozmej

I understand that this is a stop-gap to start moving into DI direction.
That being said I am confused about two things:

Multiple containers - I kind of don't see how that helps to achieve the goal? If the goal is to have seamless DI than this feels like counter productive. While it is fine not to migrate everything at once, how we should handle container(s) is the major design decision that we should investigate upfront and have some sensible answer (even if we change it later). Otherwise I think this doesn't achieve that much and potentially in itself increases complexity rather than decreases it?
Direct registrations vs module registrations - this is probably unavoidable, but I would expect to understand the divide better, but I don't at the moment.

Would love to hear more opinions and reviews before we progress.

src/Nethermind/Nethermind.Api/IApiWithNetwork.cs

src/Nethermind/Nethermind.Db/IDbProvider.cs

LukaszRozmej · 2024-10-01T09:24:21Z

src/Nethermind/Nethermind.Merge.Plugin/Synchronization/MergeSynchronizer.cs

+public class MergeSynchronizer(
+    [KeyFilter(nameof(BeaconHeadersSyncFeed))] SyncFeedComponent<HeadersSyncBatch> beaconHeaderComponent,
+    ISyncConfig syncConfig,
+    Synchronizer baseSynchronizer,


Suggested change

Synchronizer baseSynchronizer,

ISynchronizer baseSynchronizer,

?

Need the WireFeedWithModeSelector which does not fit nicely into ISynchronizer. Alternatively, could copy the whole thing, which is also, not very nice.

asdacap · 2024-10-01T10:25:29Z

The end goal is in fact single container. The problem is that, we don't have complete components.
- For example, let say we have service, a, b, c, constructed in step A, B, C. b depends on a, and c depends on b.
- Let say b is not migrated, but is used by c.
- If we use single container, during the container building (which is gonna happen in A), b is not gonna get constructed yet at - that point, so it is not in the container and so c initialization will fail.
One solution is to have different lifetime for c where during the configuration of the lifetime in C, it would pick up b.
Another solution is to have more magic with Autofac that might solve by fetching b again at runtime with a custom registration source or something.
Another solution is to wait until b is migrated and when that happen, we can combine the three container into one.

asdacap · 2024-10-01T10:28:45Z

I like the approach that each registration is done in a module. It looks nicer and make it easier to reuse in tests. I just don't do it completely here.

Co-authored-by: Lukasz Rozmej <[email protected]>

emlautarom1

Highly appreciate the detailed summary.

I'm sure many of you have experience that changing a class means changing 6 other files that just wire the main class that you are changing.

Agree, this is rather annoying but it's not a dealbreaker in my opinion.

Lazy initialization and automatic dispose.

I think this is a killer feature, in particular lazy initialization that removes the need to keep track which things are initialized first.

Pluggability

I also think this is a really important feature considering how we're moving towars a plugin-based architecture. We had to fall back to static properties during the Transaction refactoring (#7313) due to the lack of hooks or required initialization order.

The configuration for dependency injection can be complicated maybe it will make the codebase more complicated.

I don't think so. Initialization code is already complicated and has a lot of ad-hoc mechanisms like "WaitForX" to make it work. A unified DI container based approach should be more manageable.

Even now, it can remove some unnecessary boilerplate

This is good. We're closing to ~300.000 LOC and any deletion is welcomed.

Help me out here, let this PR land first, then that can be clean up and more importantly verify and test if it still works.

I'll approve since I agree with the vision and the proposed PR contains reasonable, localized changes (as you mention, let's go slowly step by step).

The end goal is in fact single container. The problem is that, we don't have complete components.

For example, let say we have service, a, b, c, constructed in step A, B, C. b depends on a, and c depends on b.

Let say b is not migrated, but is used by c.

If we use single container, during the container building (which is gonna happen in A), b is not gonna get constructed yet at - that point, so it is not in the container and so c initialization will fail.

I don't fully understand the example. Do we have a and c migrated? If so, I think the problem is that we migrated in the wrong order, that is, start from the leaves and move upwards (c, then b, then a). Maybe you could elaborate further?

Regarding the proposed solutions to this problem, I would avoid the introduction of lifetimes or Autofac magic for these intermediate refactoring steps.

src/Nethermind/Nethermind.Api/NethermindApi.cs

src/Nethermind/Nethermind.Synchronization/ParallelSync/NoopSyncFeed.cs

asdacap · 2024-10-03T00:09:36Z

Well, yes. But the point is to allow migrating out of order.

flcl42

Could you provide like a big picture of into what specific modules and components you would split the system

src/Nethermind/Nethermind.Synchronization/Synchronizer.cs

alexb5dh · 2024-10-03T13:21:23Z

src/Nethermind/Directory.Packages.props

@@ -6,6 +6,8 @@
    <PackageVersion Include="AspNetCore.HealthChecks.UI" Version="8.0.1" />
    <PackageVersion Include="AspNetCore.HealthChecks.UI.Client" Version="8.0.1" />
    <PackageVersion Include="AspNetCore.HealthChecks.UI.InMemory.Storage" Version="8.0.1" />
+    <PackageVersion Include="Autofac" Version="8.1.0" />


Why is Autofac needed on top of Microsoft.Extensions.DependencyInjection? If because of modules - can it be replicated by using separate IServiceBuilders instead?

This would be my question as well. I believe Autofac provides named registrations while Microsoft.Extensions.DependencyInjection may lack it. Not sure if this is the case.

I was wrong. ServiceKey etc are the way to go.

For named registrations, there's IKeyedServiceProvider.

Same reason as outlined here #6483.

I originally wanted to stick with Microsoft.Extensions.DependencyInjection, but can't make it work. See #7485.

src/Nethermind/Nethermind.Api/IApiWithBlockchain.cs

Scooletz · 2024-10-03T14:47:24Z

I like the idea of automatic composability. It's pleasing.
I don't understand why Autofac and go straight with Microsoft.Extensions.DependencyInjection. What is the reason?
I think that the initial proposition should go with one container and there should be a limitation of having a single one. This will increase the cost of adoption but will never leave oh there are 5 so I can add one more.
This PR promises deletion and less configuration yet it doubles the registrations and resolutions in tests. To sell this idea, IMHO, one needs to remove some dirt from test initialization.

I think that in its current form, this brings not that much. More deletions, less configuration in tests, one container. That would be my goal.

Co-authored-by: Alex <[email protected]>

asdacap · 2024-10-03T16:17:59Z

Autofac because it has more feature, which can be a downside that is true. Specifically the main one is the ability to create and reconfigure hierarcy of lifetime. In this example, it is used to support full sync and fast sync, which use the exact same type set of service but only change IFeed. If its only one class, then its fine, can just use named type. But when it become a graph of them then it become a problem. This is needed in nethermind, specifically with block processing/block producer using the same type.

Also, with a simple registration source (not in this PR), you can inject ILogger directly instead of ILogManager. Which is nice. Same thing with IConfig.

I tried single container before, the cost is severe. See #6483, #6531. At least with multiple container at first, we would already have the module configuration.

More reduction in boilerplate can be seen here https://github.com/NethermindEth/nethermind/pull/7520/files#diff-2d1d749000abfdbd86d6f5187b2765d94b7e83dcb99dcb55f9f4d51fecf08c16. That will be after this PR.

asdacap · 2024-10-03T16:36:56Z

Could you provide like a big picture of into what specific modules and components you would split the system

In the last attempt there is a:

BaseModule
DatabaseModule
CoreModule
StateModule
BlockchainModule
NetworkModule
KeyStoreModule

Plugins are literally more module. I guees whats left is block producer and rpc. And network was not complete at that time..

asdacap added 25 commits September 24, 2024 22:29

Seems nice

4df93ed

Snap provider

4b3cc69

Sync dispatcher

479a3f8

Dedup

0a52825

Multi sync mode selector

3144f3b

Fast and full sync

c85f3e8

The merge synchronizer

c126f95

Use reflection

5d21826

Fic test

3d90236

Minor cleanup

b9dac97

Remote block downloader

0f5343d

Reduce constructor parameter

1181068

Add unit tests

54facc1

Cleanup

67fe004

Whitespace

aecaea1

Pass through the servide collection

867f1f3

Use a need to wait for header config

cb82a56

Remove more manual construction

7b2b7ec

Whitespace

13e167f

Merge remote-tracking branch 'origin/master' into feature/minor-di

7ba6845

Do this more properly

9b0cf3d

Set sync mode selector from container directly

b06b200

Get component directly from container

8034622

Remove unnecessary code

915047e

I thought I specifically fixed this before

e4b3baa

asdacap requested review from jmederosalvarado and emlautarom1 September 27, 2024 11:11

Merge remote-tracking branch 'origin/master' into feature/minor-di-pr…

fd8845e

…operer

asdacap marked this pull request as ready for review September 27, 2024 11:59

asdacap requested a review from rubo as a code owner September 27, 2024 11:59

asdacap added 4 commits September 30, 2024 20:31

Fic build

ee4cf91

Merge branch 'fix/build-on-nodeloaders-test' into feature/minor-di-pr…

69bcc78

…operer

Merge remote-tracking branch 'origin/master' into feature/minor-di-pr…

dfd0f34

…operer

Merge remote-tracking branch 'origin/master' into feature/minor-di-pr…

f8612f4

…operer

LukaszRozmej reviewed Oct 1, 2024

View reviewed changes

asdacap and others added 3 commits October 1, 2024 18:31

Update src/Nethermind/Nethermind.Api/IApiWithNetwork.cs

399fe76

Co-authored-by: Lukasz Rozmej <[email protected]>

Update src/Nethermind/Nethermind.Db/IDbProvider.cs

2083c73

Co-authored-by: Lukasz Rozmej <[email protected]>

Update src/Nethermind/Nethermind.Db/IDbProvider.cs

101b779

Co-authored-by: Lukasz Rozmej <[email protected]>

emlautarom1 approved these changes Oct 2, 2024

View reviewed changes

src/Nethermind/Nethermind.Api/NethermindApi.cs Show resolved Hide resolved

src/Nethermind/Nethermind.Synchronization/ParallelSync/NoopSyncFeed.cs Show resolved Hide resolved

Address comment

54680c7

LukaszRozmej requested review from MarekM25, benaadams, flcl42, ak88, alexb5dh, Marchhill, deffrian, Demuirgos, marcindsobczak, damian-orzechowski and smartprogrammer93 October 3, 2024 11:52

flcl42 reviewed Oct 3, 2024

View reviewed changes

src/Nethermind/Nethermind.Synchronization/Synchronizer.cs Show resolved Hide resolved

src/Nethermind/Nethermind.Synchronization/Synchronizer.cs Show resolved Hide resolved

alexb5dh reviewed Oct 3, 2024

View reviewed changes

Update src/Nethermind/Nethermind.Api/IApiWithBlockchain.cs

c0dd797

Co-authored-by: Alex <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor/More proper use of DI for Synchronizer #7500

Refactor/More proper use of DI for Synchronizer #7500

asdacap commented Sep 27, 2024 •

edited

Loading

LukaszRozmej left a comment

LukaszRozmej Oct 1, 2024

asdacap Oct 3, 2024

asdacap commented Oct 1, 2024

asdacap commented Oct 1, 2024

emlautarom1 left a comment

asdacap commented Oct 3, 2024

flcl42 left a comment •

edited

Loading

alexb5dh Oct 3, 2024 •

edited

Loading

Scooletz Oct 3, 2024

Scooletz Oct 3, 2024

rubo Oct 3, 2024

asdacap Oct 3, 2024

Scooletz commented Oct 3, 2024

asdacap commented Oct 3, 2024

asdacap commented Oct 3, 2024

	Synchronizer baseSynchronizer,
	ISynchronizer baseSynchronizer,

Refactor/More proper use of DI for Synchronizer #7500

Are you sure you want to change the base?

Refactor/More proper use of DI for Synchronizer #7500

Conversation

asdacap commented Sep 27, 2024 • edited Loading

Introduce DI

Why?

Q&A

Changes

Types of changes

What types of changes does your code introduce?

Testing

Requires testing

If yes, did you write tests?

Notes on testing

LukaszRozmej left a comment

Choose a reason for hiding this comment

LukaszRozmej Oct 1, 2024

Choose a reason for hiding this comment

asdacap Oct 3, 2024

Choose a reason for hiding this comment

asdacap commented Oct 1, 2024

asdacap commented Oct 1, 2024

emlautarom1 left a comment

Choose a reason for hiding this comment

asdacap commented Oct 3, 2024

flcl42 left a comment • edited Loading

Choose a reason for hiding this comment

alexb5dh Oct 3, 2024 • edited Loading

Choose a reason for hiding this comment

Scooletz Oct 3, 2024

Choose a reason for hiding this comment

Scooletz Oct 3, 2024

Choose a reason for hiding this comment

rubo Oct 3, 2024

Choose a reason for hiding this comment

asdacap Oct 3, 2024

Choose a reason for hiding this comment

Scooletz commented Oct 3, 2024

asdacap commented Oct 3, 2024

asdacap commented Oct 3, 2024

asdacap commented Sep 27, 2024 •

edited

Loading

flcl42 left a comment •

edited

Loading

alexb5dh Oct 3, 2024 •

edited

Loading