This repository has been archived by the owner on Mar 27, 2021. It is now read-only.

Investigate & resolve nondeterministic build errors #753

Open
sming opened this issue Feb 3, 2021 · 1 comment
Labels
  • build issues: issues with library versioning, compilation warnings and so on
  • codebase quality: high level issues pertaining to improvements / problems with the codebase's quality

sming commented Feb 3, 2021

  • I am a heroic developer
  • Who wants to have deterministic builds
  • So that I can retain my sanity and be productive instead of chasing random build errors

Design & Implementation Notes

feature/add-bigtable-timeout-settings-refactored com.spotify.heroic.GrpcClusterQueryIT > distributedFilterQueryTest FAILED
    java.lang.IllegalStateException: failed to create a child event loop
        Caused by:
        io.netty.channel.ChannelException: failed to open a new selector
            Caused by:
            java.io.IOException: Too many open files
    java.lang.NullPointerException
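
The `Too many open files` cause in the trace suggests the test JVM is hitting its per-process file descriptor limit. A minimal sketch for logging descriptor usage from inside the test JVM, assuming a Unix-like platform where the JDK exposes `com.sun.management.UnixOperatingSystemMXBean`. The helper name `fdUsage` is hypothetical and not part of Heroic:

```kotlin
import com.sun.management.UnixOperatingSystemMXBean
import java.lang.management.ManagementFactory

/**
 * Hypothetical helper (not part of Heroic): returns the (open, max) file
 * descriptor counts for this JVM, or null where the Unix MXBean is unavailable.
 */
fun fdUsage(): Pair<Long, Long>? {
    val os = ManagementFactory.getOperatingSystemMXBean()
    return (os as? UnixOperatingSystemMXBean)?.let {
        it.openFileDescriptorCount to it.maxFileDescriptorCount
    }
}

fun main() {
    val usage = fdUsage()
    if (usage == null) {
        println("fd counts unavailable on this platform")
    } else {
        println("open fds: ${usage.first} / limit: ${usage.second}")
    }
}
```

Calling this before and after each IT could show whether descriptors accumulate across tests, and comparing the reported limit between machines might help explain why only some of them fail.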
@sming sming self-assigned this Feb 3, 2021
@sming sming added build issues issues with library versioning, compilation warnings and so on codebase quality high level issues pertaining to improvements / problems with the codebase's quality labels Feb 3, 2021
@sming sming unassigned sming Feb 3, 2021
@sming sming self-assigned this Mar 24, 2021

sming commented Mar 24, 2021

Update

Found that removing the 4x multiplier from this method:

        @Provides
        @GrpcRpcScope
        @Named("worker")
        fun worker() = NioEventLoopGroup(Runtime.getRuntime().availableProcessors() * 4)

stops the exception from being thrown. But several questions remain unanswered:

  1. Why does this happen only on my and Sergey's machines?
  2. Why does commenting out a seemingly innocuous IT (testbasicWithNoDistribution) also stop the exception from being thrown?
  3. Is the problem we're encountering pertinent to production operation of Heroic, or is it a quirk of our machines, or just something that unit test code will exhibit?
  4. What is the "correct" fix for this? Removing the 4x multiplier is a poor workaround at best.
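
As a possible middle ground between keeping and removing the multiplier, the worker count could be capped. This is only a sketch under the assumption that descriptor pressure grows with the number of event loops (each NIO event loop opens its own selector, as the trace indicates). The helper `workerThreads` and its default cap of 16 are hypothetical and not taken from Heroic:

```kotlin
/**
 * Hypothetical helper (not part of Heroic): scales the worker count by
 * `multiplier` but never exceeds `maxThreads`. The default cap is an
 * arbitrary assumption chosen for illustration.
 */
fun workerThreads(cores: Int, multiplier: Int = 4, maxThreads: Int = 16): Int {
    require(cores > 0 && multiplier > 0 && maxThreads > 0) {
        "cores, multiplier and maxThreads must all be positive"
    }
    return (cores * multiplier).coerceAtMost(maxThreads)
}

fun main() {
    // A 2-core machine keeps the full 4x scaling; a 16-core machine is capped.
    println(workerThreads(2))
    println(workerThreads(16))
}
```

The provider could then pass `workerThreads(Runtime.getRuntime().availableProcessors())` to `NioEventLoopGroup`. This still only limits the symptom and leaves question 4 open.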

(CC @malish8632)
