handle relay addr dialing intelligently #118
base: master
Conversation
Force-pushed from 1422eb8 to 7f178e1
Just feedback on the design, not on the code. I actually dislike pacing dials based on hardcoded time delays. It feels like we're artificially penalising nodes with good networks.
I'd prefer a feedback loop (like in dialer v2), where the scheduling component receives the result of the dial, and reacts by emitting a new job if necessary.
That way we pace dials by reaction, not by assumption.
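To make the reaction-based idea concrete, here is a rough sketch of such a loop, where the next job is emitted in response to the previous dial's failure rather than after a fixed timer. All names here are hypothetical; this is not dialer v2's actual code.

```go
// Sketch of a reaction-driven dial loop: the next dial is emitted in
// response to the previous result, not after a hardcoded delay.
package dialer

import (
	"context"

	ma "github.com/multiformats/go-multiaddr"
)

type dialResult struct {
	addr ma.Multiaddr
	err  error
}

// scheduleByReaction dials pending addrs one at a time, emitting a new
// job only when the previous dial reports failure.
func scheduleByReaction(ctx context.Context, pending []ma.Multiaddr,
	dial func(context.Context, ma.Multiaddr) error) error {

	results := make(chan dialResult, 1)
	var lastErr error
	for _, a := range pending {
		go func(a ma.Multiaddr) {
			results <- dialResult{addr: a, err: dial(ctx, a)}
		}(a)

		select {
		case res := <-results:
			if res.err == nil {
				return nil // connected; stop emitting jobs
			}
			lastErr = res.err // failure is the trigger for the next job
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return lastErr
}
```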
Force-pushed from 7f178e1 to e55ea43
Implementing the above feels like we'd be doing a poor man's imitation of dialer v2. I think I have the bandwidth this week to take it over the finish line (now that I got the huge refactor out of the way).
I totally agree, but that's a little too hard with the current dialer; it will need a whole bunch of refactoring... which is what we are doing in dialer v2.
How urgently do we need this? Can it wait 1 week for a cleaner solution (dialer v2)?
We can wait a week; there is no imminent ipfs release. But I fear v2 won't be ready by then either :p
And also, I don't want to rush you on this. We can live with the patch for a while.
Yeah, it's had a rocky history, due to my focus being all over the place :-( But it's clear we have increased pressure and impetus on it now, so I can commit to delivering it within a week.
^^ may be reckless to give a concrete 1 week horizon. Let's say 2.
ok, this patch can stay in the queue for a week or two. If it turns out v2 is not gonna be ready for a while longer, we can merge and go ahead. Also, please take into account the new logic about relay dialing: we want to prioritize connected relays and aggregate their addrs.
Hey @raulk, if it eases some of your burden, you could commit the intermediate code to some branch along with the TODOs you think need to be done for the master branch. Contributors can help you achieve those tasks. I am interested in working on the dialer v2 logic and would be happy to contribute.
@upperwal thanks for the offer, very much appreciated. Given the nature of the change, it's best if executed by a single individual. I suggest kicking off a discussion in the forums for the community to propose projects where help is needed: https://discuss.libp2p.io
@raulk 👍
@vyzo should I review this?
@Stebalien I would say take a look at the scheduling logic, as we want it to carry over to dialer v2.
I agree with the logic here but there has to be a more general way to do this.
- Transport preference. When constructing libp2p, users could specify some kind of policy for dialing with various transports: ordering, delays, etc.
- Batching. I wonder if the transport interface needs to be changed to take a set of multiaddrs.
A less intrusive alternative would be to allow transports to inject their own "planner" modules that take in an existing plan and update it (somehow).
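As a rough illustration of what such a pluggable planner could look like: none of these types exist in libp2p; they are hypothetical names sketching the shape of the idea.

```go
// Hypothetical sketch of a transport-injected "planner" hook. None of
// these types exist in libp2p today; they only illustrate how a
// transport could rewrite the dial plan.
package dialplan

import (
	"time"

	ma "github.com/multiformats/go-multiaddr"
)

// DialJob is one scheduled dial: an addr plus the delay after which it
// should be attempted.
type DialJob struct {
	Addr  ma.Multiaddr
	Delay time.Duration
}

// Planner receives the plan built so far and returns an updated one,
// e.g. reordering, batching, or delaying the addrs its transport owns.
type Planner interface {
	UpdatePlan(plan []DialJob) []DialJob
}

// A transport that wants custom scheduling would also implement this,
// letting the swarm collect planners when it assembles a dial.
type PlannerProvider interface {
	DialPlanner() Planner
}
```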
if err != nil {
	// unspecific relay addr; push it to immediate -- note that these addrs should not be
	// advertised any more, but we don't want to break older nodes or tests
	immediateAddrs = append(immediateAddrs, a)
Shouldn't we just drop this?
Maybe, I really don't know what to do with these!
Maybe we should schedule them for last, after all specific relay addrs have been scheduled.
Do we really gain anything? This literally means: dial me through random relays.
Probably not.
Only concern is that we do "support" users requesting dials with ipfs swarm connect /p2p-circuit/ipfs/QmTarget, and we may not want to completely break that.
Gah. What if we just hack that into a higher layer? Or just drop it? Nobody should be relying on that in production and I don't feel all that bad about breaking demos.
Maybe we should drop it -- but we might want to disable circuit.OptDiscovery in ipfs as well. We can substitute it with /ipfs/QmRelay/p2p-circuit/ipfs/QmTarget anyway.
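For reference, the distinction in this thread is between specific relay addrs (which name the relay before the /p2p-circuit component) and unspecific ones (which don't). A minimal sketch of telling them apart with go-multiaddr; the splitRelay helper is ours, not an existing API, and the peer ID is just an example value.

```go
package main

import (
	"fmt"

	ma "github.com/multiformats/go-multiaddr"
)

// splitRelay splits a circuit addr at the /p2p-circuit component. An
// empty relay part means the addr is unspecific: "dial me through some
// relay" without naming one.
func splitRelay(a ma.Multiaddr) (relay, rest ma.Multiaddr) {
	return ma.SplitFunc(a, func(c ma.Component) bool {
		return c.Protocol().Code == ma.P_CIRCUIT
	})
}

func main() {
	specific := ma.StringCast(
		"/ip4/1.2.3.4/tcp/4001/p2p/QmcgpsyWgH8Y8ajJz1Cu72KnS5uo2Aa2LpzU7kinSupNKC/p2p-circuit")
	unspecific := ma.StringCast("/p2p-circuit/p2p/QmcgpsyWgH8Y8ajJz1Cu72KnS5uo2Aa2LpzU7kinSupNKC")

	for _, a := range []ma.Multiaddr{specific, unspecific} {
		relay, _ := splitRelay(a)
		isSpecific := relay != nil && len(relay.Protocols()) > 0
		fmt.Printf("%s -> specific relay: %v\n", a, isSpecific)
	}
}
```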
We have a problem with dialing relay addrs: we dial all relays in the address set simultaneously and end up being connected to all of them.
This unnecessarily increases the load on relays, and we end up initiating a bunch of parallel dials to all relay addrs. Couple that with addrsplosion, and we make a mess trying to dial relays.
This patch implements a dial scheduler that schedules relay addrs accordingly: relay addrs are grouped by relay (/p2p/QmRelay/p2p-circuit) and issued serially, with a 1.5s delay between them.

TBD: tests.
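A compact sketch of the scheduling idea described above, assuming the grouping/pacing shape the description implies; helper names and the dial stub are illustrative, not the actual patch.

```go
// Sketch of the pacing idea: bucket circuit addrs by relay and issue
// the buckets serially with a fixed delay, instead of dialing every
// relay at once.
package dialsched

import (
	"context"
	"time"

	ma "github.com/multiformats/go-multiaddr"
)

const relayDelay = 1500 * time.Millisecond // the 1.5s pacing above

// groupByRelay buckets circuit addrs by their relay part; unspecific
// addrs (no relay named) land under the "" key.
func groupByRelay(addrs []ma.Multiaddr) map[string][]ma.Multiaddr {
	groups := make(map[string][]ma.Multiaddr)
	for _, a := range addrs {
		relay, _ := ma.SplitFunc(a, func(c ma.Component) bool {
			return c.Protocol().Code == ma.P_CIRCUIT
		})
		key := ""
		if relay != nil {
			key = relay.String()
		}
		groups[key] = append(groups[key], a)
	}
	return groups
}

// dialPaced walks the relay groups serially, pausing relayDelay between
// groups, and stops as soon as one dial succeeds.
func dialPaced(ctx context.Context, addrs []ma.Multiaddr,
	dial func(context.Context, ma.Multiaddr) error) error {

	var lastErr error
	first := true
	for _, group := range groupByRelay(addrs) {
		if !first {
			select {
			case <-time.After(relayDelay):
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		first = false
		// the real patch aggregates a relay's addrs; here we just try
		// the first addr in the bucket.
		if err := dial(ctx, group[0]); err != nil {
			lastErr = err
			continue
		}
		return nil
	}
	return lastErr
}
```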