Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/iter branch and bound #30

Merged

Conversation

yancyribbens
Copy link
Collaborator

Implement Branch and Bound search.

@yancyribbens yancyribbens force-pushed the feature/iter-branch-and-bound branch 4 times, most recently from 45ad1b1 to e153843 Compare March 6, 2024 10:56
@yancyribbens
Copy link
Collaborator Author

No new code changes with the last push, just re-arranging commits and touchup docs. Considering this a candidate to merge soonish.

@yancyribbens yancyribbens force-pushed the feature/iter-branch-and-bound branch from e153843 to bd8dfec Compare March 7, 2024 10:01
@yancyribbens
Copy link
Collaborator Author

Fixed a few more typos in the comments, no code changes.

@murchandamus
Copy link

Starting my review

Copy link

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better!

@murchandamus I looked over your checklist of additional tests to add as well as the questions you posed. I think your question of what will cause BnB to fail was the most interesting.. Please see bellow.

What happens if you have multiple UTXOs that have only slightly different values?

select_coins_bnb_set_size_five() and select_coins_bnb_set_size_seven() both have slightly different values

What I meant to point out here was that if you have a UTXO pool that has many UTXOs with slightly different effective values, e.g.: the 200 UTXOs {0.00100000, 0.00100001, 0.00100002, 0.00100003, …, 0.00100198, 0.00100199}, you can quickly exhaust your iteration limit, if a combination of multiple of these is close to the target (here for example 0.005). But you got that covered now with the iteration limit tests!

What happens if you have many UTXOs?

select_coins_bnb_set_size_seven() has many UTXOs.

Regular end-user wallets easily venture into the territory of dozens of UTXOs, especially since DCA for small amounts has become so popular. Enterprise wallets can easily accumulate thousands of UTXOs. The biggest wallet I’ve seen had over 550'000 UTXOs. When I recommend testing a wallet with many UTXOs I was thinking of something like ~1000 UTXOs, not 7. In a follow-up PR, you could perhaps think about how you may explore a search tree more quickly that has many duplicates.

What happens if there are many alternative solutions that are not equivalent, when the first is not the best?

This is exemplified by test more_inputs_when_cheap. The test sets the current fee_rate to be 10 sat per kwu and the long term fee rate to be 20 sat per kwu. Therefore, the more inputs that are selected the better. A target of 6 is created and a utxo pool [1, 2, 3, 4]. The search routine first finds solution [2, 4] which is not the best solution. Since it is cheaper now (long_term_fee is higher) each utxo that is added reduces the waste. Therefor, the next solution that is found [1, 2, 3] has a lower waste metric and is found second.

Yup, good! This is also covered with select_coins_bnb_set_size_seven() now.

Instead of testing with a feerate of 0, make a helper function that allows you to create UTXOs per their effective value, then run your tests at high and low feerates. Adjust the expected results accordingly.

select_coins_bnb_consume_more_inputs_when_cheap() and select_coins_bnb_consume_less_inputs_when_expensive() both test the behavior with different combinations of long_term and short_term feerate. Not to ignore your suggestion about a testing helper function that allows the creating of UTXOs at different effective values, although I was thinking of restructuring BnB so that it takes a list of effective_values instead of a list of UTXOs. That way, in lib, we would compute the effective_value once and pass the results to either BnB, SRD or whatever other search function is needed. However, I wasn't planning to do that as part of this PR. I don't know how that might change your suggestion to add a helper that calculates effective_value.

That’s okay, you could do it in a follow-up. My general idea would be that your UTXOs should have a weight, and when you pass a feerate to your function, it automatically calculates the corresponding fee. That way it would be easy to create a UTXO that has a given effective value, or a UTXO that has a given amount but still a fee.

Add tests in which BnB fails to find a solution or fails to find the best solution. When does this happen and why?

select_coins_bnb_exhaust fails to find a solution because the iteration limit occurs before a solution is found. //…snip…//

Looks good!

Test what happens when you run into the iteration limit (i.e. you should have cases in which you find no solution, not the best solution)

see select_coins_bnb_exhaust and select_coins_bnb_exhaust_v2

Yeah, they both do not find a solution, but what happens if you find a solution before running into the iteration limit, but you are not done searching?

Optionally make a test that generates a random UTXO pool and target=
where you compare your outcome with the result of a:

I still have this as a TODO. I'm not sure if I'll include it as part of this PR or a followup.

Sure that’s not the easiest. You could perhaps do something like this Knapsack test I wrote to generate a diverse UTXO set and then randomly pick seven UTXOs, sum them up, and use that as target for a BnB search. The test would then assert that you should find a solution, and that this solution has a waste that is equal or lower than if you had used the input set you randomly picked.

I would also suggest that you write a test in which a solution with excess and fewer inputs is preferred over a solution with more inputs at high feerates.

src/branch_and_bound.rs Outdated Show resolved Hide resolved
src/branch_and_bound.rs Outdated Show resolved Hide resolved
src/branch_and_bound.rs Show resolved Hide resolved
src/branch_and_bound.rs Outdated Show resolved Hide resolved
src/branch_and_bound.rs Show resolved Hide resolved
@yancyribbens yancyribbens force-pushed the feature/iter-branch-and-bound branch 3 times, most recently from efa4668 to 0114f2a Compare March 8, 2024 13:56
@yancyribbens
Copy link
Collaborator Author

@murchandamus

I would also suggest that you write a test in which a solution with excess and fewer inputs is preferred over a solution with more inputs at high feerates.

Good idea. I added a test here: 92c4e6b

Yeah, they both do not find a solution, but what happens if you find a solution before running into the iteration limit, but you are not done searching?

Good question. I added a test to show what happens here: 0114f2a

I've made note for the next PR along with some other refactors to

  • use reasonable fee_rate and effective value
  • consolidate common tests with a helper function
  • keep at least on test with a zero feerate.
  • optionally make a test that generates a random UTXO pool and target

I think the most critical comments have been addressed. If the fixups all look good I'll squash and merge. Thanks for all the feedback!

@yancyribbens
Copy link
Collaborator Author

As a side note, it would be nice maybe to return the iterations in some way which would make all of the iteration testing more meaningful.

Copy link

@murchandamus murchandamus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm looks good to me now and the test coverage seems to be fairly complete.

What I haven’t looked into, and remains an open question to me is how the algorithm integrates with the greater code base. The tests set magic numbers for feerate, long term feerate estimate, fees, and cost_of_change. How is select_coins_bnb called from the transaction building? How do the UTXOs get preprocessed? How is the algorithm integrated into the transaction sending, and how do you ensure that the UTXOs are passed with correct and complete values for the call? When transaction building gets back an input set that does not create change, will it build the transaction correctly, skip the change output and drop the excess to the fees?

I’m not sure whether that’s in scope for this PR, but you may want to scan the PR one more time with the interface in mind, and perhaps consider adding some form of integration test that includes having a wallet set up with UTXOs and a transaction being built from there at a specific feerate that results in a changeless transaction that used BnB under the hood.

src/branch_and_bound.rs Show resolved Hide resolved
@murchandamus
Copy link

As a side note, it would be nice maybe to return the iterations in some way which would make all of the iteration testing more meaningful.

In CoinGrinder I return the iteration count in the selection result object for exactly that reason.

@yancyribbens
Copy link
Collaborator Author

Enterprise wallets can easily accumulate thousands of UTXOs. The biggest wallet I’ve seen had over 550'000 UTXOs. When I recommend testing a wallet with many UTXOs I was thinking of something like ~1000

A few more TODOs I forgot to mention earlier for the next iteration:

  • Add testing for much larger sets.
  • In a follow-up PR, you could perhaps think about how you may explore a search tree more quickly that has many duplicates.

@yancyribbens
Copy link
Collaborator Author

The algorithm looks good to me now and the test coverage seems to be fairly complete.

Thanks!

What I haven’t looked into, and remains an open question to me is how the algorithm integrates with the greater code base. The tests set magic numbers for feerate, long term feerate estimate, fees, and cost_of_change. How is select_coins_bnb called from the transaction building? How do the UTXOs get preprocessed? How is the algorithm integrated into the transaction sending, and how do you ensure that the UTXOs are passed with correct and complete values for the call? When transaction building gets back an input set that does not create change, will it build the transaction correctly, skip the change output and drop the excess to the fees?

I've been looking into this, although I'm still exploring these questions. @Tibo-lg can speak more to this than I can, but Rust-DLC has some simple wallet interface that combines coin-selection with electrs to create wallet functionality of some fashion. My intention with the project is to include this as a rust-bitcoin crate to make it available to anyway as a basic building block for wallet construction. I've been meaning to try out electrs and interface that with some simple wallet software as well. Anyway, I'm not sure all the mechanics of RPC to facilitate transaction creation/sending etc yet, although I'm looking forward to exploring that more next :)

I’m not sure whether that’s in scope for this PR, but you may want to scan the PR one more time with the interface in mind, and perhaps consider adding some form of integration test that includes having a wallet set up with UTXOs and a transaction being built from there at a specific feerate that results in a changeless transaction that used BnB under the hood.

I'm not satisfied with the API yet enough to mark this as complete and ready for use, although I'm satisfied enough to merge this PR shortly. There's still some ongoing discussion about the interface and getting a review to bring this into rust-bitcoin. Also, there's still work to be done to add fuzz testing which I think is important to make sure it's a high quality implementation.

@murchandamus
Copy link

I'm not satisfied with the API yet enough to mark this as complete and ready for use, although I'm satisfied enough to merge this PR shortly. There's still some ongoing discussion about the interface and getting a review to bring this into rust-bitcoin. Also, there's still work to be done to add fuzz testing which I think is important to make sure it's a high quality implementation.

If you are going to make a crate that is called by completely unrelated software, you would probably want to put more verification of your assumptions into your code. E.g. you may want to ensure that every UTXO has a reasonable weight and/or fee beside the amount, if fee and weight are both present that they match with the feerate, etc.

@yancyribbens
Copy link
Collaborator Author

yancyribbens commented Mar 9, 2024

If you are going to make a crate that is called by completely unrelated software, you would probably want to put more verification of your assumptions into your code. E.g. you may want to ensure that every UTXO has a reasonable weight and/or fee beside the amount, if fee and weight are both present that they match with the feerate, etc.

I'm not sure if you've seen InputWeightPredict in Rust Bitcoin. I was thinking of next changing the API to accept a list of WeightPredictions as well as values. That way, every UTXO can have an associated weight prediction instead of requiring the consumer to calculate the satisfaction weight..

@yancyribbens yancyribbens merged commit e4ce100 into p2pderivatives:master Mar 11, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants