Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: persist ExpectedUtxo to disk (db) #175

Merged
merged 3 commits into from
Aug 22, 2024
Merged

Conversation

dan-da
Copy link
Collaborator

@dan-da dan-da commented Aug 19, 2024

closes #172, #137.

fixes loss-of-funds scenario.

The primary change here is to move storage of expected_utxos from UtxoNotificationPool into RustyWalletDatabase. Thus expected_utxos now exists as a DbtVec in the same wallet database as monitored_utxos. UtxoNotificationPool is removed entirely. Also pruning of stale expected utxos has been temporarily disabled.

Commentary:

  1. storing expected_utxos in the same DB as monitored_utxos causes DB writes to be atomic across both data sets which is desirable. Also, there is now a nice symmetry for both these utxo data sets.
  2. An alternate design would be to use a separate levelDB file for expected_utxos. This would enable deletion of individual entries, but we would lose transactional guarantee with other items in the wallet db.
  3. UtxoNotificationPool had code related to PeerNotification that was not actually being used. I believe this may have been implemented to support planned future functionality, but I think it best to remove it now, until such time as something like that is needed.
  4. In my judgement, it is dangerous to ever expire expected utxos as this could result in a loss of funds, for example during a deep re-org. I think a thorough analysis should be performed to determine rules for when it is safe, if ever, to expire/prune expected utxos. So until that is done, I have disabled the call to prune, although a method still exists. See my code comments for WalletState::prune_stale_expected_utxos(). Pruning is essentially an optimization, so I think it is safest to launch without it and optimize if/when the time is right.
  5. The prune method itself had to be modified to first clear all entries, then add back non-stale entries. This is because DbtVec does not support removing individual entries, except for the last.
  6. In my view, the code is simpler and cleaner now with UtxoNotificationPool gone. If this change is controversial, let me know.

changes:

  • add RustyWalletDatabase::expected_utxos (DbtVec)
  • remove UtxoNotificationPool
  • remove UtxoNotifier::PeerUnsigned
  • add WalletState methods: add_expected_utxo(), scan_for_expected_utxos()
  • disables timer call to prune_stale_expected_utxos() for now.
  • remove cli args: max_utxo_notification_size, max_unconfirmed_utxo_notification_count_per_peer
  • mod utxo_notification_pool --> expected_utxo
  • derive Hash for Timestamp

tests:

  • adapt existing tests to changes
  • move tests from utxo_notification_pool into wallet_state
  • adds regression tests for issue ExpectedUtxo not persisted, can lead to loss of funds. #172 that verify:
    1. expected_utxo are persisted if db is written to disk.
    2. expected_utxo are not persisted if db is not written to disk.

@dan-da dan-da force-pushed the 172_persist_expected_utxo_pr branch from 4abd0b1 to a6eb73b Compare August 19, 2024 02:58
closes #172, #137.

fixes loss-of-funds scenario.

changes:
 * add RustyWalletDatabase::expected_utxos (DbtVec)
 * remove UtxoNotificationPool
 * remove UtxoNotifier::PeerUnsigned
 * add WalletState methods:
    add_expected_utxo(), scan_for_expected_utxos()
 * disables timer call to prune_stale_expected_utxos() for now.
 + remove cli args:
    max_utxo_notification_size,
    max_unconfirmed_utxo_notification_count_per_peer
 * mod utxo_notification_pool --> expected_utxo
 * derive Hash for Timestamp

tests:
 * adapt existing tests to changes
 * move tests from utxo_notification_pool into wallet_state
 * adds regression tests for issue #172 that verify:
    1.  expected_utxo are persisted if db is written to disk.
    2.  expected_utxo are not persisted if db is not written to disk.
Copy link
Contributor

@aszepieniec aszepieniec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Minor, mostly superficial suggestions.

src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
src/models/state/wallet/expected_utxo.rs Outdated Show resolved Hide resolved
* addresses suggestions in PR #175.
* fixes `cargo doc` warnings
@dan-da
Copy link
Collaborator Author

dan-da commented Aug 20, 2024

ok, I addressed all the suggestions in 3c049fb. thx @aszepieniec for the review!

Copy link
Member

@Sword-Smith Sword-Smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a simplification to use the database for storing expected UTXOs instead of using a mempool-like data structure that I originally implemented this as. With the mempool-like structure you'd have to have two different sets of logic for persisting and handling the UTXO notification pool in RAM.

Some functionality that's lost, though, is that the UtxoNotifier can no longer be a peer. And, when the UtxoNotifier can be a peer, it makes sense to have a "credibility" value attached to ExpectedUtxo. And then it makes sense to have some concept of ordering of the UTXO notifications. Or, at least to have some kind of DOS protection to prevent a peer from using all your available disk and RAM for UTXO notifications.

The transaction notifications that you can put on-chain through public announcements both cost transaction fees and they leak information. If we are to have a way of transacting that avoids these two downsides, I'd say it makes sense to allow the protocol to share UTXO notifications somehow, a notification being the (Utxo, Digest) tuple, representing the UTXO and the sender randomness. We'd certainly want to encrypt this information if 3rd parties see it.

You don't have two answer this latter question now, about how to share UTXO notifications through neptune-core without those notifications showing up in the block. But the ExpectedUtxo data structure/the wallet state should somehow support this use case.

@dan-da
Copy link
Collaborator Author

dan-da commented Aug 20, 2024

I will try to illustrate my perspective / analysis:

The concept of sharing ExpectedUtxo paid to 3rd parties "off-chain" but via the p2p protocol is an interesting and novel solution worth exploring. It has the nice property that no external transfer method is needed.

At the same time, it comes with some heavy responsibilities and risks vs local-state only ExpectedUtxos (paid to self). In particular there is a need to avoid/mitigate DOS attacks and a need to limit the max size of the pool which requires expiring entries. Unfortunately expiring entries also has a potential for losing funds... if all nodes expire a given ExpectedUtxo before it is claimed (converted to a monitored utxo) then the recipient can never claim it. This seems to me a serious issue.

External off-chain transfer methods (out-of-band) also risk funds loss. A possible distinction though is that neptune-core cannot really be blamed when that occurs. It seems neptune-core would carry more responsibility for ExpectedUtxo that it propagates between parties itself via p2p.

This may be a solveable problem. I believe though it will require some careful analysis of loss-of-funds scenarios and the guarantees we make (or don't make) to senders and recipients. The existing prune_stale_expected_utxos() method seems too simplistic and dangerous to me, which is why this PR disables it from being called by a timer. The time based expiration could result in a loss of funds simply by a recipient's node being offline for a month or clock set incorrectly. Deep re-org is another possibility.

These are things that should be considered if/when we go to actually implement the p2p based sharing.

It also seems possible to put the local-state (pay-to-self) ExpectedUtxo into a different storage bucket than (pay-to-other) ExpectedUtxo transferred via p2p. I really don't see any need to ever expire these local-state ExpectedUtxo; it is safest and most conservative not to do so.

So then, my thought is that the storage impl in this PR is simple and adequate for the local-state pay-to-self ExpectedUtxo. At such time in the future that we decide to implement the p2p sharing pool, it could have its own storage for pay-to-other ExpectedUtxo that are (typically) received from peers. That storage would likely need to be in a separate DB if we are still using levelDB and need to support removal of entries. This separation would also create a kind of firewall that prevents local-state ExpectedUtxo from ever being expired.

But the ExpectedUtxo data structure/the wallet state should somehow support this use case.

This PR does not modify the ExpectedUtxo struct itself. It does remove the PeerNotify variant from UtxoNotifier. However that variant could be added back without requiring any change/migration of the wallet database structure.

Bigger picture though: as laid out above, I think it would be cleaner to put the p2p shared ExpectedUtxo in their own storage area/db.

Also, the fundamental reason that the wallet-db cannot support dropping entries is because of our usage of the DbtVec abstraction to create virtual tables in a levelDB database. So I think that is the biggest limiting factor with regards to the impl in this PR in terms of being used for a sharing pool.

aside: If it were my decision alone, I would attempt/prototype switching to redb before mainnet, as it natively supports transactional table namespaces and we could keep lots of things in the same database without need of clever hacks like DbtSchema, DbtVec. But I understand that is off the table for now.

** Summing up **

As I see it, our practical options are basically:

  1. merge this PR now. possibly build a p2p sharing pool later with entries stored separately.

  2. scrap this PR. make a new PR that keeps UtxoNotificationPool but creates a new database file for storing only ExpectedUtxo entries.

I prefer (1) but am willing to do either.

@Sword-Smith @aszepieniec please let me know your thoughts.

@aszepieniec
Copy link
Contributor

@dan-da You mentioned during the standup call on Monday that your next task would be to implement support for off-band UTXO notifications. Could you elaborate on how you expect that to work and how it is supposed to integrate with the changes introduced by this PR (assuming it is merged)? Maybe you and @Sword-Smith aren't that far apart when it comes to concrete next steps.

I'm fine with a CLI-only interface for off-band UTXO notifications as an immediate target. We probably do want to support an automatic peer-to-peer overlay protocol at some point in the future, but that needs to come after an elaborate design process with DOS-resistance built in. That's where I stand now, but that might change based on arguments.

@dan-da
Copy link
Collaborator Author

dan-da commented Aug 21, 2024

You mentioned during the standup call on Monday that your next task would be to implement support for off-band UTXO notifications. Could you elaborate on how you expect that to work and how it is supposed to integrate with the changes introduced by this PR (assuming it is merged)? Maybe you and @Sword-Smith aren't that far apart when it comes to concrete next steps.

Sure. I view out-of-band notifications as being orthogonal / complementary to this PR or to a future p2p shared utxo pool.

name send to on/off chain middle-man
this PR self off-chain neptune-core
p2p pool others off-chain neptune-core
out-of-band others off-chain email, sms, signal, etc

Basically the out-of-band process would look something like:

  1. add UtxoNotification::OffChainSerialized(UtxoTransfer). Where UtxoTransfer is a placeholder name for a serializable struct with same fields that are in the PublicAnnouncement BFieldElements.

  2. modify send_to_many() or add new RPCs such that caller can specify TxOutputList directly, which includes UtxoNotification for each output. Caller can use a helper method on GlobalState to generate these from [(ReceivingAddress, amount)] and map from ReceivingAddress to each UtxoTransfer.

  3. sender then provides the serialized UtxoTransfer(s) to recipient(s) off-chain and out-of-band. (not neptune-core's concern how this occurs)

  4. recipient imports/claims utxos via a new claim_utxos() RPC. The UtxoTransfer is stored permanently in recipient's local state in case blockchain re-scan is ever needed.

  5. ideally there would be a way (RPC call) for sender to regenerate the data in case it is lost in transit somehow. I haven't really given that any thought yet.

@Sword-Smith
Copy link
Member

Sword-Smith commented Aug 21, 2024

For this PR, I see a danger of some unfortunate scaling behavior, specifically in the WalletState::scan_for_expected_utxos method

    pub async fn scan_for_expected_utxos<'a>(
        &'a self,
        transaction: &'a Transaction,
    ) -> impl Iterator<Item = AnnouncedUtxo> + 'a {
        let expected_utxos = self.wallet_db.expected_utxos().get_all().await;

        transaction.kernel.outputs.iter().filter_map(move |a| {
            expected_utxos
                .iter()
                .find(|eu| eu.addition_record == *a)
                .map(|eu| eu.into())
        })
    }

Here, the expected_utxos variable grows indefinitely with the number of blocks that have been mined locally. And the complexity of this function scales as $N \cdot M$, where $N$ is the number of expected UTXOs in the wallet state and $M$ the number of outputs in the block. Having a method that only fetches the not-yet-confirmed expected UTXOs would fix this issue.

@dan-da
Copy link
Collaborator Author

dan-da commented Aug 21, 2024

Here, the expected_utxos variable grows indefinitely with the number of blocks that have been mined locally.

good catch.

yes this is yet another limitation of using DbtVec. We lose the ability to lookup by the AdditionRecord hash, which is what the code in master does.

Having a method that only fetches the not-yet-confirmed expected UTXOs would fix this issue.

In this PR, all ExpectedUtxo are stored in a single DbtVec, as opposed to having separate DbtVec for confirmed vs not-confirmed. I see a few possible ways to be more efficient.

  1. generate an index on the fly. At the start of scan_for_expected_utxos() loop through all ExpectedUtxo and add them to a HashMap indexed by AdditionRecord. This is o(n) with number of expected_utxos to generate the index plus o(m) with number of tx outputs.

  2. add a second DbtVec to the database that contains indexes of the not-yet-confirmed ExpectedUtxo. Then only fetch ExpectedUtxo at these indices. This is o(n*m) with the number of not-confirmed utxos * number of tx outputs.

  3. use a separate leveldb file dedicated to ExpectedUtxo. Then we simply store the ExpectedUtxo with the AdditionRecord hash as the key. This is o(m) with the number of tx outputs.

  4. use a database such as redb that supports table-like namespaces so that we can use AdditionRecord hash as the key but still put them in the wallet DB. This is also o(m) with number of tx outputs.

long term, my preference is (4).

With leveldb:
(1) is easy/fast to impl, but still grows o(n)+o(m). probably "good enough" for mainnet. unless/until wallets have tens of thousands of expected_utxos.
(2) is a bit gross as it requires a 2nd dbtvec in the DB that must be maintained. It is still o(n*m), but the size of n is much reduced.
(3) is the most efficient at o(m), but we lose atomicity with the rest of the wallet db. I haven't thought deeply if this is any problem or not. But if nothing else, it is a possible complication when restoring wallet backups.

I would suggest (1) for now, and consider (4) later.

scan_for_expected_utxos() was o(n*m) where:
 n = number of ExpectedUtxo in database. (all-time)
 m = number of transaction outputs.

this makes it o(n) + o(m) instead.

see: #175 (comment)
@dan-da
Copy link
Collaborator Author

dan-da commented Aug 21, 2024

ok, I impl'd (1) in 3776c78. @Sword-Smith what do you think?

@Sword-Smith
Copy link
Member

Sword-Smith commented Aug 22, 2024

ok, I impl'd (1) in 3776c78. @Sword-Smith what do you think?

Looks good. This illustrates, though, that it's important that we keep logging the time to append a new block to the state, such that we can catch such problems in the future.

@dan-da dan-da merged commit 90aaa94 into master Aug 22, 2024
3 checks passed
dan-da added a commit that referenced this pull request Aug 22, 2024
* addresses suggestions in PR #175.
* fixes `cargo doc` warnings
@Sword-Smith Sword-Smith deleted the 172_persist_expected_utxo_pr branch August 27, 2024 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ExpectedUtxo not persisted, can lead to loss of funds.
3 participants