Crawlers improvements. #924

Open
wants to merge 8 commits into
base: main

Andrei-Dolgolev
Contributor

@Andrei-Dolgolev Andrei-Dolgolev commented Sep 20, 2023

Immediate rows for Unverified Data

The Moonworm-crawler no longer waits for confirmations before inserting into the database. Both events and transaction calls are saved as soon as the minimum number of blocks has been collected, under the label {label_name}-unverified.

Verification Process

During every crawl iteration, Moonworm-crawler checks all events and transaction calls that carry the unverified label. Once their validity is confirmed, it updates their labels. This update:

  • Changes the label from label_name_unverified to the designated label_name.
  • Eliminates duplicate entries based on tx_hash and log_index.
  • Confirms no pre-existing records under label_name.
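The three steps above can be sketched as a single INSERT ... SELECT statement. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 in place of the crawler's real database, and the table layout, the `promote_verified` helper and the `confirmed_block` parameter are hypothetical names, not taken from this PR.

```python
import sqlite3

# Hypothetical, simplified schema; the real labels table has more columns.
SCHEMA = """
CREATE TABLE labels (
    label TEXT NOT NULL,
    tx_hash TEXT NOT NULL,
    log_index INTEGER,              -- NULL for transaction calls
    block_number INTEGER NOT NULL
)
"""


def promote_verified(conn: sqlite3.Connection, label_name: str, confirmed_block: int) -> None:
    """Copy confirmed unverified rows to label_name in one statement.

    Duplicates are collapsed on (tx_hash, log_index), and rows that already
    exist under label_name are skipped.
    """
    unverified = f"{label_name}-unverified"
    conn.execute(
        """
        INSERT INTO labels (label, tx_hash, log_index, block_number)
        SELECT ?, u.tx_hash, u.log_index, MAX(u.block_number)
        FROM labels AS u
        WHERE u.label = ?
          AND u.block_number <= ?
          AND NOT EXISTS (
              SELECT 1 FROM labels AS v
              WHERE v.label = ?
                AND v.tx_hash = u.tx_hash
                AND v.log_index IS u.log_index   -- IS also matches NULL with NULL
          )
        GROUP BY u.tx_hash, u.log_index
        """,
        (label_name, unverified, confirmed_block, label_name),
    )
```

In this sketch the unverified rows are left in place; cleaning them up is a separate step.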

Removal of Unverified Tags

Once verified, the crawler removes the label_name_unverified rows which already exist under label_name.
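A minimal sketch of this cleanup, again assuming sqlite3 as a stand-in database and a hypothetical `labels` table keyed by tx_hash and log_index (the `delete_promoted_unverified` helper is illustrative, not the PR's actual function):

```python
import sqlite3

# Hypothetical, simplified schema used only for this illustration.
SCHEMA = """
CREATE TABLE labels (
    label TEXT NOT NULL,
    tx_hash TEXT NOT NULL,
    log_index INTEGER,
    block_number INTEGER NOT NULL
)
"""


def delete_promoted_unverified(conn: sqlite3.Connection, label_name: str) -> None:
    """Delete unverified rows that already have a twin under label_name."""
    unverified = f"{label_name}-unverified"
    conn.execute(
        """
        DELETE FROM labels
        WHERE label = ?
          AND EXISTS (
              SELECT 1 FROM labels AS v
              WHERE v.label = ?
                AND v.tx_hash = labels.tx_hash
                AND v.log_index IS labels.log_index
          )
        """,
        (unverified, label_name),
    )
```

Unverified rows without a verified counterpart are untouched, so they remain candidates for the next verification pass.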

Enhanced Blocks Cache for Events

The blocks cache is now populated from the database blocks and labels tables. It is refreshed from the database once per from/to block range.
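One way to picture the per-range refresh is a single query that fills a dictionary for the whole range. The sketch below assumes the cache maps block numbers to timestamps and that both tables expose `block_number` and `block_timestamp` columns; those column names, the sqlite3 backend and the `load_blocks_cache` helper are assumptions made for illustration.

```python
import sqlite3
from typing import Dict


def load_blocks_cache(conn: sqlite3.Connection, from_block: int, to_block: int) -> Dict[int, int]:
    """Return {block_number: block_timestamp} for the range using one query.

    Rows come from both the blocks table and the labels table; UNION removes
    blocks that appear in both.
    """
    rows = conn.execute(
        """
        SELECT block_number, block_timestamp FROM blocks
        WHERE block_number BETWEEN ? AND ?
        UNION
        SELECT block_number, block_timestamp FROM labels
        WHERE block_number BETWEEN ? AND ?
        """,
        (from_block, to_block, from_block, to_block),
    ).fetchall()
    return dict(rows)
```

Events in the range can then resolve their block timestamp from the returned dictionary instead of issuing one lookup per event.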

Block Cache for Transaction Calls

No modifications were needed; however, future changes might require updates in the Moonworm library.

Event and Function Job Deduplication

Both event and function jobs are now deduplicated by abi_selector instead of abi_hash to ensure uniqueness.
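As a rough illustration of selector-based deduplication, the sketch below keeps one job per abi_selector and merges the contract addresses that share it. The entry fields (`abi_selector`, `abi`, `address`) and the `deduplicate_jobs` helper are hypothetical and do not mirror the crawler's actual job classes.

```python
from typing import Dict, Iterable, List


def deduplicate_jobs(entries: Iterable[dict]) -> List[dict]:
    """Build one job per abi_selector, collecting every address that uses it."""
    jobs: Dict[str, dict] = {}
    for entry in entries:
        selector = entry["abi_selector"]
        # The first entry seen for a selector supplies the ABI for the job.
        job = jobs.setdefault(
            selector,
            {"abi_selector": selector, "abi": entry["abi"], "addresses": set()},
        )
        job["addresses"].add(entry["address"])
    return list(jobs.values())
```

For example, two entries with selector `0xa9059cbb` on different contracts collapse into one job that watches both addresses.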

Andrey added 8 commits September 20, 2023 13:19
Added initial fix with event deduplication by selector instead of event hash.
Function calls get their selector from entry tags.
Refactor blocks cache.
1. Add new label_name_unverified label, which goes to the db without confirmations.
2. Add insert as a single SQL operation with deduplication.

Blocks cache will include labels_table.
The database state will also be loaded into the blocks cache once per from/to block range.

Add deletion of database labels with label_name_unverified which already exist in the database.
@Andrei-Dolgolev
Contributor Author

Depends on #919
