Research: Check range type in Postgres and see if it could improve the new validator projection #678

Closed
ysong42 opened this issue Jan 28, 2022 · 4 comments

ysong42 commented Jan 28, 2022

Task Details

This ticket is a follow-up to #669.

The new validator projection performs fine against the Crypto.org testnet, but its performance against Crypto.org mainnet is terrible.

We want to optimize the performance. So in this ticket I will do the following:

  • Study the range type and apply it to the Delegations view.
  • Test it to see if it could significantly improve the projection performance.

Definition of Done

  • Record the finding under this ticket.
  • If the result is positive, then apply the same changes to other views.

ysong42 commented Feb 4, 2022

I implemented the range type for the Delegations view, and the sync speed seems much better.

  • Testnet, I ran it for around 13.5 hours, and it synced up around 1,686,000 blocks in total.
  • Mainnet, I ran it for around 2 hours, and it synced up around 150,000 blocks in total.
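
For comparison, the two runs above work out to roughly the following average rates. This is only a quick sketch: the function name is illustrative, and the durations are the approximate ones quoted above.

```python
def blocks_per_second(blocks: int, hours: float) -> float:
    """Average sync rate over a whole run."""
    return blocks / (hours * 3600)

# Figures taken from the two bullet points above (both approximate).
testnet_rate = blocks_per_second(1_686_000, 13.5)
mainnet_rate = blocks_per_second(150_000, 2)

print(f"testnet: {testnet_rate:.1f} blocks/s")
print(f"mainnet: {mainnet_rate:.1f} blocks/s")
```

These are averages over the whole run, so they say nothing about whether the rate changes as the tables grow.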


ysong42 commented Feb 7, 2022

After implementing the range type for the Delegations, Validators, UnbondingDelegations and Redelegations views, the sync speed improved a lot.

Testnet

On Testnet, all blocks (2,836,000 in total) could be synced in less than 18 hours.

Mainnet

However, on Mainnet the performance seems to get worse as more data is written into the DB.

I ran it for around 13.5 hours and it synced 588,175 blocks (out of around 4,400,000 in total).

I did a speed test at around height 588,000: the indexer was syncing at around 6.13 blocks/second. If it kept that speed, it would take 7.12 days to finish syncing.
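
The estimate can be reproduced roughly as below. This is a sketch using the rounded total of 4,400,000 blocks, so it lands near 7.2 days rather than exactly 7.12; the quoted figure presumably used the exact chain height at the time. The helper name is illustrative.

```python
SECONDS_PER_DAY = 86_400

def days_remaining(total_blocks: int, synced_blocks: int, rate: float) -> float:
    """Days needed to sync the remaining blocks at a constant rate (blocks/s)."""
    return (total_blocks - synced_blocks) / rate / SECONDS_PER_DAY

# Rounded total from the comment above; the exact height is an unknown here.
eta_days = days_remaining(4_400_000, 588_175, 6.13)

# Average rate over the whole 13.5-hour run, to compare with the 6.13 measured at the end.
average_rate = 588_175 / (13.5 * 3600)

print(f"estimated time left: {eta_days:.2f} days")
print(f"average rate so far: {average_rate:.1f} blocks/s")
```

The average over the run comes out at about twice the rate measured at height 588,000, which fits a sync that slows down as it goes, so a constant-rate estimate is likely optimistic.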

But I checked the DB and noticed a pattern: as the projection accumulates more data, Msg handling takes more time.

[four screenshots]


ysong42 commented Feb 8, 2022

Found the evil SQL.

SQL-1

SELECT COUNT(*)
	FROM public.view_vd_validators
	WHERE operator_address = 'crocncl1u5ryf5jwc2jhd9xyvmasfqzacxp03v8dcj8xry' AND lower(height) = 719959;

SQL-2

SELECT COUNT(*)
	FROM public.view_vd_validators
	WHERE operator_address = 'crocncl1u5ryf5jwc2jhd9xyvmasfqzacxp03v8dcj8xry' AND height = '[719959,)';

SQL-2 is more than 100 times faster than SQL-1.

This SQL is used when updating validators, which is a common operation on mainnet.
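
One plausible explanation, which is an assumption since the table's indexes are not shown here: SQL-1 wraps the column in lower(), so an ordinary index on height cannot serve that predicate and each candidate row has to be evaluated, while SQL-2 compares the stored range value directly, which an index on the column can look up. The Python sketch below is only an analogy for that difference, not how Postgres works internally; the rows, addresses and names are made up.

```python
from collections import defaultdict
from typing import NamedTuple, Optional

class HeightRange(NamedTuple):
    lower: int
    upper: Optional[int]  # None stands for an unbounded upper end, as in '[719959,)'

# Hypothetical rows: (operator_address, height range).
rows = [
    ("validator-a", HeightRange(100, 719959)),
    ("validator-a", HeightRange(719959, None)),
    ("validator-b", HeightRange(719959, None)),
]

# Stand-in for an index on the stored values (operator_address, height).
index = defaultdict(int)
for address, height in rows:
    index[(address, height)] += 1

def count_by_lower(address: str, lower: int) -> int:
    """SQL-1 style: derive lower(height) from every row, then compare."""
    return sum(1 for a, h in rows if a == address and h.lower == lower)

def count_by_range(address: str, height: HeightRange) -> int:
    """SQL-2 style: look the stored range value up directly."""
    return index.get((address, height), 0)

print(count_by_lower("validator-a", 719959))
print(count_by_range("validator-a", HeightRange(719959, None)))
```

Note that the two forms only agree when the row starting at that height is still open-ended: SQL-1 would also match a closed range that starts at 719959, while SQL-2 matches only '[719959,)'.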


ysong42 commented Feb 11, 2022

Final Outcome

Testnet

Synced up to the latest height, then ran the validation script to do the cross-check.

Performance: synced up to the latest height within 2 days.

Correctness: Picked 1000 heights (from 1 to latest). All validator, delegation, unbonding delegation and redelegation records matched the Cosmos RPC records.

Mainnet

Synced up to height 4232032. The server stopped there because we need to disable MsgExec to let the parser continue to work.

Performance: synced up to height 4232032 in around 55 hours.

Correctness: Picked 1000 heights (from 1 to 4232032). All validator records matched the Cosmos RPC records.

Conclusion

Now I am pretty confident in the projection's performance and correctness. The following tasks remain to be handled:
