-
Notifications
You must be signed in to change notification settings - Fork 898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Lighter unique Host#domain lookup for in MiqProvisionVirtWorkflow #17403
[WIP] Lighter unique Host#domain lookup for in MiqProvisionVirtWorkflow #17403
Conversation
This query is meant to be a query does the work of `Host#domain` in the query it self, and return it as a column, along with the `id` column as well. The advantage of this is it is a much faster way of collecting the domains for a given host, and doesn't require nearly as many attributes to be returned from the DB, creating object allocations we just don't use (`hosts` has over 30+ columns on it). It also handles filtering uniq domains in the DB by using SELECT DISTINCT ON (domain) ... In the query, but this has a disadvantage in which it `DISTINCT ON` doesn't play nicely with `ActiveRecord#select` (hence the `gsub`ing).
Makes use of `Host.all_unique_downcased_domains` in the `MiqProvisionVirtWorkflow#allowed_domains` method to return only a list of the domain objects, pre-formatted and reduced to unique entries in SQL, reducing object allocations by reducing what is returned from the DB, the number of unneeded rows that needed to be type casted in ruby, and the temp objects generated to build the `domain` string in ruby. The `includes` option of that method is also taken advantage of if the `option[:platform]` is set, though, to me it seems to not make sense since it seems like you could have multiple hosts on the same "domain", but have different platforms for each, so this might return different results when filtered on depending on the order and records in the DB (seems like a bug). Metrics ------- Benchmarks were taken by monitoring the UI requests using the `manageiq_performance` gem. The route monitored was selecting the `"Lifecycle" -> "Publish this VM to a Template"` menu button from a VM show page. The metrics shown are against database with a fairly large EMS, with about 600 hosts, and are 3 subsequent requests against that route. **Before** | ms | queries | query (ms) | rows | | ---: | ---: | ---: | ---: | | 18613 | 2062 | 1463.7 | 70017 | | 17695 | 2062 | 1475.5 | 70017 | | 17774 | 2062 | 1578.4 | 70017 | **After** | ms | queries | query (ms) | rows | | ---: | ---: | ---: | ---: | | 19377 | 2062 | 1401.6 | 66781 | | 17789 | 2062 | 1450.4 | 66781 | | 17874 | 2062 | 1615.5 | 66781 |
Checked commits NickLaMuro/manageiq@e696542~...bdc70ac with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0 app/models/host.rb
app/models/miq_provision_virt_workflow.rb
|
@miq-bot assign @gmcculloug @gmcculloug I realized you have been farming these out to some of the members of the automate team, but this comment in particular has me concerned, and I think you possibly had a hand in writing the original code. Fine if you want to pass this off, but would like your input on if my thoughts on the |
select_rows = select("lower(regexp_replace(split_part(hostname, ',', 1), '[^\.]*\.', '')) as domain") | ||
.select(:id).to_sql | ||
.sub!("SELECT", "DISTINCT ON (domain)") | ||
.sub!(/ FROM.*$/, '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the other day while browsing the arel
codebase (like you do), there is a distinct_on
method I might be able to leverage here. Will look into doing that at some point.
This pull request has been automatically closed because it has not been updated for at least 6 months. Feel free to reopen this pull request if these changes are still valid. Thank you for all your contributions! |
Makes use of
Host.all_unique_downcased_domains
(new method) in theMiqProvisionVirtWorkflow#allowed_domains
method to return only a list of the domain objects, pre-formatted and reduced to unique entries in SQL, reducing object allocations by reducing what is returned from the DB, the number of unneeded rows that needed to be type casted in ruby, and the temp objects generated to build thedomain
string in ruby.The
includes
option of that method is also taken advantage of if theoption[:platform]
is set, though, to me it seems to not make sense since it seems like you could have multiple hosts on the same "domain", but have different platforms for each, so this might return different results when filtered on depending on the order and records in the DB (seems like a bug).About
Host.all_unique_downcased_domains
This query is meant to be a query does the work of
Host#domain
in the query it self, and return it as a column, along with theid
column as well.The advantage of this is it is a much faster way of collecting the domains for a given host, and doesn't require nearly as many attributes to be returned from the DB, creating object allocations we just don't use (
hosts
has over 30+ columns on it).It also handles filtering uniq domains in the DB by using
SELECT DISTINCT ON (domain) ...
in the query, but this has a disadvantage in which itDISTINCT ON
doesn'tplay nicely with
ActiveRecord#select
(hence thegsub
ing).Metrics
Benchmarks were taken by monitoring the UI requests using the
manageiq_performance
gem. The route monitored was selecting the"Lifecycle" -> "Publish this VM to a Template"
menu button from a VM show page. The metrics shown are against database with a fairly large EMS, with about 600 hosts, and are 3 subsequent requests against that route.Note the change in
row
counts:Before
After
Note: The benchmarks for this change do NOT include the changes from other PRs (#17354 for example). Benchmarks of all changes can be found here.
Links