Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Lighter unique Host#domain lookup for in MiqProvisionVirtWorkflow #17403

Conversation

NickLaMuro
Copy link
Member

@NickLaMuro NickLaMuro commented May 9, 2018

Makes use of Host.all_unique_downcased_domains (new method) in the MiqProvisionVirtWorkflow#allowed_domains method to return only a list of the domain objects, pre-formatted and reduced to unique entries in SQL, reducing object allocations by reducing what is returned from the DB, the number of unneeded rows that needed to be type casted in ruby, and the temp objects generated to build the domain string in ruby.

The includes option of that method is also taken advantage of if the option[:platform] is set, though, to me it seems to not make sense since it seems like you could have multiple hosts on the same "domain", but have different platforms for each, so this might return different results when filtered on depending on the order and records in the DB (seems like a bug).

About Host.all_unique_downcased_domains

This query is meant to be a query does the work of Host#domain in the query it self, and return it as a column, along with the id column as well.

The advantage of this is it is a much faster way of collecting the domains for a given host, and doesn't require nearly as many attributes to be returned from the DB, creating object allocations we just don't use (hosts has over 30+ columns on it).

It also handles filtering uniq domains in the DB by using SELECT DISTINCT ON (domain) ... in the query, but this has a disadvantage in which it DISTINCT ON doesn't
play nicely with ActiveRecord#select (hence the gsubing).

Metrics

Benchmarks were taken by monitoring the UI requests using the manageiq_performance gem. The route monitored was selecting the "Lifecycle" -> "Publish this VM to a Template" menu button from a VM show page. The metrics shown are against database with a fairly large EMS, with about 600 hosts, and are 3 subsequent requests against that route.

Note the change in row counts:

Before

ms queries query (ms) rows
18613 2062 1463.7 70017
17695 2062 1475.5 70017
17774 2062 1578.4 70017

After

ms queries query (ms) rows
19377 2062 1401.6 66781
17789 2062 1450.4 66781
17874 2062 1615.5 66781

Note: The benchmarks for this change do NOT include the changes from other PRs (#17354 for example). Benchmarks of all changes can be found here.

Links

This query is meant to be a query does the work of `Host#domain` in the
query it self, and return it as a column, along with the `id` column as
well.

The advantage of this is it is a much faster way of collecting the
domains for a given host, and doesn't require nearly as many attributes
to be returned from the DB, creating object allocations we just don't
use (`hosts` has over 30+ columns on it).

It also handles filtering uniq domains in the DB by using

  SELECT DISTINCT ON (domain) ...

In the query, but this has a disadvantage in which it `DISTINCT ON`
doesn't play nicely with `ActiveRecord#select` (hence the `gsub`ing).
Makes use of `Host.all_unique_downcased_domains` in the
`MiqProvisionVirtWorkflow#allowed_domains` method to return only a list
of the domain objects, pre-formatted and reduced to unique entries in
SQL, reducing object allocations by reducing what is returned from the
DB, the number of unneeded rows that needed to be type casted in ruby,
and the temp objects generated to build the `domain` string in ruby.

The `includes` option of that method is also taken advantage of if the
`option[:platform]` is set, though, to me it seems to not make sense
since it seems like you could have multiple hosts on the same "domain",
but have different platforms for each, so this might return different
results when filtered on depending on the order and records in the DB
(seems like a bug).

Metrics
-------

Benchmarks were taken by monitoring the UI requests using the
`manageiq_performance` gem.  The route monitored was selecting the
`"Lifecycle" -> "Publish this VM to a Template"` menu button from a VM
show page.  The metrics shown are against database with a fairly large
EMS, with about 600 hosts, and are 3 subsequent requests against that
route.

**Before**

|    ms | queries | query (ms) |  rows |
|  ---: |    ---: |       ---: |  ---: |
| 18613 |    2062 |     1463.7 | 70017 |
| 17695 |    2062 |     1475.5 | 70017 |
| 17774 |    2062 |     1578.4 | 70017 |

**After**

|    ms | queries | query (ms) |  rows |
|  ---: |    ---: |       ---: |  ---: |
| 19377 |    2062 |     1401.6 | 66781 |
| 17789 |    2062 |     1450.4 | 66781 |
| 17874 |    2062 |     1615.5 | 66781 |
@miq-bot
Copy link
Member

miq-bot commented May 9, 2018

Checked commits NickLaMuro/manageiq@e696542~...bdc70ac with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
2 files checked, 4 offenses detected

app/models/host.rb

app/models/miq_provision_virt_workflow.rb

@NickLaMuro
Copy link
Member Author

@miq-bot assign @gmcculloug

@gmcculloug I realized you have been farming these out to some of the members of the automate team, but this comment in particular has me concerned, and I think you possibly had a hand in writing the original code. Fine if you want to pass this off, but would like your input on if my thoughts on the FIXME were correct or misguided by something I am not accounting for. Thanks.

select_rows = select("lower(regexp_replace(split_part(hostname, ',', 1), '[^\.]*\.', '')) as domain")
.select(:id).to_sql
.sub!("SELECT", "DISTINCT ON (domain)")
.sub!(/ FROM.*$/, '')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the other day while browsing the arel codebase (like you do), there is a distinct_on method I might be able to leverage here. Will look into doing that at some point.

@miq-bot
Copy link
Member

miq-bot commented Dec 3, 2018

This pull request has been automatically closed because it has not been updated for at least 6 months.

Feel free to reopen this pull request if these changes are still valid.

Thank you for all your contributions!

@miq-bot miq-bot closed this Dec 3, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants