Turn on performance_schema for apiary meta store #84

Open
RongQiao opened this issue Nov 7, 2018 · 4 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@RongQiao

RongQiao commented Nov 7, 2018

I am from the Expedia Data Engineering team. We run several AWS RDS MySQL instances as Hive metastores, and in the past we have suffered metastore performance issues when users ran commands such as 'alter table recover partitions' on large datasets. Without performance_schema, we have little insight into what is going on. The Apiary metastore has more complicated use cases, so I suggest making performance_schema available for it.

mroark1m added the enhancement and good first issue labels on Nov 8, 2018
@massdosage
Contributor

@rpoluri do you understand the request here? Is this some part of the underlying Hive metastore DB schema that for some reason we haven't activated?

@mroark1m
Contributor

It's an extra component of MySQL (it seems to be available in Aurora too: https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/) that gives you more statistics on who and what is abusing the database when you have performance problems. See https://dev.mysql.com/doc/refman/8.0/en/performance-schema.html
It's one of those things you don't need until the database is breaking, but I think it also has a nontrivial performance impact of its own.
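
For illustration, here is a minimal sketch of the kind of question performance_schema can answer, assuming a MySQL 5.6+ server with it enabled. The helper names are hypothetical and not part of Apiary. The table `performance_schema.events_statements_summary_by_digest` and the columns used are standard MySQL, and its timer columns are reported in picoseconds.

```python
# Sketch only: builds a query against MySQL's statement digest summary table.
# The function names are hypothetical; run the SQL with any MySQL client.

PICOSECONDS_PER_SECOND = 10**12


def top_statements_sql(limit=10):
    """Return SQL listing the statement digests with the highest total latency."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT SCHEMA_NAME, DIGEST_TEXT, COUNT_STAR, SUM_TIMER_WAIT "
        "FROM performance_schema.events_statements_summary_by_digest "
        "ORDER BY SUM_TIMER_WAIT DESC "
        f"LIMIT {limit}"
    )


def picoseconds_to_seconds(timer_wait):
    """Convert a performance_schema timer value (picoseconds) to seconds."""
    return timer_wait / PICOSECONDS_PER_SECOND


if __name__ == "__main__":
    print(top_statements_sql(5))
```

Running `SHOW VARIABLES LIKE 'performance_schema'` first shows whether the feature is on; if it is off, the summary tables stay empty.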

@massdosage
Contributor

Wouldn't the owners of the RDS instance be able to set that up themselves then? I.e., it doesn't need to be part of Apiary? It feels more like a MySQL/Aurora DB admin task.

@rpoluri
Contributor

rpoluri commented Feb 14, 2020

We manage RDS as part of the Apiary Data Lake.
Maybe this is the corresponding option in RDS: https://www.terraform.io/docs/providers/aws/r/rds_cluster_instance.html#performance_insights_enabled
I will check and close the issue accordingly.
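
As a hedged aside: on RDS, the `performance_schema` setting itself lives in a DB parameter group, and because it is a static parameter a change only takes effect after a reboot. The sketch below builds the arguments for boto3's `modify_db_parameter_group` call. The helper name and the parameter group name are hypothetical, and whether Apiary should manage this through Terraform instead is left open.

```python
# Sketch only: prepares the request for enabling performance_schema on RDS.
# The helper and the group name below are assumptions, not Apiary code.


def performance_schema_parameter_request(parameter_group_name):
    """Build keyword arguments for the RDS modify_db_parameter_group call."""
    if not parameter_group_name:
        raise ValueError("parameter_group_name must be non-empty")
    return {
        "DBParameterGroupName": parameter_group_name,
        "Parameters": [
            {
                "ParameterName": "performance_schema",
                "ParameterValue": "1",
                # Static parameter: applied at the next instance reboot.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


if __name__ == "__main__":
    request = performance_schema_parameter_request("apiary-metastore-params")
    print(request)
    # With AWS credentials configured, the request could be sent like this:
    #   import boto3
    #   boto3.client("rds").modify_db_parameter_group(**request)
```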

4 participants