TRSS query efficiency #839

Closed
llxia opened this issue Feb 6, 2024 · 6 comments

Comments

@llxia
Contributor

llxia commented Feb 6, 2024

As we monitor more and more test builds, we need to look into TRSS query efficiency. I have seen cases where TRSS uses between 100% and 600% CPU when loading the page.

[screenshot: CPU usage while loading the TRSS page]

Also, depending on the number of builds that are monitored, loading the main page can take a long time.

@llxia
Contributor Author

llxia commented Feb 23, 2024

A couple of thoughts:

  • Lazy load. Load the page as the user scrolls; for example, https://github.com/kingRayhan/reactjs-visibility could be used.
  • Combine queries. Since Chrome allows a maximum of 6 parallel connections per domain, we should try to combine queries, for example by collecting the requests into an array and resolving them together with Promise.all() (see the sketch after this list).
  • Tabs. Tabs can be used as a way to reduce the number of queries issued by the home page.
  • Index the DB. Add database indexes to get better server response times.
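
A minimal sketch of the combine-queries idea. The endpoint paths, parameter names, and helper functions here are assumptions for illustration, not TRSS's actual API: the point is to batch the IDs into one request where possible, or at least await the individual fetches as a group with Promise.all() instead of issuing them serially.

```javascript
// Hypothetical sketch: endpoint paths and parameter names are assumptions,
// not TRSS's actual API.

// One request per monitored build list quickly exhausts Chrome's 6 parallel
// connections per domain, so batch the IDs into a single query instead.
async function fetchBuildSummaries(buildListIds) {
    const params = new URLSearchParams({ ids: buildListIds.join(',') });
    const response = await fetch(`/api/getBuildSummaries?${params}`);
    if (!response.ok) throw new Error(`Request failed: ${response.status}`);
    return response.json();
}

// If the server cannot batch yet, at least fire the requests together and
// wait for them as a group rather than one after another.
async function fetchIndividually(buildListIds) {
    return Promise.all(
        buildListIds.map(id =>
            fetch(`/api/getBuildSummary?id=${id}`).then(r => r.json())
        )
    );
}
```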

@sxa
Member

sxa commented Apr 5, 2024

Noting that I have mitigated this on the Adoptium TRSS server by rate-limiting requests on the nginx front end, but that should be considered a temporary workaround for the underlying issues with TRSS.

A change in architecture to use a single query would definitely be preferable if possible, or at least combining them somehow so as not to overload the database.
Ref: adoptium/infrastructure#3354
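
For reference, a minimal sketch of the kind of nginx front-end rate limit described above; the zone name, rate, burst size, and upstream address are assumptions, not the actual Adoptium configuration.

```nginx
# Hypothetical sketch only; zone name, rate, and upstream are assumptions,
# not the actual Adoptium TRSS configuration.
limit_req_zone $binary_remote_addr zone=trss_api:10m rate=10r/s;

server {
    listen 80;

    location /api/ {
        # Allow short bursts, then reject clients that exceed the rate.
        limit_req zone=trss_api burst=20 nodelay;
        proxy_pass http://127.0.0.1:3000;
    }
}
```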

@llxia
Contributor Author

llxia commented Apr 5, 2024

This is not a database overload issue. All changes have been delivered, and performance has improved by approximately 35x. This issue will be closed.

Rate-limiting requests on nginx is not a way to fix a performance issue.
Rate limiting restricts the number of requests a client can make to the server within a specified time period. This is good for mitigating issues such as brute-force attacks, but it can also block legitimate users or API calls if the limit is set too low.
It requires careful tuning and monitoring to ensure that legitimate traffic is not inadvertently blocked.
If you have a specific problem, please open a new issue.

@llxia llxia closed this as completed Apr 5, 2024
@sxa
Member

sxa commented Apr 5, 2024

This is not a database overload issue. All changes are delivered

Does that mean the problem that you've screenshotted in the original description has been resolved and we just need to get the update onto the Adoptium TRSS instance?

Rate-limiting requests on nginx is not a way to fix performance issue.

I completely agree but I wasn't aware that anyone had been working on the issue - I'd be delighted if the performance issue has been fixed and I can remove the limit again :-)

@smlambert
Contributor

Perhaps I failed to describe clearly enough in a recent scrum or on Slack that my intention/priority is to update the synch job (#856) so I can pull in the 3 recent perf improvements that Lan committed into aqa-test-tools.

I am working on it now, but it took longer than expected due to the recent removal of local Docker tools and my wanting to test locally. I've finally resolved that barrier and will hopefully be able to test my updates shortly.

Noting we had 2 different issues:

  1. TRSS perf
  2. MongoDB container bloat

Lan has vastly improved 1) TRSS perf, but we have not pulled the changes into our prod server yet.
For 2), I am not certain I understand the bloat, but I believe that regularly running the synch job will help, and adding a cleanup step to the synch job if needed is certainly possible.
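
A hypothetical sketch of what such a cleanup step could look like; the collection name, the timestamp field, and the 180-day retention window are all assumptions rather than TRSS's actual schema or policy.

```javascript
// Hypothetical cleanup sketch; "testResults", the "timestamp" field, and the
// 180-day window are assumptions, not TRSS's actual schema or retention policy.
const { MongoClient } = require('mongodb');

async function pruneOldDocuments(uri, dbName) {
    const client = new MongoClient(uri);
    try {
        await client.connect();
        const cutoff = Date.now() - 180 * 24 * 60 * 60 * 1000;
        // Delete documents older than the retention window to keep the
        // MongoDB container's data volume from growing without bound.
        const result = await client
            .db(dbName)
            .collection('testResults')
            .deleteMany({ timestamp: { $lt: cutoff } });
        console.log(`Pruned ${result.deletedCount} stale documents`);
    } finally {
        await client.close();
    }
}
```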

@sxa
Member

sxa commented Apr 5, 2024

  1. TRSS perf, but we have not pulled the changes in to our prod server yet.

Thanks - I knew you were working on getting the sync job working again, but I wasn't aware until now that it was because some of the underlying issues we'd been seeing here - which had been mitigated temporarily with the nginx "hack" - had been resolved. That's great to hear, thanks Lan!

I think for (2) we still need to understand what can be done to reduce the output (although that's separate from this issue). It would be good to know whether other TRSS instances are seeing this with a default configuration, to indicate if it's something we've done. A cleanup on sync might be adequate, but it is more of a sticking plaster (similar to what I did with nginx!).
