Git merge support can lead to long index times #3367

Closed
cesar-d2l opened this issue Nov 11, 2020 · 5 comments

@cesar-d2l

Describe the bug
OpenGrok version: 1.5.7
OpenJDK version: Amazon Corretto 11
OS: Amazon Linux 2
ctags version: p5.9.20201108.0
Tomcat version: 9.0.39

We are updating our OpenGrok 1.3.16 instance to the latest (1.5.7 as of this writing) but have noticed that the initial time to index jumped from roughly a half-hour to 4.5 hours.

It spends a lot of time in this command: git log --abbrev-commit --abbrev=8 --name-only --pretty=fuller --date=iso8601-strict -m -- path/to/repo

Reading up on the updates, we noticed that OpenGrok now supports octopus merges, as of the fix in #3166. We think this is causing problems with our repository's branching and merging strategy. This repository doesn't use linear history (via either squashing or rebasing), and most people do not rebase before merging, so there are a lot of merge commits, both into master and from master into branches. (To make matters more complicated, this repository was created from dozens of smaller repos in a huge history rewrite plus merge commit.)

Once the initial indexing is done, viewing the history of a file is now a lot more verbose. When I viewed the history of one file, it went from showing 2 commits to being 12 pages long.
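
For anyone who wants to check whether `-m` is the cause on their own repository, here is a rough Python sketch that runs the command above with and without `-m` and compares the elapsed time and the number of commit entries. The function names (`build_log_cmd`, `count_commit_entries`, `compare`) are made up for this illustration and are not part of OpenGrok. As far as I understand, `-m` makes git log show a merge commit's changes against each parent separately, so one merge can appear as several entries.

```python
import subprocess
import time

# Options copied from the git log command in this report, minus -m and the path.
BASE_ARGS = [
    "git", "log", "--abbrev-commit", "--abbrev=8", "--name-only",
    "--pretty=fuller", "--date=iso8601-strict",
]


def build_log_cmd(path, with_merges):
    """Return the git log argv, adding -m only when with_merges is true."""
    cmd = list(BASE_ARGS)
    if with_merges:
        cmd.append("-m")
    return cmd + ["--", path]


def count_commit_entries(log_output):
    """Count 'commit <hash>' header lines in git log output.

    Commit messages are indented by git, so they do not match. A file whose
    name starts with 'commit ' would be miscounted; that case is ignored here.
    """
    return sum(1 for line in log_output.splitlines()
               if line.startswith("commit "))


def compare(repo_dir, path="."):
    """Print entry count and wall-clock time with and without -m."""
    for with_merges in (False, True):
        start = time.monotonic()
        result = subprocess.run(
            build_log_cmd(path, with_merges),
            cwd=repo_dir, capture_output=True, text=True,
            errors="replace", check=True,
        )
        elapsed = time.monotonic() - start
        entries = count_commit_entries(result.stdout)
        label = "with -m" if with_merges else "without -m"
        print(f"{label}: {entries} entries in {elapsed:.1f}s")
```

Calling `compare("/path/to/checkout")` on a local clone prints one line per variant. A large gap between the two entry counts would point at merge commits as the source of both the extra indexing time and the longer per-file history.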

To Reproduce
The repository is private, so I can't offer an exact way to reproduce. But it's a fairly large repo (>300,000 commits) with lots of merge commits.

Expected behavior
Two things:

  • Initial indexing should not take a long time when there are a lot of merge commits
  • The history should be an accurate reflection of what changed in a file
@vladak
Member

vladak commented Nov 12, 2020

Also, you might be suffering from the increased memory consumption, see #3243.

@vladak vladak added the indexer label Nov 12, 2020
@cesar-d2l
Author

I tried using a larger EC2 instance to double the memory, but I didn't see any noticeable improvement. It's also possible that memory requirements are far higher with that change.

I'm honestly not sure we can fix this without resorting to either a) breaking octopus merge commits or b) putting this behind a flag. Neither of those seems like a great option.

It may also not be impacting very many people 😄

@cesar-d2l
Author

We updated our OpenGrok instance to 1.6.5 today and I can confirm the issue is resolved. Thanks for your help, Vladimir!

@vladak
Member

vladak commented Mar 26, 2021

I assume you disabled merge commits via the tunable.

@cesar-d2l
Author

Yeah, I did. I added it to our read-only configuration.

vladak pushed a commit that referenced this issue Apr 9, 2021