Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] PoC for a Node.JS backend using nodegit #259

Draft
wants to merge 36 commits into
base: master
Choose a base branch
from

Conversation

julien-c
Copy link

@julien-c julien-c commented Oct 8, 2020

No description provided.

node/app/Context.ts Outdated Show resolved Hide resolved
node/app/Repo.ts Outdated Show resolved Hide resolved
node/README.md Show resolved Hide resolved
if (context.path === undefined) {
commits = allCommits;
} else {
/// CAUTION(this can be pretty long)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple way to speed this up is by not fetching the entire diff, but comparing the parents' trees to the commits' trees (without actually looking at the file contents). Looking at .coveragerc history takes around 30 s for me.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.coveragerc in which repo?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and 30s vs. how long with this current implem?)

Yes I think you make a good point. Will have to experiment and re-implement.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, should've been more specific. I'm talking about huggingface/transformers/commits/master/.coveragerc. It takes around 30 s, vs time git log .coveragerc => 145 ms

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Klaus I implemented it simply by calling git log using subprocess. I used the pure Python implementation linked above but it was very slow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will take a stab at this now (unless you want to do it? :) )

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, go for it... I invested 20 min into coming up with a draft but realised it's going to take me much longer than someone proficient in TypeScript

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't know if what I came up with in Klaus is the best way to do it (but it works)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



app.get( '/:repo/commits', historyCommits);
app.get('/:namespace/:repo/commits', historyCommits);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This URL scheme doesn't allow for a repository called "commits" in a namespace, ie. somenamespace/commits. That's why I decided to use the ~ prefix in Klaus. Maybe there are other ambiguous names, I haven't checked thoroughly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we are kind of constrained in the URLs we have to support (as they already exist) so I'll just hardcode a blacklist of forbidden repo names


/**
* Note: we only support branch names and tag names
* not containing a `/`.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you're coming from but at least in the software engineering world feature branches (feature/xyz) are very common.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes – for a v1 for the sake of simplicity I think this will work but we can always add support later

@jonashaag
Copy link
Owner

Reviewed most of the code and also did some performance tests on my machine. Implementation and performance look very good. libgit2 performance is 1–2 order of magnitudes better than Dulwich performance, apparently.

I added a few minor review comments. I don't think there are any obviously bad implementation choices.

I suggest that you performance monitor as much as possible in production: Some of the history-based views are unbounded (eg. counting number of commits) and may turn out to be too slow. It probably also makes sense to add caching here and there if you have an easy way to do cache invalidation (with the current refresh mechanism the page is essentially static so you could just cache everything).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants