Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stricter repo parsing #19

Closed
wants to merge 8 commits into from
Closed

Stricter repo parsing #19

wants to merge 8 commits into from

Conversation

yakky
Copy link
Member

@yakky yakky commented Jun 30, 2019

  • Domain must not include a slash
  • Org must not include a slash on github

Fix #18
Fix #10

* Domain must not include a slash
* Org must not include a slash on github
'host': 'gitlab.com',
'user': 'git',
'owner': 'nephila/giturlparse/blob/master/giturlparse',
'repo': 'github.py',
Copy link
Contributor

@jayvdb jayvdb Jul 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, previously github.py would end up in the host or something like that.

So while this is a bad result for repo on GitLab, it is less bad than before this PR, and at least it is working much better for github.

IMO, "owner" shouldnt contain a / on GitLab either. Subgroups should be accessed using a new member named something like path / pathname c.f. https://github.com/retr0h/git-url-parse/pull/37 Or departing from the urlparse tuple, a member called groups which is a list of strings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayvdb this is as far as I am comfortable going with supporting the different system details

what would you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you. fwiw, https://github.com/coala/git-url-parse has just been transferred to the @coala group, which means we can put our orgs needs into that codebase. Which then leads to how do we cooperate/compete for the name? c.f. coala/git-url-parse#43

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if the wording sounded a bit rude, it was not meant. It's just that regexps are a PITA to maintain beyond a certain complexity :)
I also forgot to mention that I updated the PR based on your suggestions, and this PR should be able to support more gitlab features in a more natural way.

Regarding git-url-parse, it certainly isn't super smart to have two competing and very similar implementation that covers this use case, thus definitely let's talk about this. I will answer in the linked issue to stay on topic

@yakky yakky force-pushed the feature/strict-regexp branch from 6f7a2d7 to 161bdf2 Compare July 13, 2019 13:59
@property
def path(self):
if self.raw_path:
return self.raw_path.replace('/blob/', '').strip('/')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see the benefit of removing only blob (and not other REST 'commands' like 'tree'), but not also remove the branch name. Keeping the branch name in the path makes it like SVN, but for most VCS the branch and the rest of the path are separate and are used in different commands/parts of the workflow. If they are combined here, people will usually need to split it before they can use it.

If the branch name is also removed, path becomes synonymous with the file path that is found in a local clone.

Exposing the branch as a separate attribute would be useful, but can be done later.

fwiw, git-url-parse uses pathname for what you have exposed as raw_path, to correspond with urlparse()'s pathname. I dont mind about this, as a compatibility shim can do simple mapping like this when the two libraries are close enough that a shim is sufficient.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the capturing group name, I don't see any reason not to match a more sensible name, I will change that

Regarding the other rest commands, we only need to detail them, it makes sense to remove all of them if course

Regarding the branch name I've not come up with a working regexp to tell the branch name from the file path, as the branch name can contain slashes, more than happy to strip the branch from path and expose it as a variable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pathname is now the same as git-url-parse, but I could not find a way to get the branch for the blob verb

tree verb instead can return the branch name

@yakky yakky force-pushed the feature/strict-regexp branch from 70dea17 to cb1b65d Compare November 15, 2020 18:02
@yakky
Copy link
Member Author

yakky commented Nov 15, 2020

@jayvdb I have been able to look at this only after so much time, unfortunately

But I spent some time to add some git-url-parse compatibility

@yakky yakky force-pushed the feature/strict-regexp branch from cb1b65d to 3045e95 Compare November 15, 2020 18:03
@yakky
Copy link
Member Author

yakky commented Nov 21, 2020

@jayvdb this has been replaced by #26 for rebase against current master

@yakky yakky closed this Nov 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL and repo regex should be non-greedy (maybe) [Feature Request] Support parsing blob and tree urls
3 participants