Update README wrt #238 #239

Merged 2 commits on Jul 17, 2024
49 changes: 31 additions & 18 deletions README.md
@@ -1,5 +1,9 @@
# GHCup metadata

This repo is a collection of different GHCup metadata. These are mappings from tool versions (e.g. GHC 9.6.5)
to bindist URLs (e.g. https://downloads.haskell.org/~ghc/9.6.5/ghc-9.6.5-x86_64-fedora33-linux.tar.xz), depending
on architecture (e.g. X86_64), platform (e.g Linux) and possibly distro (e.g. Fedora).
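
To make the mapping concrete, here is a minimal Python sketch of such a lookup. The dictionary layout, the `lookup_bindist` name, and the fallback from a distro-specific entry to a distro-agnostic one are illustrative assumptions, not the actual GHCup YAML schema; only the example URL is taken from the text above.

```python
from typing import Optional

# Hypothetical in-memory form of the metadata: (tool, version) maps to
# per-(arch, platform, distro) bindist URLs. distro=None means "any distro".
METADATA = {
    ("GHC", "9.6.5"): {
        ("X86_64", "Linux", "Fedora"): (
            "https://downloads.haskell.org/~ghc/9.6.5/"
            "ghc-9.6.5-x86_64-fedora33-linux.tar.xz"
        ),
    },
}


def lookup_bindist(tool: str, version: str, arch: str, platform: str,
                   distro: Optional[str] = None) -> Optional[str]:
    """Return the bindist URL for a tool version on a given target, if any."""
    targets = METADATA.get((tool, version), {})
    # Prefer a distro-specific entry, then fall back to a distro-agnostic one
    # (the fallback order is an assumption made for this sketch).
    for key in ((arch, platform, distro), (arch, platform, None)):
        if key in targets:
            return targets[key]
    return None
```

A lookup for a target with no matching entry returns `None` rather than guessing a URL.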

## For end users

### Metadata variants (distribution channels)
@@ -28,21 +32,37 @@ Also check the [config.yaml documentation](https://github.com/haskell/ghcup-hs/b

## For contributors

After adding a section for "upstream devs" (below), this title sounds a little misleading, because upstream devs are also contributors in some sense. I'd do away with these vague terms and use less ambiguous titles, e.g. "Contributing to the GHCup main channel" and "Contributing to the GHCup vanilla channel (for upstream developers)" or something similar. That said, the current section is more about how the main channel is set up than how to contribute to it. And that's a good thing: we first need to explain what we have here before explaining how to contribute. It's just that the title doesn't quite match.

Having separate sections for how each channel operates before sections explaining step-by-step what to do to contribute would be a great improvement! I'd also expect the step-by-step part to be different for vanilla vs. the main channel because, hopefully, updating vanilla should involve less cooperation from GHCup's side. At least, no bindists. Optionally, no GPG.

Member Author

I'm reluctant to make this channel specific.

As I said, everyone can provide PRs for everything.

It's probably easier to have one contribution section, where we explain both channels and then remove the upstream dev section and instead explain who is maintaining/responsible for what.

I thought that, depending on the channel you want to contribute to, the requirements are different. If that's the case, this should be made more explicit. If it's not the case, then never mind... At least, the vanilla channel is special in that it doesn't involve building bindists in those separate repositories that you mention at the top ([stable-haskell/cabal](https://github.com/stable-haskell/cabal) and others). This should be spelled out: this is a major difference!


### Adding a new GHC version
Most bindists are built downstream by the GHCup developers, e.g.:

* https://github.com/stable-haskell/haskell-language-server
* https://github.com/stable-haskell/cabal
* https://github.com/stable-haskell/stack

For GHC bindists there is no automation yet and it is done manually (e.g. for FreeBSD and Alpine i386).

That makes contributions harder, because your pull requests to the metadata will be partial.

Suggested change
For GHC bindists there is no automation yet and it is done manually (e.g. for FreeBSD and Alpine i386).
That makes contributions harder, because your pull requests to the metadata will be partial.
For GHC bindists there is no automation yet and it is done manually (e.g. for FreeBSD and Alpine i386).
That makes contributions harder, because your pull requests to the metadata will be partial.

If "that" refers to "no automation", this should be one paragraph. Otherwise, it looks strange.


## For upstream developers

GHC, Cabal, HLS and stack developers distribute their own bindists via the `ghcup-vanilla-A.B.C.yaml` files. These

Does this section only cover PRs to the vanilla channel? In that case, I'd suggest mentioning it in the title:

Suggested change
## For upstream developers
GHC, Cabal, HLS and stack developers distribute their own bindists via the `ghcup-vanilla-A.B.C.yaml` files. These
## For upstream developers: the `vanilla` channel
GHC, Cabal, HLS and stack developers distribute their own bindists via the `ghcup-vanilla-A.B.C.yaml` files. These

After this fix it may be good to change the previous title too, e.g.:

## For contributors: the main channel

The specifics of this may be improved but the general idea is putting more emphasis on the channels.

Member Author

Everyone can provide PRs for whatever they want. I'm just outlining who is responsible for maintaining which metadata.

> Everyone can provide PRs for whatever they want.

If those PRs are not accepted because the author doesn't have permission to run a pipeline, this makes no sense. "Can provide PRs" is not the same as "can contribute" (as in: influence the contents of the repo), and I thought you were describing the latter in this README.

files are primarily maintained by upstream developers at their own discretion. The GHCup project will perform no QA, fixes
etc. on these bindists.

1. open the latest `ghcup-<yaml-ver>.yaml`
2. find the latest ghc version (in yaml tree e.g. `ghcupDownloads -> GHC -> 8.10.7`)
3. copy-paste it
4. adjust the version, tags, changelog, source url
5. adjust the various bindist urls (make sure to also change the yaml anchors)
6. run `cabal run ghcup-gen -- check -f ghcup-<yaml-ver>.yaml`
7. run `cabal run ghcup-gen -- check-tarballs -f ghcup-<yaml-ver>.yaml -u 'ghc-8\.10\.8'`
8. run `cabal run ghcup-gen -- generate-hls-ghcs -f ghcup-<yaml-ver>.yaml --format json -o hls-metadata-0.0.1.json`
9. run `cabal run ghcup-gen -- generate-table -f ghcup-<yaml-ver>.yaml --stdout` and adjust [docs/install](https://gitlab.haskell.org/haskell/ghcup-hs/-/blob/master/docs/install.md) tables
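
Steps 6 to 9 can be scripted. The sketch below only assembles the `ghcup-gen` command lines quoted in the list and prints them unless asked to execute them; the helper names `ghcup_gen_commands` and `run_checks`, their parameters, and the dry-run default are assumptions for illustration. It assumes the working directory is a checkout of ghcup-metadata.

```python
import shlex
import subprocess


def ghcup_gen_commands(yaml_ver: str, tarball_regex: str) -> list[list[str]]:
    """Build the ghcup-gen invocations from steps 6-9 for one metadata file."""
    yaml_file = f"ghcup-{yaml_ver}.yaml"
    base = ["cabal", "run", "ghcup-gen", "--"]
    return [
        base + ["check", "-f", yaml_file],
        base + ["check-tarballs", "-f", yaml_file, "-u", tarball_regex],
        base + ["generate-hls-ghcs", "-f", yaml_file,
                "--format", "json", "-o", "hls-metadata-0.0.1.json"],
        base + ["generate-table", "-f", yaml_file, "--stdout"],
    ]


def run_checks(yaml_ver: str, tarball_regex: str, dry_run: bool = True) -> None:
    """Print each command; also execute it when dry_run is False."""
    for cmd in ghcup_gen_commands(yaml_ver, tarball_regex):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "0.0.8" is a placeholder for <yaml-ver>; the regex mirrors step 7.
    run_checks("0.0.8", r"ghc-8\.10\.8")
```

Keeping command construction separate from execution lets the list be reviewed, or pasted into a shell, before anything runs.
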
This allows a clean separation between projects. Also see
[GHCup is not an installer](https://hasufell.github.io/posts/2023-11-14-ghcup-is-not-an-installer.html).

Updates to `ghcup-A.B.C.yaml` are carried out by the GHCup project.

### Understanding tags

I don't think pulling the tags section up here (from the bottom, where it used to be) is a good idea: tags are such a low-level technical detail compared to the whole project of coordinating binary distribution of the several core tools.

Member Author

Tags are a fundamental part of the metadata.

Fair enough! This is fine with me then.


Tags are documented [here](https://github.com/haskell/ghcup-hs/blob/master/lib/GHCup/Types.hs). Search for `data Tag`.
Some tags are unique. Uniqueness is checked by `cabal run ghcup-gen -- check -f ghcup-<yaml-ver>.yaml`.

This is the first time ghcup-gen is mentioned in the document. Can this tool be introduced separately? I mean, it'd be good to have a general idea of what it does and where I can get it before diving into such specifics as checking tags. Does this text assume that I'm in a local checkout of ghcup-hs rather than ghcup-metadata? By default, I assume the latter, because I'm reading the README for ghcup-metadata.

Member Author

Yes, it assumes ghcup-metadata checkout.

Good! Let's explicitly say so and give a tiny introduction to ghcup-gen, perhaps?


If you want to check prereleases, do: `cabal run ghcup-gen -- check -f ghcup-prereleases-<yaml-ver>.yaml --channel=prerelease`
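
As an illustration of what a uniqueness check involves, the sketch below flags tags that appear on more than one version of a tool. It is not the `ghcup-gen` implementation; the data layout, the function name, and the choice of `Latest` and `Recommended` as the unique tags are assumptions made for this example.

```python
from collections import defaultdict

# Assumed for illustration: tags that may appear on at most one version per tool.
UNIQUE_TAGS = {"Latest", "Recommended"}


def duplicated_unique_tags(versions: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each unique tag to the versions carrying it, if more than one does."""
    carriers = defaultdict(list)
    for version, tags in versions.items():
        # Only tags declared unique are tracked; others may repeat freely.
        for tag in set(tags) & UNIQUE_TAGS:
            carriers[tag].append(version)
    return {tag: sorted(vs) for tag, vs in carriers.items() if len(vs) > 1}
```

An empty result means no unique tag is shared between versions; a non-empty one names the offending versions, which is the kind of message a contributor needs before opening a PR.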

### During a pull request

* make sure to always add new versions to both `ghcup-A.B.C.yaml` and `ghcup-vanilla-A.B.C.yaml`
* For local testing see `cabal run ghcup-gen -- --help`

It'd be easier to understand the process if some high-level ideas were conveyed without referencing --help.

Member Author

It's not very high level.

Those are just support tools and the subcommands are mostly self-explanatory. Some of them are used in the release process.

If ghcup-gen is introduced above with a couple of sentences, this bullet point may be okay.

* make sure to run the bindist action to check tool installation on all platforms: https://github.com/haskell/ghcup-metadata/actions/workflows/bindists.yaml

Does this apply to vanilla channel updates? I'd hope not: ghcup doesn't generate bindists for this channel, right?

Member Author

ghcup-gen does not build bindists. So yes, it applies to both.

Good! With a quick introduction of ghcup-gen somewhere, my question wouldn't even exist!

- this is a manual pipeline

I have the most trouble with this step: it seems to assume that I have permission to run workflows on this repository, doesn't it? This should be stated loud and clear, perhaps in a "Prerequisites" section: "In order to submit a PR for any channel, you need to have the permissions to run workflows on this repository". Maybe I'm misunderstanding and you expect me (someone who does not have the permissions) to create a fork, allow GitHub Actions on the fork, and run the pipeline there? This is not trivial and should be spelled out.

Member Author

Yes, you need permissions. At least one developer of each upstream project has permissions.

From the cabal team that is @Kleidukos

I'm ok with adding another, but I'm reluctant to give out access like candy.

This is a sensitive repository with no safeguard between commits and the end user.

I'm not saying you should give out access. I'm saying (1) the requirement is not documented, (2) it makes contributing for anyone except a very small group of people impossible. At least, this should be put at the top of the page in bold -- this is a major limitation of the current process.

Why don't we come up with a less restrictive process? For example, a contributor may create a fork, allow actions on it, and run the pipeline. Would it be enough? What does the pipeline do exactly? (It's not documented and not self-explanatory!) Do you only need to see that the pipeline is green? If that's all you need, the approach with forks may work and lift the major limitation. If green is not enough and, say, the pipeline uses some secret keys to modify some remote state or something like that, we'll need to think harder. But first -- let's document what you want out of this pipeline exactly.

> From the cabal team that is Kleidukos

Have you seen this: https://discourse.haskell.org/t/9124 ? The Cabal team is very fluid and is constantly searching for contributors, including release managers. If we want seamless and timely updates in the ecosystem, we must find more flexible ways to cooperate than giving access to one person once in a while through a process that is not documented anywhere.

Member Author

> Why don't we come up with a less restrictive process? For example, a contributor may create a fork, allow actions on it, and run the pipeline.

Forks do not have access to private runners. It just doesn't work. And I'm not going to give access to every potential contributor. So I don't think there will be a less restrictive process.

A contributor can still provide a PR and I can assess it myself. This is also for security reasons.

> Have you seen this: https://discourse.haskell.org/t/9124 ? The Cabal team is very fluid and is constantly searching for contributors, including release managers. If we want seamless and timely updates in the ecosystem, we must find more flexible ways to cooperate than giving access to one person once in a while through a process that is not documented anywhere.

I'm not sure what you are asking. I'm ok with giving 2 or maybe 3 cabal developers access and they can act as a proxy for the release manager. I'm not going to give access to every possible release manager.

This is a trust-based system. People have messed up before and I have revoked access before. People who have push access need a good understanding of how the metadata works. Yes, I can bombard you with documentation, but I doubt that will actually solve the issue. The issue is knowledge transfer and responsive collaboration.

I am extremely responsive on this repo. A new contributor will have to create a couple of PRs before they will gain an understanding of the process. And I'm ok with that. You don't have to understand everything from the start.

I appreciate your responsiveness! I was trying to do two things at once (maybe that's too much): (1) suggest ideas to improve the description in the README, so that people can understand enough to submit a PR, and (2) suggest improvements for the process. I see now that (2) is not feasible, that's fine.

For (1), I can share my experience as the Cabal release manager for 3.12.1.0: I wasn't able to understand the GHCup process for metadata updates well enough to create a PR. That's after doing some research: reading the README and recent PRs. @geekosaur went a little further and opened a PR: #225. But that PR was closed in a manner that I can only call abrupt. If that were me, I'd be put off and wouldn't consider contributing to this repo anymore. I think it'd be good to improve the description of your process, so that if a PR has to be closed for some reason, you could spell out that reason with a reference to the README.

Now, the reason I'm doing (1) is not to get permissions or anything like that. Instead, I want the next Cabal release manager (who may very well be a fresh person) not to end up in the same place as @geekosaur and myself. That's why I'm trying to spell out the assumptions that you seem to be making (as many as I can) and advocate for adding them to the README. More generally, I am trying to understand how the Cabal team may coordinate with ghcup more effectively. My thesis is that having a better description of the process will help tremendously.

On that note: if we could decide on a slight extension on the current process to say explicitly in the README that a PR may be submitted by anyone without the permissions to run the pipeline, but, since running the pipeline is an essential part of the process, the PR author should explicitly ask you to run the pipeline, that would be great.

What won't work, I think, is to pick a person X from every team, give them the permissions, and think that the whole team will magically understand the whole process. At least, after maintaining Cabal for a couple of years (by maintenance I mean updating the various bits that need updating to keep the lights on, like CI and such), I can assure you: this is not a model that will work with the Cabal project as it stands. This is my personal opinion (as is everything else I say), and other Cabal team members may disagree.

> I used @geekosaur's PR and created a follow-up that ran all the required pipelines: #226
>
> It was merged. The contribution was successful. I did all the outstanding work for you.

Oh, I didn't notice that one! It may help to leave a note about a superseding PR on the thread for the initial one, because GitHub doesn't send email notifications about cross-thread references (you referenced the initial PR in yours but no one got notified about that).

This is also a problem of the process being unclear. Above I suggested that a contributor ask you to run the pipeline (reminder: this can only work if the importance of the pipeline and permissions are explained in the README). Another approach would be for you to open a new PR. I guess it's inevitable because you need the new metadata to be signed; that's fine. Just explaining this process in the README would make it easier to wrap our heads around it.

I noticed #226 deleted rocky linux and the recommended tag. Why is that? If there was an explanation in a reasonably visible place (e.g. README, not source code of the tags datatype), we could save time on producing these lines, and you could save your time on removing them.

> I will not give every cabal contributor (or release manager) access to this repository.

Noted. This wasn't something I asked for though. All that I ask for is a description of the process with enough details for some Cabal dev to follow. I wasn't able to do that because the current README is outdated as you acknowledged in our Matrix.

> I also gave a talk at ZuriHac about GHCup. The slides are available here: haskell.org/ghcup/talks/Zurihac2024/GHCup.html

That's pretty cool! Some people would prefer text to slides, but referencing the slides from the README would be an improvement already.

Contributor

> Oh, I didn't notice that one! It may help to leave a note about a superseding PR on the thread for the initial one, because GitHub doesn't send email notifications about cross-thread references (you referenced the initial PR in yours but no one got notified about that).

He did say he was going to do that, because workflows don't work on PRs originating in other repos for whatever reason.

BTW, I was unable to find a way to validate my metadata file (ghcup-gen apparently wasn't it, since it didn't notice that I'd inadvertently duplicated a bunch of tags). Is there a way to do this before submitting a PR?

Member Author

> BTW, I was unable to find a way to validate my metadata file (ghcup-gen apparently wasn't it, since it didn't notice that I'd inadvertently duplicated a bunch of tags). Is there a way to do this before submitting a PR?

`ghcup-gen check` does check for duplicated tags:

https://github.com/haskell/ghcup-metadata/blob/develop/ghcup-gen/Validate.hs#L135-L165

Member Author

> Noted. This wasn't something I asked for though. All that I ask for is a description of the process with enough details for some Cabal dev to follow.

Tbh, I feel like I'm losing energy on this topic.

I suggest you simply try to contribute. Go through:

If you still have specific questions afterwards, feel free to ask. Also feel free to fix the documentation/take up this PR.

Sounds good! Thanks for the insightful discussion: it will help future generations of Cabal release managers! Please, make sure to merge this PR. And keeping the README up to date would be going above and beyond! Thanks again!

- set the appropriate parameters
@@ -53,10 +73,3 @@ Also check the [config.yaml documentation](https://github.com/haskell/ghcup-hs/b
* https://github.com/haskell/ghcup-hs/blob/master/docs/guide.md#gpg-verification
* https://github.com/haskell/ghcup-hs/blob/master/docs/install.md#unix

### Understanding tags

Tags are documented [here](https://github.com/haskell/ghcup-hs/blob/master/lib/GHCup/Types.hs). Search for `data Tag`.
Some tags are unique. Uniqueness is checked by `cabal run ghcup-gen -- check -f ghcup-<yaml-ver>.yaml`.

If you want to check prereleases, do: `cabal run ghcup-gen -- check -f ghcup-prereleases-<yaml-ver>.yaml --channel=prerelease`