This issue was moved to a discussion.

[0.1.5] Repo webhook on GHES side : 404 page not found #332

Closed
Fabiosilvero opened this issue Jan 31, 2025 · 6 comments

@Fabiosilvero

Fabiosilvero commented Jan 31, 2025

Hello,

We're running GHES 3.12 and we're trying to set up a repo webhook.

The GARM server runs in Kubernetes, using the ghcr.io/cloudbase/garm:v0.1.5 image:

configMaps:
  config.toml: |-
    [default]
    enable_webhook_management = true
    
    [logging]
    # If using nginx, you'll need to configure connection upgrade headers
    # for the /api/v1/ws location. See the sample config in the testdata
    # folder.
    enable_log_streamer = true
    # Set this to "json" if you want to consume these logs in something like
    # Loki or ELK.
    log_format = "text"
    log_level = "debug"
    log_source = false

    [metrics]
      enable = true
      disable_auth = false

    [jwt_auth]
    secret = "awesome_secret_redacted"
    time_to_live = "8760h"

    [apiserver]
      bind = "0.0.0.0"
      port = 80
      use_tls = false

    [database]
      backend = "sqlite3"
      # This needs to be changed.
      passphrase =  "awesome_secret_redacted"
      [database.sqlite3]
        db_file = "/etc/garm/garm.db"
    
    [[provider]]
      name = "gcp"
      provider_type = "external"
      description = "gcp provider"
      [provider.external]
        provider_executable = "/opt/garm/providers.d/garm-provider-gcp"
        config_file = "/etc/garm/garm-provider-gcp.toml"
        # This is needed if you want GARM to pass this along to the provider.
        environment_variables = ["GOOGLE_APPLICATION_CREDENTIALS"]

  garm-provider-gcp.toml: |-
    project_id = "project_id"
    zone = "gcp_zone"
    network_id = "network_self_link"
    subnetwork_id = "subnetwork_self_link"
    # The credentials file is optional.
    # Leave this empty if you want to use the default credentials.
    credentials_file = "/etc/garm/service-account-key.json/sa.key"
    external_ip_access = false

The volumes are correctly mounted: GARM is up and running, and GCE VMs are correctly created/deleted and can reach both GHES and GARM.

I used these commands to create the repository and the pool in GARM:

/home/user/bin/garm-cli repository add \
    --name github-actions \
    --owner <The_Org> \
    --credentials github-pat \
    --install-webhook \
    --pool-balancer-type roundrobin \
    --random-webhook-secret

/home/user/bin/garm-cli pool create \
    --os-type linux \
    --os-arch amd64 \
    --enabled=true \
    --flavor e2-medium \
    --image  <GCE_IMAGE_SELF_LINK> \
    --min-idle-runners 0 \
    --repo <the_ID> \
    --tags poc-garm \
    --provider-name gcp

On the GHES side the webhook exists and seems to be configured, but every delivery gets a 404 and the workflow hangs forever:

./garm-cli runner list --all (for 5 min)
+----+------+--------+---------------+---------+
| NR | NAME | STATUS | RUNNER STATUS | POOL ID |
+----+------+--------+---------------+---------+
+----+------+--------+---------------+---------+

+----------+---------+----------------+-------------+------------------+--------------------+------------------+
| ID       | OWNER   | NAME           | ENDPOINT    | CREDENTIALS NAME | POOL BALANCER TYPE | POOL MGR RUNNING |
+----------+---------+----------------+-------------+------------------+--------------------+------------------+
| <the_ID> | The_Org | github-actions | my-ghes     | github-pat       | roundrobin         | true             |
+----------+---------+----------------+-------------+------------------+--------------------+------------------+

./garm-cli controller show
+-------------------------+-----------------------------------------------+
| FIELD                   | VALUE                                         |
+-------------------------+-----------------------------------------------+
| Controller ID           | ID                                            |
| Hostname                | garm-server-0                                 |
| Metadata URL            | https://stg-garm.my.dns.zone/api/v1/metadata  |
| Callback URL            | https://stg-garm.my.dns.zone/api/v1/callbacks |
| Webhook Base URL        | https://stg-garm.my.dns.zone/webhook          |
| Controller Webhook URL  | https://stg-garm.my.dns.zone/webhook/ID       |
| Minimum Job Age Backoff | 30                                            |
| Version                 | v0.1.5                                        |
+-------------------------+-----------------------------------------------+

./garm-cli github endpoint list
+------------+----------------------------+-------------------------+
| NAME       | BASE URL                   | DESCRIPTION             |
+------------+----------------------------+-------------------------+
| github.com | https://github.com         | The github.com endpoint |
+------------+----------------------------+-------------------------+
| my-ghes    | https://github.my.dns.zone | My GHES                 |
+------------+----------------------------+-------------------------+

The job_count in the logs stays at 0, even when I trigger a redelivery of the failed webhook event.

Am I missing something?

With min-idle-runners set to 2, runners show up on the GHES side and in GARM (via the garm-cli runner list --all command), but since the webhook doesn't work, the pool doesn't scale :/

Note: I'm using the GARM operator, but I created the repo and pool manually to rule it out as the cause.

I also confirmed that the network flow GHES => GARM works:

admin@github:~$ curl https://stg-garm.my.dns.zone/
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1/metadata
{"error":"Authentication failed","details":""}
admin@github:~$ curl https://stg-garm.my.dns.zone/webhook/84474b45-b8e7-4350-9be0-012531f22388
404 page not found

The log is attached to this issue. Let me know if I can help troubleshoot further.

garm.log

Thanks,

@gabriel-samfira
Member

Hi @Fabiosilvero

I think you have a typo in your webhooks URL. The correct path should be https://stg-garm.my.dns.zone/webhooks (plural). You seem to have https://stg-garm.my.dns.zone/webhook (missing trailing s). That is most likely the issue in your setup.

If you run:

curl -I https://stg-garm.my.dns.zone/webhooks/ID

Does it work?
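
The path check described above can also be done offline, before touching GHES. This is a minimal sketch, assuming GARM is served at the root of the host with no reverse-proxy path rewriting; `webhook_path_ok` is a hypothetical helper name, not part of garm-cli:

```shell
# webhook_path_ok URL
# Succeeds when the URL path is /webhooks or starts with /webhooks/ (plural),
# which is the path named in this thread as the correct one.
webhook_path_ok() {
    local rest path
    rest="${1#*://}"       # drop the scheme, e.g. "https://"
    path="/${rest#*/}"     # drop the host, keep the path
    case "$path" in
        /webhooks|/webhooks/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `webhook_path_ok "https://stg-garm.my.dns.zone/webhook/ID" || echo "check the path"` would flag the URL shown in the controller output above.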

Once you determine the correct webhook URL for your instance, make sure to first uninstall the webhook from your repo/org by running:

garm-cli repository webhook uninstall repo_id

or

garm-cli org webhook uninstall org_id

Update the controller URL:

garm-cli controller update --webhook-url https://stg-garm.my.dns.zone/webhooks

Then you can install the webhook again:

garm-cli repository webhook install repo_id

or

garm-cli org webhook install org_id
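
The three steps above (uninstall, update the controller URL, reinstall) can be collected into a small dry-run helper that only prints the commands for review. This is a sketch under assumptions: `print_webhook_fix` is a hypothetical name, and the `repository webhook install` form is assumed by symmetry with the `repository webhook uninstall` command quoted above.

```shell
# print_webhook_fix REPO_ID WEBHOOK_BASE_URL
# Prints the command sequence without running anything, so it can be
# checked before being pasted into a shell.
print_webhook_fix() {
    local repo_id="$1" base_url="$2"
    echo "garm-cli repository webhook uninstall $repo_id"
    echo "garm-cli controller update --webhook-url $base_url"
    # Assumed counterpart of the uninstall subcommand shown in the thread.
    echo "garm-cli repository webhook install $repo_id"
}
```

Running `print_webhook_fix repo_id https://stg-garm.my.dns.zone/webhooks` lists the commands in the order they should be executed.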

Also make sure that the callback and metadata URLs work. Use curl to test them. They should return a 401 (unauthorized) error.
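
A minimal sketch of that curl test, comparing the returned status code with the expected 401. The base URL is the one from this thread, and the helper names (`http_status`, `verdict`, `check_garm_urls`) are hypothetical; nothing runs until `check_garm_urls` is called.

```shell
# Base URL taken from this thread; replace it with your own GARM address.
BASE="https://stg-garm.my.dns.zone"

# http_status URL -> prints the HTTP status code (000 if the request fails).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# verdict EXPECTED ACTUAL -> prints "ok" or a short mismatch message.
verdict() {
    if [ "$1" = "$2" ]; then
        echo "ok"
    else
        echo "unexpected (got $2, want $1)"
    fi
}

# check_garm_urls: unauthenticated requests to the metadata and callback
# URLs are expected to be rejected with 401, not 404.
check_garm_urls() {
    local path
    for path in /api/v1/metadata /api/v1/callbacks; do
        echo "$path: $(verdict 401 "$(http_status "$BASE$path")")"
    done
}
```

A 404 on either path would point at routing (ingress or URL) rather than authentication.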

@Fabiosilvero
Author

... I feel stupid now 😅
Thank you for the quick answer. With /webhooks, it works 😅

@gabriel-samfira
Member

no worries! I spent many hours debugging due to typos. Happens to all of us!

@Fabiosilvero
Author

I know where it came from: https://github.com/mercedes-benz/garm-operator/blob/main/config/samples/garm-operator_v1beta1_garmserverconfig.yaml
I'll let the maintainers of garm-operator know about this. When you don't know GARM at all, you can be tempted to just copy and paste ^^ And thank you for your kindness.

@Fabiosilvero
Author

Fabiosilvero commented Jan 31, 2025

Would you be interested in a Helm chart once I've completed my lab? I didn't find any for GARM :)
I would gladly contribute if I can.
EDIT: I'm moving this topic to "Discussions", which I think is more appropriate :)

@gabriel-samfira
Member

I think the nice folks at mercedes-benz have that in mind via:

You can follow/ping that thread. I think a potential helm chart should be part of the operator repo.

@cloudbase cloudbase locked and limited conversation to collaborators Jan 31, 2025
@gabriel-samfira gabriel-samfira converted this issue into discussion #334 Jan 31, 2025
