Most bitnami charts missing from ArtifactHUB. #3973

Closed
orendain opened this issue Aug 7, 2024 · 21 comments

Comments

@orendain

orendain commented Aug 7, 2024

Describe the bug

Most bitnami charts missing from ArtifactHUB.

To Reproduce

Navigate to any one of the following:

Expected behavior

Bitnami charts indexed and available.

Screenshots

image

Additional context

Cross-posted to bitnami repo in case issue is on that end: bitnami/charts#28765.

@orendain orendain added the bug label Aug 7, 2024
@tegioz
Collaborator

tegioz commented Aug 7, 2024

Thanks for reporting this @orendain, looking into it 👍

@hongbo-miao

hongbo-miao commented Aug 7, 2024

Hmm, https://artifacthub.io/packages/search?repo=bitnami&sort=relevance&page=1 now shows only 2 Helm charts (half an hour ago it was 3), so something is going on. 🥲

@tegioz
Collaborator

tegioz commented Aug 7, 2024

They should be back, will post an update shortly (still working on it 😉).

@tegioz
Collaborator

tegioz commented Aug 7, 2024

@hongbo-miao out of curiosity, which 2 (or 3) were you seeing before?

Could it be airflow and apache?

@hongbo-miao

hongbo-miao commented Aug 7, 2024

Hi @tegioz thanks for fixing!

Oh, I happen to still have these two pages open. When the issue happened, the count was flipping between 2 and 3:

image

image

@tegioz
Collaborator

tegioz commented Aug 7, 2024

This is a short summary of today's issue with Bitnami's repository. If something is unclear, please don't hesitate to ask 🙂

Some context of how content is indexed

The tracker component in Artifact Hub runs every 30 minutes, checking repositories for changes and indexing them as needed. New packages found are added, the ones that have changed are updated and those no longer available are removed. In the case of Helm charts repositories, the content available in the source repository is defined in the index.yaml file.
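
As a rough illustration of the add/update/remove reconciliation described above, here is a minimal sketch in Python. It is not Artifact Hub's actual code: the `plan_sync` function, the name-to-digest mapping, and the sample package data are all assumptions made for the example.

```python
from __future__ import annotations


def plan_sync(indexed: dict[str, str], source: dict[str, str]) -> dict[str, list[str]]:
    """Compare what is already indexed with what the source index lists.

    Both arguments map a package name to a digest of its content
    (a hypothetical representation, chosen for illustration only).
    """
    common = indexed.keys() & source.keys()
    return {
        # Present in the source index but not indexed yet.
        "add": sorted(source.keys() - indexed.keys()),
        # Present on both sides, but the content digest differs.
        "update": sorted(name for name in common if source[name] != indexed[name]),
        # Indexed earlier, but no longer listed in the source index.
        "remove": sorted(indexed.keys() - source.keys()),
    }


indexed = {"airflow": "d1", "apache": "d2", "redis": "d3"}
source = {"airflow": "d1", "apache": "d2-new", "kafka": "d4"}
plan = plan_sync(indexed, source)
print(plan)
```

Under this model, a valid index that lists only two packages produces a "remove" list containing everything else, which is the behaviour described in the next section.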

What happened today

During the tracker run at 15:30 UTC today (August 7th 2024), we received what seems to be an incomplete but valid index.yaml when processing Bitnami's repository. The index contained only the airflow package (all versions) and the apache package (some versions), which are the first two packages in the index file. It's unclear why we received this incomplete index. Maybe someone from Bitnami can help with this (perhaps it was being generated or overwritten at that time, or something else; we can't really tell from our side).

The tracking process continued as expected: two packages were available in the source repository, so it proceeded to unregister all but those two as expected (all these operations are logged so they can be easily traced).

Why wasn't this fixed automatically

During the next tracker run, at 16:00 UTC, we received the full index. The tracker uses a timeout of 15 minutes per tracked repository (we have thousands of them to process), which is usually more than enough to process any repository incrementally.

But in this case almost all of the repository's packages had been removed and a full re-indexing was needed, which did not fit within that timeout, so we temporarily increased it (it's configurable). That is the only action we took, and all of the Bitnami repository's packages were then added back automatically.
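
To show why a per-repository timeout suits incremental updates but can cut a full re-index short, here is a minimal sketch. It assumes a simple deadline loop; the `index_until_deadline` function, the injectable clock, and the timings are hypothetical and say nothing about how the tracker is really implemented.

```python
from __future__ import annotations

import time
from typing import Callable, Iterable


def index_until_deadline(
    packages: Iterable[str],
    index_one: Callable[[str], None],
    timeout_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> list[str]:
    """Index packages one by one and return those still pending at the deadline."""
    deadline = clock() + timeout_seconds
    pending = list(packages)
    while pending and clock() < deadline:
        index_one(pending.pop(0))
    return pending


class FakeClock:
    """Deterministic clock so the example does not depend on real time."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()


def slow_index(name: str) -> None:
    # Pretend that indexing each package takes one second.
    clock.now += 1.0


left_over = index_until_deadline(
    ["a", "b", "c", "d", "e"], slow_index, timeout_seconds=3, clock=clock
)
print(left_over)
```

With a small number of changed packages the loop finishes well before the deadline; with nearly every package pending, some are left over until the timeout is raised.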

Consequences

Unfortunately, some information is lost at the moment. Most of the package data is extracted from the charts, and all of that has been added back. But other information, like packages' stars, subscriptions, webhooks, etc., is stored in our systems. This information is tied to the package records in the database, so when a publisher removes a package, we remove it as well.

We'll check the database backups tomorrow to try to get back as much as possible.

Improvements

Regardless of why we received that incomplete index, we need to think about measures to prevent this from happening again. Cleaning up a repository index and removing lots of entries is a valid use case, and we probably need to support it (Bitnami actually did this intentionally some time ago). But we need to make it a bit harder to do accidentally.
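
One possible shape for such a safeguard is a threshold on the share of packages a single run may remove. This is only a sketch of the idea, not something the thread says Artifact Hub adopted; the function name, the 50% default, and the sample counts are all made up for illustration.

```python
def removal_looks_accidental(
    indexed_count: int, to_remove_count: int, max_fraction: float = 0.5
) -> bool:
    """Return True when a run would remove a suspiciously large share of packages."""
    if indexed_count == 0:
        # Nothing indexed yet, so nothing can be removed by accident.
        return False
    return to_remove_count / indexed_count > max_fraction


# Hypothetical numbers: a run that would drop 98 of 100 indexed packages
# is flagged, while a routine cleanup of 5 packages is not.
suspicious = removal_looks_accidental(indexed_count=100, to_remove_count=98)
routine = removal_looks_accidental(indexed_count=100, to_remove_count=5)
print(suspicious, routine)
```

A flagged run could be held until the publisher confirms it, which keeps intentional clean-ups possible while making accidental ones harder.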

Hope this helps.

Update 1

We've noticed that the issue also took place in the staging environment, where it hasn't been fixed yet.

We've checked the logs to find the exact time at which the repository was processed on each environment:

  • Production: 2024-08-07T15:30:15.764256796Z
  • Staging: 2024-08-07T15:30:05.930378551Z

Both requests received the exact same copy of the repository's index file, which means it stayed in that state for at least the roughly 9.8 seconds between the two requests. It'd be great if you could investigate this a bit further on your end and share any findings @carrodher. Thanks!
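
For reference, the gap between the two log timestamps above can be computed as follows. The helper name is hypothetical, and nanoseconds are truncated to microseconds because `datetime` does not store finer precision.

```python
from datetime import datetime, timezone


def parse_ns_timestamp(ts: str) -> datetime:
    """Parse a UTC timestamp with nanoseconds, truncating to microseconds."""
    base, frac = ts.rstrip("Z").split(".")
    micro = frac[:6].ljust(6, "0")
    parsed = datetime.strptime(f"{base}.{micro}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


production = parse_ns_timestamp("2024-08-07T15:30:15.764256796Z")
staging = parse_ns_timestamp("2024-08-07T15:30:05.930378551Z")
gap_seconds = (production - staging).total_seconds()
print(f"{gap_seconds:.3f} s between the staging and production requests")
```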

@tegioz
Collaborator

tegioz commented Aug 7, 2024

> Hi @tegioz thanks for fixing!
>
> Oh, I happen to still have these two pages open. When the issue happened, the count was flipping between 2 and 3:

No worries! Thanks, that makes sense 👍

@hongbo-miao

hongbo-miao commented Aug 7, 2024

As a user, the Helm chart stars are very valuable for us to find high quality Helm charts.
It would be great to recover Bitnami's stars, thanks! ☺️

@carrodher

Thank you for providing detailed information about the issue 👏.

This situation seems unusual and is likely a temporary connectivity or CDN issue that can occur unexpectedly. I have reviewed all our support cases and found no complaints about the Helm repository malfunctioning in the last few days. Considering the number of users consuming Helm charts from the Bitnami index.yaml, we typically receive several GitHub issues immediately after any infrastructure problem.

Is there a way for ArtifactHub to be more resilient in such situations?

On a side note, I agree with @hongbo-miao about the importance of stars in ArtifactHub, and it would be great if they could be recovered at some point.

@tegioz
Collaborator

tegioz commented Aug 8, 2024

Thanks for checking @carrodher 👍

> This situation seems unusual and is likely a temporary connectivity or CDN issue that can occur unexpectedly.

Please note that it wasn't a failed request, a network error, or an invalid YAML file. If anything goes wrong during the check, the process is aborted and no sync takes place. The index was correct, valid YAML (and a valid charts index), but it contained just 2 charts. It was served twice, to two different systems, within a period of about 10 seconds.

Is the index file generated directly in the same place it's served from? Is it possible that the CDN picked it up while it was being written to disk (serving and caching an incomplete version)? Do you have logs of the index requests served at that exact time (on your servers or the CDN)? It may be worth checking the size of the file served, if that information is being logged.

> Is there a way for ArtifactHub to be more resilient in such situations?

We are considering ways of limiting how packages can be removed, but we need to be careful, as we don't want to affect other publishers' workflows either. At the moment the procedure to remove charts from Artifact Hub is to remove them from the source repository index (which, unfortunately, is what happened accidentally yesterday).

Thanks again, we'll look into those stars 🙂

@tegioz
Collaborator

tegioz commented Aug 8, 2024

@carrodher issues like this one (bitnami/charts#20348) could be caused by the same underlying problem (an incomplete repository index file being served). The repository is added successfully, as the index is valid, but it does not contain all charts, so the search doesn't produce the expected results.

@tegioz
Collaborator

tegioz commented Aug 8, 2024

The following information has been restored from the database backups:

  • Stars
  • Subscriptions
  • Webhooks
  • Production usage

Package views were anonymized on deletion, so the last-30-days stats for individual packages won't be visible until they catch up again.

Please let us know if you notice something else not working as expected 🙂

@stiantoften

Seems like this happened again today.

@Exobitt

Exobitt commented Jan 13, 2025

This is the same issue again today @tegioz

@tegioz
Collaborator

tegioz commented Jan 13, 2025

Thanks for reporting this @stiantoften and @Exobitt.

@carrodher could you please make sure that you never serve an incomplete version of Bitnami's repository index file? As I mentioned in #3973 (comment), this may be happening every now and then to other users as well (it works again after a few seconds, so I understand it's hard to reproduce on your side).

It looks as if the index file is generated directly in the same place it's served from. That would explain why it's sometimes picked up while it is still being written to disk (so an incomplete version is served). If this is the case, maybe you could just generate it in a temporary location and copy it to the final destination once it's ready.
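
A minimal sketch of that idea for a plain filesystem follows. It assumes the index is served from a local directory, which the thread does not confirm, and it swaps the suggested copy for a rename: the file is written to a temporary path in the same directory and then moved into place with `os.replace`, because a copy can itself be observed half-written. The function name and sample content are hypothetical.

```python
import os
import tempfile


def publish_atomically(content: str, destination: str) -> None:
    """Write content to a temp file next to destination, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(destination))
    # Same directory as the destination, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, destination)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "index.yaml")
    publish_atomically("apiVersion: v1\nentries: {}\n", target)
    with open(target, encoding="utf-8") as f:
        published = f.read()
    leftovers = [name for name in os.listdir(workdir) if name != "index.yaml"]
```

If the index lives in object storage or behind a CDN instead, the equivalent would be to upload the complete object in one operation and only then invalidate any cached copy.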

@tegioz
Collaborator

tegioz commented Jan 13, 2025

Bitnami's charts have been reindexed again.

@tegioz
Collaborator

tegioz commented Jan 13, 2025

Stars and other missing information have been restored as well.

Restoring database backups takes some time, so we need to protect ourselves from this issue until the underlying problem in Bitnami's repository is addressed. To do that, we've just added an experimental feature that can be enabled per repository: package deletion protection. When this feature is enabled, AH won't delete packages that are no longer available in the source repository.

This is not ideal, as it breaks the synchronization, leaving content listed in AH that's not actually available in the source. But it's better than the current situation, so we'll enable it for Bitnami's repository for now.
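
Conceptually, a per-repository deletion protection flag changes only the removal step of a sync. The sketch below assumes a boolean flag and set-based package names; both, like the function name, are hypothetical and not taken from Artifact Hub's code or configuration.

```python
from __future__ import annotations


def packages_to_delete(
    indexed: set[str], source: set[str], deletion_protection: bool
) -> set[str]:
    """Return the packages a sync would delete for one repository."""
    if deletion_protection:
        # Protected repositories keep packages even if the source no longer lists them.
        return set()
    return indexed - source


indexed = {"airflow", "apache", "redis"}
truncated_source = {"airflow", "apache"}

protected = packages_to_delete(indexed, truncated_source, deletion_protection=True)
unprotected = packages_to_delete(indexed, truncated_source, deletion_protection=False)
print(protected, unprotected)
```

The trade-off is the one noted above: with the flag on, a package the publisher really did remove stays listed until someone deletes it by other means.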

For more details about this issue please see the comments above 🙂

@tegioz
Collaborator

tegioz commented Jan 14, 2025

@carrodher

Hi Sergio, thank you for bringing this to our attention. We're investigating the root cause of the unexpected index truncation. In the meantime, we've implemented measures to detect and recover the index when this issue occurs.

@tegioz
Collaborator

tegioz commented Jan 15, 2025

Awesome, thanks @carrodher 👍
