📜 docs: improve section on private steps #2976

Open
lucasrodes opened this issue Jul 15, 2024 · 1 comment
Labels: documentation, needs discussion


lucasrodes commented Jul 15, 2024

One-liner

The guideline on private steps may not be up to date. We should probably review and update it to match our current data flow.

Context & details

In #2947, we worked on improving the sections of our docs that cover private steps. However, there are still doubts about certain fragments, in particular those relating to the metadata fields isPrivate and nonRedistributable, and to whether datasets are published to GitHub. These were raised by Pablo in his review (here, or here).

The current text in our docs was based on the following issue-closing comment in #2631 (comment):

Turns out this was a false alarm due to a misunderstanding of the meaning of something being "private".

In the ETL, private means that the general public cannot access those files, except when they are published as indicators in the grapher:// step. At that stage, anything private should be marked as nonRedistributable in the metadata.

In Grapher, datasets marked as !isPrivate && !nonRedistributable are automatically re-published to Github. If something is !nonRedistributable, it means CSV download is available with Grapher.

This means !isPrivate should probably be renamed publishToGithub, and it should be false any time nonRedistributable is true.

Originally posted by @larsyencken in #2631 (comment)
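
To make the rules in the quoted comment easier to discuss, here is a minimal Python sketch of them. It is not the actual ETL or Grapher code: the `DatasetFlags` class and the three function names are hypothetical, and the consistency check encodes only one reading of the last sentence (that nonRedistributable should imply isPrivate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFlags:
    """Hypothetical container for the two metadata flags discussed above."""

    is_private: bool
    non_redistributable: bool


def published_to_github(flags: DatasetFlags) -> bool:
    # Quoted rule: datasets marked !isPrivate && !nonRedistributable
    # are re-published to GitHub.
    return not flags.is_private and not flags.non_redistributable


def csv_download_available(flags: DatasetFlags) -> bool:
    # Quoted rule: !nonRedistributable means CSV download is available.
    return not flags.non_redistributable


def is_consistent(flags: DatasetFlags) -> bool:
    # Assumed reading of the last sentence: !isPrivate (publishToGithub)
    # should be false whenever nonRedistributable is true.
    return flags.is_private or not flags.non_redistributable
```

Under this reading, a dataset with isPrivate false and nonRedistributable true would be flagged as inconsistent, which is the combination the docs would need to rule out or explain.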

TODO

  • Public relies on a private snapshot?
  • Snapshot has the flag, but doesn't have the prefix in the DAG.
  • non_redistributable vs. private (create a 2x2 matrix)
    • true true: all private, makes sense
    • true false: private in Grapher, public in ETL. Doesn't make sense?
    • false false: all public, makes sense
    • false true: we only allow for a slice of the data to be downloadable, makes sense
  • making data private in staging. does it make sense?
  • isPrivate flag in Grapher dataset table.
  • Use Enum instead of boolean flags ('fully public', 'partly private', etc.)
  • Tooling public/private
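
As a starting point for the 2x2 matrix and the Enum idea in the list above, a rough Python sketch could look like the following. Everything here is an assumption for discussion: the `Visibility` enum, its member names (the list only mentions 'fully public' and 'partly private'), the `classify` helper, and the choice to reject the combination the list marks as questionable.

```python
from enum import Enum


class Visibility(Enum):
    """Hypothetical replacement for the two boolean flags."""

    FULLY_PUBLIC = "fully public"
    PARTLY_PRIVATE = "partly private"
    FULLY_PRIVATE = "fully private"


def classify(non_redistributable: bool, private: bool) -> Visibility:
    """Map (non_redistributable, private) to a single visibility level.

    The argument order follows the matrix in the TODO list.
    """
    if non_redistributable and private:
        return Visibility.FULLY_PRIVATE  # true true: all private
    if not non_redistributable and not private:
        return Visibility.FULLY_PUBLIC  # false false: all public
    if not non_redistributable and private:
        # false true: only a slice of the data is downloadable
        return Visibility.PARTLY_PRIVATE
    # true false: private in Grapher but public in ETL. The list questions
    # whether this makes sense; rejecting it is an assumption of this sketch.
    raise ValueError("non_redistributable=True with private=False is not allowed")
```

A single enum would make the questionable combination unrepresentable, instead of relying on two flags being kept in sync by hand.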
lucasrodes self-assigned this Jul 15, 2024
lucasrodes added the documentation label Jul 15, 2024

stale bot commented Sep 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Sep 17, 2024
lucasrodes removed the wontfix label Sep 18, 2024