Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: Add numpydoc pre-commit hook #1371

Closed
wants to merge 7 commits into from
Closed

ci: Add numpydoc pre-commit hook #1371

wants to merge 7 commits into from

Conversation

auguste-probabl
Copy link
Contributor

@auguste-probabl auguste-probabl commented Feb 28, 2025

  • Only files in skore/src are checked
  • Some lints are ignored
    • Even then, there remains a lot of work to make numpydoc happy
    • I didn't ignore the "summary line should be on a new line" rule, but actually this is PEP 257-compliant and we could ignore it

Closes #1338

@thomass-dev
Copy link
Collaborator

thomass-dev commented Feb 28, 2025

Surprised that it isn't already effective/covered by the pyproject.toml section

[tool.ruff.lint.pydocstyle]
convention = "numpy"

Do you know why?

Copy link
Contributor

Documentation preview @ 96045fb

@auguste-probabl
Copy link
Contributor Author

auguste-probabl commented Feb 28, 2025

Surprised that it isn't already effective/covered by the pyproject.toml section

It kind of is, but not completely; see the failed CI run for examples of things that aren't checked by ruff. There is an open issue in ruff to cover all of numpydoc: astral-sh/ruff#8425.

See also:

@auguste-probabl
Copy link
Contributor Author

auguste-probabl commented Feb 28, 2025

Personally I don't think that it's worth the trouble right now.

"ES01", # No "extended summary" section
"EX01", # No "Examples" section
"SA01", # No "See also" section
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +123 to +125
"ES01", # No "extended summary" section
"EX01", # No "Examples" section
"SA01", # No "See also" section
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those one are quite important for a nice documentation actually.

@glemaitre
Copy link
Member

Personally I don't think that it's worth the trouble right now.

Right now apart from the long summary, examples and see also that find important to have, we only have 3 failures with the rendering of the help which would could skip if we see that we cannot support properly.

In terms of rule, we have set that we skip in scikit-learn because numpydoc is picky a bit too much on something that does not necessarily improve the documentation.

Regarding the "necessity", I think that we should pretty soon have a user guide and have proper docstring. Otherwise, it is already a project smell when you don't have a clear documentation explaining with words why a project is useful (a getting_started is not enough). So I would think that we should still fix those issues on the way and not only postpone them releases after releases.

@auguste-probabl
Copy link
Contributor Author

Update with different settings. If we don't ignore "extended summary", "example" or "see also", but we do ignore "summary should be on a different line", we get ~2000 lints. Most of them are superfluous IMO.

Yeah the fine-grained approach that sklearn uses sounds like a good start because we can spread out the change over time, so that we don't have to keep a PR open while we fix 2000 errors.

@glemaitre
Copy link
Member

we get ~2000 lints. Most of them are superfluous IMO.

Is it that we get warning on function that are not part of the public API? The code in scikit-learn, is actually reducing this behaviour for the public code base and I would argue that it is indeed enough.

@auguste-probabl
Copy link
Contributor Author

we get ~2000 lints. Most of them are superfluous IMO.

Is it that we get warning on function that are not part of the public API? The code in scikit-learn, is actually reducing this behaviour for the public code base and I would argue that it is indeed enough.

Yes that's right. I don't know what regex to give to numpydoc to filter things correctly (e.g EstimatorReport is defined in _estimator/report.py so _+ is not enough), so we'd need something like sklearn's custom pytest-based system to get something correct (whatever "correct" means for us).

@glemaitre
Copy link
Member

Yep so it is indeed more involved work than just having the linter on. So we might delay that. We just need to ensure that we keep documenting as good as possible in the meantime ;)

@auguste-probabl
Copy link
Contributor Author

What we could also do is start enforcing rules in CI only on changed files, so for a while every PR would include a bit of docs improvement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use numpydoc validation as a linter for the documentation
3 participants