Create an `Argument` class and allow convertion of optdict into them #5584

DanielNoord · 2021-12-21T23:48:03Z

Add yourself to CONTRIBUTORS if you are a new contributor.
Write a good description on what the PR does.

Type of Changes

	Type
✓	✨ New feature
✓	🔨 Refactoring

Description

This is my first proposed step towards #5392.

We could build on this in a separate branch, as I intend not to make this interfere too much with any current code. That should keep merge conflicts low.

All of this is based on the documentation of argparse found here:
https://docs.python.org/3/library/argparse.html

The idea of this PR is to showcase how we can transform our current optdicts into an `Argument` class that we can then at some point start passing to the ArgumentsManager. I think that second part would be the next step, to make this into an MVP. After that we can start working on adding new checkers.
I chose the logging module as it is a fairly easy module with only two options.
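
As a rough illustration of the idea (a sketch only, not the code in this PR): the `Argument` dataclass, the `convert_optdict` helper, the `TYPE_CONVERTERS` table, and the sample optdicts below are all hypothetical stand-ins. It assumes an optdict is a plain dict with keys such as `default`, `type`, `choices`, and `help`.

```python
import argparse
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


def _csv(value: str) -> List[str]:
    """Split a comma separated string into a list of stripped items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical mapping from optparse-style type names to argparse converters.
TYPE_CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "string": str,
    "int": int,
    "csv": _csv,
    "choice": str,
}


@dataclass
class Argument:
    """Hypothetical container holding what argparse's add_argument needs."""

    flags: List[str]
    default: Any
    converter: Callable[[str], Any]
    help: str
    choices: Optional[List[str]] = None

    def add_to(self, parser: argparse.ArgumentParser) -> None:
        """Register this argument on the given parser."""
        parser.add_argument(
            *self.flags,
            default=self.default,
            type=self.converter,
            choices=self.choices,
            help=self.help,
        )


def convert_optdict(name: str, optdict: Dict[str, Any]) -> Argument:
    """Turn one (name, optdict) pair into an Argument."""
    return Argument(
        flags=[f"--{name}"],
        default=optdict.get("default"),
        converter=TYPE_CONVERTERS[optdict.get("type", "string")],
        help=optdict.get("help", ""),
        choices=optdict.get("choices"),
    )


# Illustrative optdicts, loosely modelled on a checker with two options.
OPTIONS = [
    ("logging-modules", {"default": ["logging"], "type": "csv", "help": "Logging modules to check."}),
    ("logging-format-style", {"default": "old", "type": "choice", "choices": ["old", "new"], "help": "Format style."}),
]

parser = argparse.ArgumentParser(prog="sketch")
for option_name, option_dict in OPTIONS:
    convert_optdict(option_name, option_dict).add_to(parser)

namespace = parser.parse_args(
    ["--logging-format-style", "new", "--logging-modules", "logging,structlog"]
)
```

The point of the intermediate class in this sketch is that the optdicts stay untouched while the argparse side gets a typed object to consume.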

Don't hold back! Let me know what you think of this. All criticism is valuable!

Note: Ideally these commits wouldn't need to be squashed in a final merge. That makes cherry-picking later on much easier (see 2.0 branch in astroid). So after final review I'll revisit the commit history.

coveralls · 2021-12-22T00:17:18Z

Pull Request Test Coverage Report for Build 2058346839

90 of 92 (97.83%) changed or added relevant lines in 9 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.02%) to 94.173%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
pylint/checkers/base_checker.py	9	10	90.0%
pylint/config/arguments_manager.py	41	42	97.62%

Totals
Change from base Build 2057470749:	0.02%
Covered Lines:	15418
Relevant Lines:	16372

💛 - Coveralls

DanielNoord · 2021-12-22T00:18:17Z

If anybody could help me figure out why this fails on Windows that would be much appreciated!

jacobtylerwalls · 2021-12-22T00:37:16Z

I think it's that subroutines can't be pickled on Windows. Any way we can avoid pickling ArgumentParser?

edit: ArgumentParser implementation
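
A minimal sketch of the kind of failure being described, assuming the parser ends up holding a function that `pickle` cannot serialise (a lambda converter is used here purely for illustration; this is not pylint's code). Start methods that spawn a fresh process, as on Windows, have to pickle whatever is sent to the workers, which is where this would surface:

```python
import argparse
import pickle


def can_pickle(obj: object) -> bool:
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:  # pickle raises different exception types depending on the cause
        return False
    return True


parser = argparse.ArgumentParser(prog="sketch")
# A lambda as converter is one easy way to make the parser hold an unpicklable function.
parser.add_argument("--jobs", type=lambda value: int(value), default=1)

namespace = parser.parse_args(["--jobs", "4"])

parser_ok = can_pickle(parser)        # the parser drags the lambda along
namespace_ok = can_pickle(namespace)  # the parsed values are plain data
```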

DanielNoord · 2021-12-22T07:46:22Z

I'll try and come up with something!

DanielNoord · 2021-12-22T09:42:10Z

I have fixed the issue with argparse.ArgumentParser pickling. It is annoying, as it means we'll need to instantiate an ArgumentParser and extract any relevant information during the init of the ArgumentsManager.
On the other hand, that does force us to think clearly about how we handle argument parsing and might help us avoid any redundant calls 😄

DanielNoord · 2021-12-22T16:09:14Z

Converting this to draft as I found issues with the current implementation. In retrospect I think it doesn't make sense to break this into smaller PRs, as we might run into problems after earlier parts have been merged.

Therefore, I'll keep working on this until we can completely use argparse to parse the options for the logging module. That should be a good test that at least basic functionality is guaranteed.

Pierre-Sassoulas

Do we really need new classes? Or could we use argparse.Namespace directly? optparse is more complex, so if we try to do a "small refactor" we'll keep the optparse complexity in our implementation even when using argparse.

One thing I had in mind for this refactor was separating option parsing from checking. Maybe we could create NamedTuples or dataclasses and give them to the checker pre-parsed, instead of parsing the options inside each checker (to avoid a breaking change, we could still parse the options if the object is not given?).
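
To make the suggestion concrete, here is a hedged sketch of a checker that accepts pre-parsed options but still parses them itself when none are given. `LoggingConfig`, `LoggingChecker`, and `parse_logging_config` are hypothetical names invented for this example and do not reflect pylint's actual checker API.

```python
import argparse
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LoggingConfig:
    """Hypothetical pre-parsed options for a single checker."""

    logging_format_style: str = "old"


def parse_logging_config(argv: Sequence[str]) -> LoggingConfig:
    """Parse raw arguments once, outside of any checker."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument("--logging-format-style", choices=["old", "new"], default="old")
    namespace, _unknown = parser.parse_known_args(list(argv))
    return LoggingConfig(logging_format_style=namespace.logging_format_style)


class LoggingChecker:
    """Hypothetical checker that accepts a ready-made config object."""

    def __init__(self, config: Optional[LoggingConfig] = None, argv: Sequence[str] = ()) -> None:
        # Fall back to parsing here when no config is given, to stay backwards compatible.
        self.config = config if config is not None else parse_logging_config(argv)


pre_parsed = LoggingChecker(config=LoggingConfig(logging_format_style="new"))
legacy = LoggingChecker(argv=["--logging-format-style", "new"])
```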

DanielNoord · 2021-12-22T16:19:58Z

> Do we really need new classes? Or could we use argparse.Namespace directly? optparse is more complex, so if we try to do a "small refactor" we'll keep the optparse complexity in our implementation even when using argparse.

I think it is better to create new classes as it allows us to more easily work on this. If we start to inject into the pre-existing optparse-based classes we will create merge conflicts very easily.

> One thing I had in mind for this refactor was separating option parsing from checking. Maybe we could create NamedTuples or dataclasses and give them to the checker pre-parsed, instead of parsing the options inside each checker (to avoid a breaking change, we could still parse the options if the object is not given?).

I'm not sure what you mean here.

I think the flow would be 1) register options on parser object, 2) parse options with parser object, 3) set namespace object as config namespace of linter object.
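
The three steps could be sketched as follows. This is only an illustration under stated assumptions: `Linter` is a hypothetical stand-in for the linter object, and the two options are examples rather than the real definitions.

```python
import argparse


class Linter:
    """Hypothetical stand-in for the linter; it only holds the parsed config."""

    def __init__(self) -> None:
        self.config = argparse.Namespace()


# 1) Register options on the parser object.
parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument("--logging-format-style", choices=["old", "new"], default="old")
parser.add_argument("--logging-modules", default="logging")

# 2) Parse the options with the parser object.
namespace = parser.parse_args(["--logging-format-style", "new"])

# 3) Set the namespace object as the config namespace of the linter object.
linter = Linter()
linter.config = namespace
```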

DanielNoord · 2021-12-22T16:43:12Z

This might be more difficult than I thought. I had a pretty good idea of how to not require argparse.ArgumentParser as subclass of PyLinter, but we need some way to store the ArgumentParser. Otherwise we would need to register all options every time we want to change any of them.

The problem is that even Run gets pickled, so I don't know a good place to register an ArgumentParser and store it. I'll keep thinking about this, but this is a very large limitation of argparse which I'm not sure we can overcome.

Pierre-Sassoulas · 2021-12-22T17:00:37Z

> set namespace object as config namespace of linter object.

I agree with 1) and 2) but I don't understand this part.

> we need some way to store the ArgumentParser. Otherwise we would need to register all options every time we want to change any of them.

Couldn't we store the result of the parsing, i.e. an argparse.Namespace?

DanielNoord · 2021-12-22T17:18:57Z

> I agree with 1) and 2) but I don't understand this part.

After the parsing is done (ideally in the Run init) the resulting namespace needs to be made accessible to the linter class, so it can be accessed there.

> Couldn't we store the result of the parsing, i.e. an argparse.Namespace?

There is a difference between the Namespace and the storage of arguments. The namespace only stores the current value, whereas the parser object also stores things like the default value and description. I think it's important that information is not lost after initial parsing of arguments is done, so we'll preferably need to save both the parser and the namespace. Currently I don't see a way to store the parser, as it seems everything within Run is pickled.
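
A small sketch of that difference, using an example option rather than a real pylint definition: after parsing, the namespace carries only the current value, while the default and the help text have to be read from the parser or from the action object that `add_argument` returns.

```python
import argparse

parser = argparse.ArgumentParser(prog="sketch")
action = parser.add_argument(
    "--logging-format-style",
    choices=["old", "new"],
    default="old",
    help="Format style used to check logging format strings.",
)

namespace = parser.parse_args(["--logging-format-style", "new"])

# The namespace only knows the current value...
current_value = namespace.logging_format_style
# ...while the default and the help text live on the parser side.
default_value = parser.get_default("logging_format_style")
help_text = action.help
```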

Pierre-Sassoulas · 2021-12-22T19:35:08Z

> the resulting namespace needs to be made accessible to the linter class.

Ok I agree with that too.

> I think it's important that information is not lost after initial parsing of arguments is done,

Why? Where could we use the default value and the description after the parsing is done?

DanielNoord · 2021-12-22T19:46:45Z

> Why? Where could we use the default value and the description after the parsing is done?

We would need it to "reparse" for subdirectories and for help messages. We would then need to recollect and reinstantiate all options for every subdirectory. That's not really good performance-wise.

Pierre-Sassoulas · 2021-12-22T23:08:22Z

Can't we parse all the options for each directory first and store them? (By the way, the configuration-by-directory feature was removed from flake8 because it was bringing "90% of the bugs"; we might want to reconsider that one.)

DanielNoord · 2021-12-23T13:53:32Z

> Can't we parse all the options for each directory first and store them? (By the way, the configuration-by-directory feature was removed from flake8 because it was bringing "90% of the bugs"; we might want to reconsider that one.)

That might be possible, but that would definitely come later.

However, I think I might have found a way to do it. We won't store information about the arguments this way, but I think we could design everything in such a way that that is not necessary.

DanielNoord · 2021-12-29T09:02:51Z

@Pierre-Sassoulas Would you consider using dill to pickle the PyLinter object for the multiprocessing handling? I'm basing this on this SO question which seems to suggest it should be able to fix our problem:
https://stackoverflow.com/questions/8804830/python-multiprocessing-picklingerror-cant-pickle-type-function

According to their README it has been downloaded 200M+ times via PyPI and supports almost all Python versions (Python >=2.7, !=3.0.*). Being able to pickle the input to multiprocessing.Pool would solve all of the issues and allow us to make a more intuitive system.

Some additional arguments for why I'm exploring fixing the multiprocessing issue instead of working around: we can't be sure that all plugins and tools (such as prospector) create a Run class. In our own tests we don't do so, but after some additional thoughts I don't think we should do stuff in the Run __init__ that is needed for PyLinter to work. (That's counterintuitive and bad design).
Thus, my third (fourth, fifth?) attempt at this doesn't really work. I hoped that by doing the argument parsing in Run we could skip the pickling.

Pierre-Sassoulas · 2021-12-29T09:19:03Z

Looks like dill does not have any dependencies of its own, and its 3-clause BSD license seems compatible with pylint. We would more than double their downloads per day by adding it as a dependency, but it's already used a lot.

> I don't think we should do stuff in the Run init that is needed for PyLinter to work. (That's counterintuitive and bad design).

Depends on what we want in PyLinter; right now it's doing almost everything. There's a problem with breaking changes in downstream plugins to consider for sure, but let's put that aside for a minute. If I understood what we're talking about and the current state of the code, PyLinter (each checker, even) is parsing the options right now? Parsing the options first, before launching anything and especially the parallel run, actually makes more sense imo. I.e. PyLinter could be the class that analyses a set of files given a set of parsed options, and have fewer responsibilities. Alternatively, we could include multiprocessing handling in PyLinter if we want PyLinter and not Run to be the entrypoint?

DanielNoord · 2021-12-29T09:28:33Z

> Looks like dill does not have any dependencies of its own, and its 3-clause BSD license seems compatible with pylint. We would more than double their downloads per day by adding it as a dependency, but it's already used a lot.

I'll start work on a PR implementing it for our parallel run then, so we can continue discussion about actually doing this in a separate PR 😄

> Depends on what we want in PyLinter; right now it's doing almost everything. There's a problem with breaking changes in downstream plugins to consider for sure, but let's put that aside for a minute. If I understood what we're talking about and the current state of the code, PyLinter (each checker, even) is parsing the options right now?

Since every checker is an OptionsProvider each plugin or checker added after initialising PyLinter can add new options that need to be read, "added to the parser" and then "parsed into the Namespace". The flow is as follows:

1. Initialise Run.
2. Initialise PyLinter and its associated parser.
3. "Add" all options from the PyLinter class.
4. "Parse" PyLinter options.
5. "Add" all options from default checkers and plugins.
6. "Parse" all options from default checkers and plugins.
7. "Add" all options from extra plugins.
8. "Parse" all options from extra plugins.
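
The add-then-parse stages above could be sketched roughly like this. It is only an illustration and assumes the parser can be kept around and reused between stages, which is exactly what the pickling problem makes hard. `add_and_parse`, the option names, and the sample command line are hypothetical.

```python
import argparse
from typing import List

# allow_abbrev=False keeps not-yet-registered options from matching by prefix.
parser = argparse.ArgumentParser(prog="sketch", allow_abbrev=False)
config = argparse.Namespace()
command_line: List[str] = [
    "--jobs", "2",
    "--logging-format-style", "new",
    "--plugin-option", "on",
]


def add_and_parse(option: str, default: str) -> List[str]:
    """Register one more option on the shared parser, then re-parse.

    Returns the arguments that are still unknown at this stage.
    """
    parser.add_argument(option, default=default)
    _, unknown = parser.parse_known_args(command_line, namespace=config)
    return unknown


leftover_after_linter = add_and_parse("--jobs", "1")                      # linter options
leftover_after_checkers = add_and_parse("--logging-format-style", "old")  # default checkers
leftover_after_plugins = add_and_parse("--plugin-option", "off")          # extra plugins
```

Each stage shrinks the list of unknown arguments, and the shared `config` namespace accumulates the values parsed so far.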

Note load_defaults in register_checker here:
https://github.com/PyCQA/pylint/blob/e815843293adf100662fe4e183c34f3040529a15/pylint/lint/pylinter.py#L754-L763

The problem is thus that at step 5 and step 7 we need to re-initialise an ArgumentParser if we can't store the parser as an attribute or subclass of PyLinter. Adding the parser to Run doesn't help, as for step 7 we don't have access to Run anymore.
I have tested what would happen if we had to re-initialise an ArgumentParser 100 times and it had a significant impact on runtime.

> Parsing the options first, before launching anything and especially the parallel run, actually makes more sense imo. I.e. PyLinter could be the class that analyses a set of files given a set of parsed options, and have fewer responsibilities. Alternatively, we could include multiprocessing handling in PyLinter if we want PyLinter and not Run to be the entrypoint?

We can't really do this as we need to parse the load-plugins option before knowing all the options we have to parse.
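
A hedged sketch of why the ordering matters: the full set of options is only known once `--load-plugins` has been read, so a first pass has to pick out that one option before the complete parser can be built. `PLUGIN_OPTIONS` and the plugin name are hypothetical; real plugins register their options through pylint's own machinery.

```python
import argparse
from typing import Dict, List

# Hypothetical registry: plugin name -> the extra options it contributes.
PLUGIN_OPTIONS: Dict[str, List[str]] = {
    "my_plugin": ["--my-plugin-level"],
}

command_line = ["--load-plugins", "my_plugin", "--my-plugin-level", "strict"]

# First pass: only look for --load-plugins and ignore everything else.
pre_parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
pre_parser.add_argument("--load-plugins", default="")
pre_namespace, _rest = pre_parser.parse_known_args(command_line)
plugins = [name for name in pre_namespace.load_plugins.split(",") if name]

# Second pass: now that the plugins are known, register their options too.
full_parser = argparse.ArgumentParser(prog="sketch", allow_abbrev=False)
full_parser.add_argument("--load-plugins", default="")
for plugin in plugins:
    for option in PLUGIN_OPTIONS.get(plugin, []):
        full_parser.add_argument(option)

namespace = full_parser.parse_args(command_line)
```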

tests/config/data/logging_format_interpolation_style.py

Pierre-Sassoulas

Looks good already. I only skimmed because I'm not an expert in custom argparse parsers, and even less so in optparse, so it's a hard review for me.

.pre-commit-config.yaml

pylint/config/utils.py

DanielNoord · 2022-03-26T22:05:09Z

@Pierre-Sassoulas Do you want others to review this as well before merging? @jacobtylerwalls possibly?

Pierre-Sassoulas · 2022-03-26T22:14:05Z

More reviews is always a good thing if we can afford the luxury :)

jacobtylerwalls

Terrific, thanks for moving this migration forward. A few questions, some for my own understanding.

pylint/config/config_initialization.py

pylintrc

tests/config/test_argparse_config.py

pylint/config/argument.py

pylint/config/arguments_manager.py

jacobtylerwalls · 2022-03-29T11:54:29Z

Thanks for answering my questions. Is now a good time to edit the branch history?

DanielNoord · 2022-03-29T11:55:15Z

I guess so. I'll do a rebase.

DanielNoord · 2022-03-29T12:06:21Z

@Pierre-Sassoulas Please rebase and merge if you're okay with this and the number of reviews 😄

Pierre-Sassoulas

Great refactor @DanielNoord!

DanielNoord added Enhancement ✨ Improvement to a component Configuration Related to configuration labels Dec 21, 2021

DanielNoord mentioned this pull request Dec 21, 2021

Create new Argument class for options #5152

Closed

2 tasks

DanielNoord marked this pull request as draft December 22, 2021 07:46

DanielNoord force-pushed the argparse-logging branch 3 times, most recently from c447d7b to b9b957a Compare December 22, 2021 08:59

DanielNoord marked this pull request as ready for review December 22, 2021 09:40

Pierre-Sassoulas changed the title ~~Create an Argument class and allow covertion of optdict into them~~ Create an Argument class and allow convertion of optdict into them Dec 22, 2021

DanielNoord marked this pull request as draft December 22, 2021 15:56

DanielNoord force-pushed the argparse-logging branch from b9b957a to 3266477 Compare December 22, 2021 16:07

Pierre-Sassoulas reviewed Dec 22, 2021


DanielNoord marked this pull request as ready for review December 30, 2021 11:55

DanielNoord force-pushed the argparse-logging branch from 74b1499 to d84b3dd Compare December 30, 2021 14:23

DanielNoord mentioned this pull request Feb 23, 2022

Add support for per-directory configuration #5833

Closed

6 tasks

DanielNoord force-pushed the argparse-logging branch from d84b3dd to a23c06b Compare March 24, 2022 23:07

DanielNoord commented Mar 24, 2022


tests/config/data/logging_format_interpolation_style.py (outdated, resolved)

DanielNoord requested a review from Pierre-Sassoulas March 24, 2022 23:16

Pierre-Sassoulas reviewed Mar 25, 2022


.pre-commit-config.yaml (outdated, resolved)

pylint/config/utils.py (resolved)

DanielNoord requested a review from Pierre-Sassoulas March 25, 2022 13:30

Pierre-Sassoulas approved these changes Mar 25, 2022


Pierre-Sassoulas added this to the 2.14.0 milestone Mar 25, 2022

jacobtylerwalls reviewed Mar 27, 2022


DanielNoord requested a review from jacobtylerwalls March 29, 2022 06:45

jacobtylerwalls approved these changes Mar 29, 2022


DanielNoord added 7 commits March 29, 2022 13:57

Add argparse.Namespace to list of generated members

846972d

Create _Argument

0250ac8

Create _convert_option_to_argument

7fc7ee0

Create _ArgumentsManager

a8882c2

Use config initialization of _ArgumentsManager

19803ce

Allow BaseChecker to register on a _ArgumentsManager

e081dae

Use the argparse config handler in logging.py and add tests

ba01da1

DanielNoord force-pushed the argparse-logging branch from 0e82e0b to ba01da1 Compare March 29, 2022 12:05

Pierre-Sassoulas approved these changes Mar 29, 2022


Pierre-Sassoulas merged commit 0bc45e9 into pylint-dev:main Mar 29, 2022

DanielNoord deleted the argparse-logging branch March 29, 2022 12:45
