Consider commonality between Validation and Completions (possibly expanded Help and Diagramming) #2458

KathleenDollard · 2024-07-21T18:31:39Z

tl;dr;

Other than Description, most of the concepts behind validation are shared with completions, and in the future expanded help and possibly diagramming. This challenges our current model.

The decision we made to isolate concerns to allow separate replacement and update is great, but I am struggling to find the best way to store and access additional information on options and arguments. I am not yet seeing a similar issue on non-value-bearing symbols (commands).

I encountered this as I considered validation, and just could not get help and completions out of my head.

Problem

Types are insufficient to convey the richness of information for values. Except for Description, most of this richer information doesn't align well with a single subsystem and is used at least for validation, completion, and probably help and diagramming.

The problem is easiest to understand with files and directories, which is where the current main resorts to adding properties such as FileExists on the option/argument, even though they are used only in limited cases. The problem extends to other important cases:

- FileExists, etc.
- Integers that must be positive, non-zero, etc.
- Strings that must not be empty (nullable we can model with type)
- One-of lists that are not enums
- Upper and lower bounds of integers/dates
- Weird rules like only powers of two or even integers
- Grouped values (all/one of)
- By intuition: complex types, if we supported them, would have several characteristics, although these may all be shared with grouped values
- Things I have not thought of and future things we cannot think of

All of these would have similar rules for completions and validations and should allow completions to work even if there are no validations (verify) and vice versa.

Also, it would be highly desirable to include these restrictions in expanded help (man page replacement) and auto-generated docs. And if we have a more complete diagramming tool or a whatif tool, it would be highly desirable to have this information.

I am stuck with the intuition that these are characteristics of the desired value symbol, not intrinsic to the use of any subsystem, although there are other values that are intrinsic to subsystems.

What is a characteristic?

All of these characteristics share several things:

- Id (string)
- Data for the declaration (such as upper/lower bound)
- Description (used by expanded help, probably a delegate)
- A validation rule (probably a delegate)
- A completions rule (probably a delegate)

Generally, only one would apply to a value symbol, except for grouping/complex types. Assuming grouping uses the same approach, a value symbol would need to allow multiple characteristics. There would also be niche cases most easily modeled by multiple characteristics, such as even non-zero numbers or file names shorter than 256.
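As a rough illustration of the list above, here is a minimal C# sketch of a characteristic as a single object whose data drives the description, validation, and completions together. Every name in it (`Characteristic<TValue>`, `Characteristics.OneOf`, the delegate shapes) is hypothetical and not part of System.CommandLine, and errors are plain strings rather than a real error type to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: none of these names exist in System.CommandLine.
public sealed record Characteristic<TValue>(
    string Id,
    IReadOnlyList<object> Data,
    Func<IReadOnlyList<object>, string> Describe,
    Func<TValue, IReadOnlyList<object>, IEnumerable<string>> Validate,
    Func<IReadOnlyList<object>, IEnumerable<TValue>> Complete);

public static class Characteristics
{
    // A "one of" list that is not an enum: the same data is used for
    // the description, the validation rule, and the completions rule.
    public static Characteristic<string> OneOf(params string[] allowed) =>
        new(
            Id: "OneOf",
            Data: allowed,
            Describe: data => "One of: " + string.Join(", ", data),
            Validate: (value, data) => data.Contains(value)
                ? Enumerable.Empty<string>()
                : new[] { $"'{value}' is not one of: {string.Join(", ", data)}" },
            Complete: data => data.Cast<string>());
}
```

A value symbol that needs several characteristics (for example, even and non-zero) could hold a list of these and run each `Validate` delegate in turn.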

What is the difference between a characteristic and a validation rule?

A simple validation rule does not affect completions. A characteristic affects both completions and validation.

Why are grouping and complex types relevant?

The difference between a grouping and a complex type is that a complex type is resolved during type conversion and a grouping is not resolved within parsing, but might be in the application. Either can result in rules such as "if this is present, this other thing must also be present" or "if this is present, this other thing cannot be present". This is a complex space that we have not designed, but it will impact both validation and completion.
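To make the two rule shapes concrete, here is a small C# sketch of how "requires" and "excludes" rules might feed both validation and completion. This space has not been designed, so the types (`GroupRule`, `GroupRuleKind`, `GroupRules`) and the use of plain symbol names are purely illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule kinds; the issue does not define these names.
public enum GroupRuleKind { Requires, Excludes }

public sealed record GroupRule(string Symbol, GroupRuleKind Kind, string Other);

public static class GroupRules
{
    // Validation: report every rule violated by the set of symbols present.
    public static IEnumerable<string> Validate(
        IEnumerable<GroupRule> rules, ISet<string> present) =>
        from rule in rules
        where present.Contains(rule.Symbol)
        let otherPresent = present.Contains(rule.Other)
        where rule.Kind == GroupRuleKind.Requires ? !otherPresent : otherPresent
        select rule.Kind == GroupRuleKind.Requires
            ? $"{rule.Symbol} requires {rule.Other}"
            : $"{rule.Symbol} cannot be used with {rule.Other}";

    // Completion: hide candidates excluded by a symbol that is already present.
    // Rules are treated as one-directional here; a real design might make
    // "excludes" symmetric.
    public static IEnumerable<string> FilterCandidates(
        IEnumerable<GroupRule> rules, ISet<string> present, IEnumerable<string> candidates)
    {
        var excluded = rules
            .Where(r => r.Kind == GroupRuleKind.Excludes && present.Contains(r.Symbol))
            .Select(r => r.Other)
            .ToHashSet();
        return candidates.Where(c => !excluded.Contains(c));
    }
}
```

The point of the sketch is only that one rule declaration can serve both consumers: validation reports the conflict after parsing, while completion avoids offering it in the first place.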

Possible approaches

The benefits and drawbacks are incomplete, and simply intended to start discussion.

If we do nothing, I think we are selecting Approach 3, although there may be other ways of thinking about the problem in relation to our current design.

Approach 1: Declaration on `ValueSymbol` with behavior in subsystem

Create an API, probably using well-known strings, which allows an open set of characteristics. Add an `IEnumerable<(string, IEnumerable<object>)>` to `ValueSymbol` (options and arguments).

Individual subsystems would determine what, if anything, to do with this identifier. This may result in a `Dictionary<string, Func<[appropriate signature]>>` in each subsystem.

I think we can still make this lazy.
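A minimal C# sketch of what Approach 1 might look like: the symbol only declares string-keyed characteristics with data, and each subsystem keeps its own dictionary of handlers. `ValueSymbolSketch`, `ValidationSketch`, `CompletionSketch`, and the `"Range"` id are stand-ins invented for illustration, not the real `ValueSymbol` or subsystem types, and the handler signatures are just one possible choice for the "appropriate signature".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real ValueSymbol; only the proposed property is modeled.
public sealed class ValueSymbolSketch
{
    public ValueSymbolSketch(string name) => Name = name;
    public string Name { get; }

    // Open set of characteristics keyed by well-known strings.
    public List<(string Id, IEnumerable<object> Data)> Characteristics { get; } = new();
}

// Each subsystem owns its handlers and ignores ids it does not know.
public sealed class ValidationSketch
{
    private readonly Dictionary<string, Func<object, IEnumerable<object>, IEnumerable<string>>> _rules = new()
    {
        ["Range"] = (value, data) =>
        {
            var bounds = data.Cast<int>().ToArray();
            var v = (int)value;
            return v >= bounds[0] && v <= bounds[1]
                ? Enumerable.Empty<string>()
                : new[] { $"Value {v} must be between {bounds[0]} and {bounds[1]}" };
        },
    };

    public IEnumerable<string> Validate(ValueSymbolSketch symbol, object value) =>
        symbol.Characteristics
            .Where(c => _rules.ContainsKey(c.Id))
            .SelectMany(c => _rules[c.Id](value, c.Data));
}

public sealed class CompletionSketch
{
    private readonly Dictionary<string, Func<IEnumerable<object>, IEnumerable<string>>> _rules = new()
    {
        ["Range"] = data =>
        {
            var bounds = data.Cast<int>().ToArray();
            return Enumerable.Range(bounds[0], bounds[1] - bounds[0] + 1).Select(i => i.ToString());
        },
    };

    public IEnumerable<string> Complete(ValueSymbolSketch symbol) =>
        symbol.Characteristics
            .Where(c => _rules.ContainsKey(c.Id))
            .SelectMany(c => _rules[c.Id](c.Data));
}
```

Note that the two `"Range"` handlers are written independently, which is exactly the coordination drawback listed below: nothing forces the validation and completion interpretations of the same data to agree.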

Benefits
- Information is available across our subsystems and others, with no concern for what other subsystems are in use
- Characteristics are always entered in the same way by CLI authors which avoids guessing at what subsystem holds a particular piece of data
- Avoids loading subsystems just to include characteristics
- Information is available in core mode, allowing alternate subsystem layers to use the data and easier transition between subsystem layer tools
Drawbacks
- Could be seen as blurring the core/subsystem boundary
- Adds complexity to the core
- Strings may not be popular here
- Behavior in different subsystems would not be coordinated - the validation and completion actions may be difficult to keep in sync
- It may be confusing if we model some subsystem characteristics via a common location and some via a subsystem

Approach 2: Declaration and behavior in `ValueSymbol`

Very similar to approach 1, except the ValueSymbol carries all the information. This would only work if we believe that we have a full set of needs:

- Does a value match this characteristic?
- What are the available values that match this characteristic?
- How do I describe the characteristic to an end user?

And also that the delegates are obvious:

- Validation: `Func<ParseResult, TValue, IEnumerable<object>, IEnumerable<CliError>>`
- Completion: `Func<ParseResult, IEnumerable<object>, IEnumerable<TValue>>`
- Description: `Func<IEnumerable, string>`

Benefits and drawbacks:
- Benefits
  - Information is available across our subsystems and others, with no concern for what other subsystems are in use
  - Characteristics are always entered in the same way by CLI authors which avoids guessing at what subsystem holds a particular piece of data
  - Avoids loading subsystems just to include characteristics
  - Information is available in core mode, allowing alternate subsystem layers to use the data and behavior and easier transition between subsystem layer tools
  - Behavior in different subsystems would be coordinated - the validation and completion actions would be easier to keep in sync
- Drawbacks
  - Could be seen as blurring the core/subsystem boundary
  - Adds complexity to the core
  - Strings may not be popular here
  - It may be confusing if we model some subsystem characteristics via a common location and some via a subsystem

Approach 3: Use the current design with guidance on where characteristics belong

We could use the current system, which already has a mechanism for storing this data as is, but declare what goes where in guidance. I don't know what guidance we would give since almost everything that will be used in completions would also be used in validation.

If we store in validation, we would load the validation subsystem for completions, almost always.

If we store in completions, we would load completions, but that may be less of an issue as we are on a less critical path.

- Benefits
  - Does not blur the core/subsystem boundary
  - Does not add complexity to the core
  - Strings are not needed
- Drawbacks
  - Information is not directly available across our subsystems and others
  - CLI authors need to know what subsystem holds a particular characteristic
  - Other subsystems would need to map behavior to the other subsystem's characteristic
  - Different replacement subsystems may define the data of a declaration in incompatible ways, blocking the ability to independently replace orthogonal subsystems
  - Subsystems will often be loaded just to include characteristics
  - Information will not be available in core mode
  - Behavior in different subsystems would not be coordinated - the validation and completion actions might be difficult to keep in sync

Approach 4: Add a new subsystem for characteristics, possibly combine with the Value subsystem

This is similar to Approach 3 in that characteristics are isolated to the subsystem layer, and to Approach 1 (or 2 depending on design) in that the characteristic declaration is in a single location.

- Benefits
  - Information is available across our subsystems and others, with the value subsystem always in use
  - Does not blur the core/subsystem boundary
  - Characteristics are always entered in the same way by CLI authors which avoids guessing at what subsystem holds a particular piece of data
  - Does not add complexity to the core
- Drawbacks
  - The subsystem would always be loaded, and if it is combined with the value subsystem, the value subsystem would be loaded even when it otherwise would not be (completions and help)
  - Information is not available in core mode
  - A string ID will probably still be needed
  - Behavior in different subsystems would not be coordinated - the validation and completion actions might be difficult to keep in sync
  - It may be confusing if we model some subsystem characteristics via a common location and some via a generalized subsystem

Balkoth · 2024-07-22T14:19:11Z

I thought that resetting with Powderhouse would finally bring the project across the finish line. But you keep posting more and more issues under this label and to me there seems no real direction and progress on System.CommandLine. There seems to be no established baseline of what has to absolutely be in the project to finally get a release out.

KathleenDollard · 2024-07-24T13:07:37Z

We are designing in the open. Yes. We are not making the progress we wanted due to a variety of personal issues on the all-volunteer team. Yes. We are also working on strategies to resolve that, but it is safe to say we will not be releasing soon.

The feature baseline is pretty straightforward here - an extensible model for completions and validation and replacement and expanded possibilities for FileExists. All of which are in the current main.

These are reasonable comments, but it's turned out to be a slightly unfortunate location. These were the notes for a design meeting we had yesterday and this has spawned a couple of issues that are much more clear on the design we're going forward with. As a result, I will be closing this issue. I wanted to clarify that closing it is unrelated to your comments.

KathleenDollard · 2024-07-24T13:09:03Z

Superseded by #2464

KathleenDollard added the Powderhouse Work to isolate parser and features label Jul 21, 2024

KathleenDollard mentioned this issue Jul 24, 2024

Updated Subsystem data storage #2464

Open

KathleenDollard mentioned this issue Aug 17, 2024

Powderhouse Validation #2476

Open
