Declarator Docs should be limited in scope #438

lizmat · 2024-09-13T12:58:59Z

See current definition and the original thoughts.

This issue is written out of frustration of trying to implement a more sensible way of dealing with declarator docs in RakuDoc v2, and trying to implement a sensible "safe" renderer.

Regardless of the implementation effort that has gone into this, and what still needs to go into it, I wonder how many developers really like this feature, and what they would think if this feature were removed by default in 6.e.

lizmat · 2024-09-13T13:03:47Z

One reason for the existence of declarator docs, is apparently the perceived usefulness of being able to provide external documentation for code with the code itself.

In my opinion, internal and external documentation are two very different beasts. And declarator docs make it way too easy to mix the two. Thereby either creating documentation with limited usefulness for the maintainer, or for the user. Or in the worst case for both.

lizmat · 2024-09-13T13:06:40Z

Another reason for the existence of declarator docs, is that with a refactoring, it would be easier to update the external documentation as well because it is closer to the code.

In my opinion, a refactor would probably also require explaining the before and after situation in the user documentation. And this would just add clutter to the code for a maintainer.

lizmat · 2024-09-13T13:13:07Z

The third reason for the existence of declarator docs, is that it would make it easier for the developer to maintain the internal and external documentation.

In my experience, that is NOT true. Personally, I have a mindset for writing user documentation. And one for development. They rarely are active for me at the same time. Sometimes I will write user documentation first, as a sort of design of the features.

And sometimes the development starts first, iteratively, without a clear view of the final stage. Only once it gets to a beta stage would user documentation first be written, with a different mindset from developing: attempting to look at the code from a user's point of view.

lizmat · 2024-09-13T13:59:23Z

thoughts? @finanalyst @thoughtstream @niner @ab5tract

lizmat · 2024-09-14T09:28:53Z

undock (compile a module so that it no longer carries its dock load inline inside the bytecode instead of just a stub/link)

Pretty sure a pragma could be devised for that.

conversely, redock (recompile undocked compiled code to a docked equivalent that inserts the downloaded or locally stored but external dock source)

What would be the purpose of that?

I think that generally the problem is really that variables like $=pod and $=rakudoc are only available from the program itself, and cannot be read externally?

thoughtstream · 2024-09-15T01:04:27Z

Way back in the Jurassic, when I designed the original Pod6,
I was entirely neutral about the idea of declarator blocks.
I was asked to add the feature, but I have never used it myself,
nor do I particularly like the idea of intermixing documentation and code
(see also Perl Best Practices, Chapter 7).

So I would have no problem if we removed the entire concept of declarator blocks.

However, before we contemplate that step, perhaps it would be useful to know
how many people (if any) are actually using the feature – and the related .WHY method – in their code.
I tried to search for .WHY, #|, and #= on raku.land, but seemed to get mostly just
false positives. If anyone knows a better way to determine how widely used the feature is,
that would be a very useful contribution to this discussion.

CIAvash · 2024-09-15T08:00:48Z

I use them for documentation, and also use tools for generating other formats (Markdown for example) from them.
Although the tools are not perfect right now and sometimes need manual editing.

An example from one of my modules:
Raku: https://codeberg.org/CIAvash/APISports-Football/src/branch/main/lib/APISports/Football.rakumod
Markdown: https://codeberg.org/CIAvash/APISports-Football/src/branch/main/README.md

ab5tract · 2024-09-15T11:14:16Z

@lizmat Can you expand on some of the implementation hurdles you have been facing? That might help to scope the discussion.

For example, if #= outside of signatures is causing trouble, I would wholeheartedly endorse removing that option. On the other hand, if it's not a source of problems then it is probably safe to keep.

Without having any context, I also wonder whether it could be that this particular piece of RakuDoc doesn't belong in the RakuDoc processing code at all and should rather be part of the "regular" grammar (where we can also add caveats such that most (or even all, if there are circularity issues) of the RakuDoc syntax is unavailable in the "decks" (nice phrasing @raiph!)).

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call. It's also .WHY I don't personally see much value in the "undocking" functionality described by @raiph -- the docs are there for the user's benefit, so taking it out of the code on its way to distribution doesn't appeal to me.

lizmat · 2024-09-15T11:16:06Z

@CIAvash thanks for your example!

But to me they prove exactly why declarator docs are bad. In the ATTRIBUTES section:

has APISports::Football::HTTPClient $.http_client

An object for making requests to api-football.com

What does the APISports::Football::HTTPClient mean to a user of your module? It isn't documented anywhere else. It's implementation detail leaking out, and as such cluttering the user documentation. Same for TwoChars, AtMost2Digits, MatchStatus.

Also, why would a user be interested in how the signature of a method is implemented?

method matches(
    Bool :h(:$http_body),
    *%params (Int :$id where { ... }, Str :$live where { ... }, Date(Any) :$date, Int :$league where { ... }, Int :$season where { ... }, Int :$team where { ... }, Int :$last where { ... }, Int :$next where { ... }, Date(Any) :$from, Date(Any) :$to, Str :$round, MatchStatus(Str) :$status, Str :$timezone where { ... })
) returns Mu

To me, this really feels like documentation for a maintainer, not for a user.

CIAvash · 2024-09-15T11:35:36Z

Also, why would a user be interested in how the signature of a method is implemented?

That's why I mentioned the tools, they have a lot of room for improvement. Tools should extract the important parts of the code.

What does the APISports::Football::HTTPClient mean to a user of your module?

It probably should link to the documentation of the class, if it is usable by the user.

So I think the problem lies in the tools.

ab5tract · 2024-09-15T11:41:02Z

Also, why would a user be interested in how the signature of a method is implemented?

method matches(
    Bool :h(:$http_body),
    *%params (Int :$id where { ... }, Str :$live where { ... }, Date(Any) :$date, Int :$league where { ... }, Int :$season where { ... }, Int :$team where { ... }, Int :$last where { ... }, Int :$next where { ... }, Date(Any) :$from, Date(Any) :$to, Str :$round, MatchStatus(Str) :$status, Str :$timezone where { ... })
) returns Mu

To me, this really feels like documentation for a maintainer, not for a user.

It can be a fuzzy line, to be sure. But considering that, post-RakuAST, the where blocks will be fully documented, I think there is significant value to sharing the signature. Also, is the signature appearing in the documentation even related to the declaration syntax?

FWIW, most other from-source-generated documentation I've encountered takes a fair amount of space for displaying to the user -- for a module's API, this would be a developer -- what arguments a given routine will take.

has APISports::Football::HTTPClient $.http_client
What does the APISports::Football::HTTPClient mean to a user of your module?

It's a publicly accessible part of the API -- why would it not be relevant to the user?

lizmat · 2024-09-15T11:53:54Z

For example, if #= outside of signatures is causing trouble, I would wholeheartedly endorse removing that option. On the other hand, if it's not a source of problems then it is probably safe to keep.

The problem with #= is that it needs to be attached to the last declaration. Now, from a grammar point of view, this can be tricky, because all comments, including declarator docs, are considered to be whitespace internally. For instance:

class  #= foo
  A    #= bar
{      #= baz
    ...
}      #= zippo

Does the declaration start with class? Or after the name? Or after the opening {? Or after the closing }?

In the Raku grammar, only baz will be attached. foo and bar will generate a warning about not being able to find a declarator. The zippo will be silently ignored.

In the legacy grammar these three would all be silently ignored.

Now clearly this is an artificial example. But when you realize that parameters and blocks can have declarator docs:

sub foo(
  Int $a where { $_ > 1 }  #= foo
) { }

In the Raku grammar, the "foo" is attached to the where { } block, not to the parameter. In the legacy grammar, the "foo" isn't attached to anything.

My point: the "last declarator" rule is not very transparent.

FWIW, the "next declarator" isn't either.

#| foo
sub
#| bar
foo(Int $a) { }

The "foo" is attached to the sub, the "bar" is attached to the parameter (both in legacy grammar and Raku grammar).

Again, this is artificially constructed, but I hope it shows the kind of hoops the grammar needs to jump through to get something, because all comments are just whitespace.

And you can argue, this is a case of DIHWIDT, but I doubt whether a developer will check whether the documentation they thought they added, is showing up at the right place, or at all.

In other words: it is all too magic.
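One way a developer could check where a declarator doc actually ended up is to ask the objects themselves via .WHY. The following is a minimal sketch, not part of the original discussion: the helper name report-docs is hypothetical, and no output is shown because, as described above, the attachment can differ between the legacy grammar and the Raku grammar.

```raku
#| Fight mechanics
sub duel(
    $a,  #= The first magician in the duel
    $b
) { }

# Hypothetical helper: print the leading and trailing declarator docs
# that are attached to the given object, if any.
sub report-docs(Str $label, Mu \obj) {
    with obj.WHY {
        say "$label: leading={ .leading // '(none)' } trailing={ .trailing // '(none)' }";
    }
    else {
        say "$label: no declarator doc attached";
    }
}

report-docs 'sub duel',    &duel;
report-docs 'parameter a', &duel.signature.params[0];
report-docs 'parameter b', &duel.signature.params[1];
```

Running this under both frontends (for example with and without RAKUDO_RAKUAST=1) would show whether a given comment landed where it was intended.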

lizmat · 2024-09-15T11:59:05Z

@CIAvash

What does the APISports::Football::HTTPClient mean to a user of your module?
It probably should link to the documentation of the class, if it is usable by the user. So I think the problem lies in the tools.

How can the tool determine whether something is supposed to be usable by the user? Most developers don't put a my in front of their class definitions, which means it is a publicly visible class. But that doesn't mean it is supposed to be used by itself? So this would require more discipline in the developer to mark these classes as internal. Only then could a tool decide not to mention the type in that argument.

ab5tract · 2024-09-15T12:10:32Z

Thank you for clarifying. The current rules do indeed seem way too loosey-goosey for even our project's threshold of implementor-torment :)

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters. Even package "decks" could likely be tossed out without much impact. Any usage outside of definitions would be ignored, ie:

# this does indeed seem like a step too far to me
#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}

In JavaDoc, they get to add some module-implementor torment by forcing awkward syntax that goes above the routine definition where the signature parameters need to be individually maintained and their types spelled out by hand. It also visually breaks the flow when reading through code.

But when it comes to IDE tooltip integration or just plain using the generated HTML documentation to actually work with what the code provides, it's immensely helpful.

I think the "decks" are a huge step above this and deliver the same functionality with way less maintenance and visual disruption.

CIAvash · 2024-09-15T12:14:26Z

How can the tool determine whether something is supposed to be usable by the user?

Probably needs to be done using configs? Maybe one config for the whole document and individual configs if some part of the code needs it.

Rustdoc for example uses some configs for hiding documentation and doing other things. More info on Rustdoc

lizmat · 2024-09-15T12:45:59Z

re:

#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}

This "this form of cruelty will be ignored, worry'd, or sorry'd" attaches to the block of the why call. Why you may ask? And knowing the Raku grammar a bit, it will be very hard to fix. Because, as I said: it's whitespace. And apart from the fact that in the grammar this whitespace is traversed multiple times (hence a quite elaborate system of handling that declarator doc only once), during the parsing of Raku code, it is quite unclear where things need to be attached to.

lizmat · 2024-09-15T12:48:21Z

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call.

This is a different issue. In a REPL or an IDE, linking from an object to the appropriate documentation, should be a separate project. Putting the docs as declarator docs in the core, would not be a solution.

ab5tract · 2024-09-15T12:49:06Z

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call.

This is a different issue. In a REPL or an IDE, linking from an object to the appropriate documentation, should be a separate project. Putting the docs as declarator docs in the core, would not be a solution.

I think that is a matter for discussion, rather than a settled fact.

EDIT: But it's literally a tangent. Let's not worry about it here or now.

ab5tract · 2024-09-15T13:03:40Z

re:
#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}
This "this form of cruelty will be ignored, worry'd, or sorry'd" attaches to the block of the why call. Why you may ask? And knowing the Raku grammar a bit, it will be very hard to fix. Because, as I said: it's whitespace. And apart from the fact that in the grammar this whitespace is traversed multiple times (hence a quite elaborate system of handling that declarator doc only once), during the parsing of Raku code, it is quite unclear where things need to be attached to.

I'm a bit confused, sorry. My proposals were:

A) process "decks" in the grammar differently than other RakuDoc syntax
B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

Maybe A is not possible. But B seems to preclude what you are saying in your response? How would it get attached if the implementation is designed to ignore, complain, or outright die when such an attachment is attempted?

lizmat · 2024-09-15T13:09:21Z

Getting back to @CIAvash's example:

What I miss in the current RakuDoc, is a simple way to render the signature of a subroutine or a method without any additional comments.

Something like:

sub frobnicate(Int:D $frobnicatee, :$hammer) { ... }
...
=begin rakudoc
=head2 :signature<&frobnicate>
The C<frobnicate> subroutine frobnicates its positional argument, possibly hammering it with the C<:hammer> named argument.

that would render to something like (in markdown):

## subroutine "frobnicate"
* required positional argument #1: `Int`
* optional named argument: `hammer`
The `frobnicate` subroutine frobnicates its positional argument, possibly hammering it with the `:hammer` named argument.
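The :signature<...> form above is only a proposal and does not exist in RakuDoc. As a rough sketch of what a renderer could do with today's introspection, the bullets can be derived from the Signature and Parameter objects. The helper name signature-bullets is hypothetical, and the exact wording of each bullet (including how the type name is printed) is an assumption.

```raku
sub frobnicate(Int:D $frobnicatee, :$hammer) { ... }

# Hypothetical helper: produce one Markdown bullet per parameter
# of the given routine, using only Signature/Parameter introspection.
sub signature-bullets(&code) {
    my $pos = 0;
    gather for &code.signature.params -> $p {
        my $need = $p.optional ?? 'optional' !! 'required';
        if $p.named {
            my $names = $p.named_names.join('|');
            take "* $need named argument: `$names`";
        }
        else {
            my $n    = ++$pos;
            my $type = $p.type.^name;
            take "* $need positional argument #$n: `$type`";
        }
    }
}

say "## subroutine \"{ &frobnicate.name }\"";
.say for signature-bullets(&frobnicate);
```

Because this works from the compiled routine rather than from comments, it sidesteps the question of where a declarator doc is attached, at the cost of needing the code to be compiled first.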

lizmat · 2024-09-15T13:14:03Z

@ab5tract sorry, got confused / distracted.

A) process "decks" in the grammar differently than other RakuDoc syntax

It already does? Because declarator docs can appear at any place in the code where there is whitespace. They can not appear in whitespace inside "docks".

B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

That would severely limit usefulness, especially when documenting CLI arguments in scripts.
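For readers unfamiliar with the CLI use case mentioned here, this is a small sketch of it. The script and its parameters are made up for illustration; the mechanism (declarator docs on MAIN and its parameters feeding the generated usage message) is existing Raku behaviour.

```raku
#| Frobnicate a file from the command line
sub MAIN(
    Str  $file,           #= file to frobnicate
    Int  :$length = 24,   #= length needed for frobnication
    Bool :$verbose        #= print progress while working
) {
    say "frobnicating $file with length $length" if $verbose;
}
```

Saved as, say, frobnicate.raku (a hypothetical name) and run with --help or with invalid arguments, the generated usage message includes these descriptions next to each argument, which is why restricting #= too far would hurt scripts.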

ab5tract · 2024-09-15T13:23:36Z

@ab5tract sorry, got confused / distracted.

A) process "decks" in the grammar differently than other RakuDoc syntax

It already does? Because declarator docs can appear at any place in the code where there is whitespace. They can not appear in whitespace inside "docks".

For some reason it seems that you have missed that all of my suggestions are around significantly curtailing the appropriate locations for #| and #=.

What I'm proposing for the grammar (and which I appreciate might not be possible) is to not treat these as whitespace. They would specifically be optional captures for routines (#|) and parameters (#=).

B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

That would severely limit usefulness, especially when documenting CLI arguments in scripts.

Please re-read my earlier message. In item B, I was referring specifically to #|. The behavior of #= would be restricted to refer to parameters only (so no blocks, no random variables in random scopes, nothing besides to the right of a parameter declaration inside of a signature).

Also, I don't see how there is any usefulness in any context under your proposal to remove them entirely?

lizmat · 2024-09-15T14:19:57Z

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters.

@ab5tract I assume I missed the meaning of "scoped" here?

ab5tract · 2024-09-15T14:42:55Z

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters.

@ab5tract I assume I missed the meaning of "scoped" here?

I meant in the sense of narrowing down, in this case meaning "only define the concept/allow the parser to accept in this narrowed conception of #| and #=".

lizmat · 2024-09-15T14:47:04Z

@ab5tract OK, gotcha now.

Maybe a first step could be simpler:

#| is only allowed at the start of a line
#= is only allowed when it is not at the start of a line
extended forms #|( ... ) and #=( ... ) to be disallowed

finanalyst · 2024-09-15T15:58:31Z

Sorry for the delay in responding. Just seen this thread. Working from phone, so I hope comment correctly attached to thread. Rather than respond to direct questions, here are some considerations:

- The intimate connection between Rakudoc and code means Rakudoc can only be handled by Raku and a BEGIN expression may be run. For this reason, hosts like GitHub and raku.land will not generate HTML on the fly from RakuDoc.
- RakuDoc V2 explicitly distinguishes between elements for IDEs and elements for documentation. Decks (using abstract's term - sorry for autocorrect) are part of the IDE division. However, RakuDoc also explicitly states that semantic blocks should be available for other tools, and they can be 'moved' around (e.g. declared at the beginning of a source but rendered later).
- One of the meta aims of RakuDoc IMHO seems to have been to provide a mechanism to handle many of the suggestions of literate programming. This implies a close connection between the code and the documentation it concerns. I have not seen any development of this idea though.
- I have used #= and #| A LOT!!! Since I use Comma and Comma pops up the explanation of a variable to which #| is attached, I find this quite useful as I develop a distribution. But that is a use of RakuDoc inside an IDE.
- I find the ability to attach #= to variables inside a sub MAIN very!!!!! useful. I'll forget the parameters for CLI and they are automatically available.
- But I am frustrated by the limitations of #|. An important structure for me is a hash and I use them in config situations. I cannot attach a #| to a key. I would like to generate some documentation that extracts comments about keys of a hash, because I comment new keys, but can't remember them all.
- While =finish has not been mentioned yet, it is also a part of RakuDoc and is code oriented rather than documentation oriented. I have found =finish to be useful particularly in tests, where sample input can be placed.

Some questions.

- Can we separate completely everything inside a =rakudoc block so that it is available before any bytecode has been generated?
  - This will mean that we will have to remove A<> markup from being able to provide the value of a Raku variable. This is a part of the spec of both POD6 and RakuDoc V2 but no one has ever used it.
- If a deck is used, then its values are only available after bytecode has been created, and so would not be available for a renderer.

Please ask questions if this is not clear and I'll respond when I get back online later today.

Richard


thoughtstream · 2024-09-16T02:27:44Z

There have been a lot of valid and useful points made on both sides of the argument.
It seems clear to me that:

Users do want to be able to annotate elements of their code in a way that is subsequently accessible to both that code and to other tools (such as an IDE).
Using special forms of a comment to do that was probably a mistake, given that comments are treated as just an unusual kind of whitespace by the compiler.
We may be neglecting a better, already available mechanism that might solve both these problems.

Because it occurs to me that Raku already has a mechanism for associating information
with a value or a declared object: traits.

So, instead of specifying that two particular kinds of comment have special meaning
in a limited range of contexts, why would we not just specify that there is a docs trait
(or perhaps why or nb or desc):

    class Magician docs 'Base class for magicians' {
        has Int $.level;
        has Str @.spells;
    }

    sub duel docs ('Fight mechanics', 'Magicians only, no mortals') (
        Magician $a  docs 'The first magician in the duel',
        Magician $b  docs 'The second magician in the duel',
    ) {
        ...
    }

    my $mage  docs 'A magician of level 2 or above';

    say Magician.WHY;       # OUTPUT: «Base class for magicians»
    say &duel.WHY.leading;  # OUTPUT: «Fight mechanics»
    say &duel.WHY.trailing; # OUTPUT: «Magicians only, no mortals»

And perhaps even a docs operator to cater to @finanalyst’s desire for annotated hash keys:

    my %config =
        'size'  docs 'Max size of entry'  =>  42,
        'limit' docs 'Apply limiting'     =>  True,
        'rand'  docs 'Randomizes lookup'  =>  True,
        'etc'   docs 'Et cetera'          =>  'et cetera';

It’s not as convenient to the coder as #| and #= declarator blocks,
but it would probably be a heck of a lot easier to implement.
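The docs keyword above is proposed syntax and does not exist. Something close to it can be approximated today with the existing `is` trait mechanism, which may help in judging the idea. This is only a sketch under that assumption: the role name Documented, the trait name docs and the .docs accessor are all hypothetical, and it covers routines only.

```raku
# Hypothetical role that carries the documentation string.
role Documented {
    has Str $.docs is rw;
}

# Hypothetical trait: `is docs('...')` on a routine mixes in the role
# at compile time and stores the string on the routine object.
multi sub trait_mod:<is>(Routine:D $r, Str:D :$docs!) {
    $r does Documented;
    $r.docs = $docs;
}

sub duel($a, $b) is docs('Fight mechanics') { }

say &duel.docs;
```

Unlike #| and #=, the trait is part of the declaration's syntax, so there is no ambiguity about what it attaches to; the price is the extra noise inside the declaration itself.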

patrickbkr · 2024-09-19T08:09:39Z

Let's start a wish list of elements that should support Decks.

package / module / class / grammar / monitor
proto / method / submethod / sub / regex / token / rule
signature parameters
has / my / our / constant / state / anon / ...
Pairs
enum / enum values
subset

Questions:

Is monitor challenging given it lives in module space?
It's possible to have standalone Signatures and Pairs (i.e. not part of a routine / hash variable). If one adds a Deck to such a thing it's not obvious where to put such Docks when creating a hierarchy of all the Docks in some module.

It occurs to me that one possible solution to preserving both declarator blocks and @lizmat's sanity (;-) would be to rethink what declarator blocks actually are.

What if #| and #= were not comments at all? What if they were actually an optional component (a "Deck"?) of various declarative constructs?

I like this idea.

finanalyst · 2024-09-19T08:25:14Z

@patrickbkr Here's my take on your responses. @lizmat will probably correct me.

When I wrote RakuAST I was referring to the major rewrite of Rakudo that is currently underway. It is not complete yet because Rakudo.E does not yet pass all tests. RakuAST creates an AST of the program. The legacy version of Rakudo (there must be a better way of naming these two things) does not produce an AST.
Raku is a language, but it is distinguished from the compiler. We now have Rakudo.D and Rakudo.E (which is what I was referring to as RakuAST), which are compilers.
Declarator Docs (Decks) are formally a part of RakuDoc - though perhaps they should not be
The safety issue comes from the execution of byte code. Raku requires byte code to be executed during compilation.
RakuDoc is specified as part of Raku, and the intention was to make documentation as closely related to coding as possible. The intent was IMHO an effort to bring into Raku some of the concepts of literate programming. But this intent also brings with it the ability to write malicious code that can enter the documentation. I do not think this was appreciated when RakuDoc (aka POD6) was designed. (@thoughtstream forgive me if I am wrong about this)
@lizmat has now changed the situation with the implementation of RakuAST. It is now possible to generate the AST of a Raku program without the generation of byte code. This means that the RakuDoc components of a Raku program (and a RakuDoc source is a Raku program) can be extracted and manipulated by a renderer without ever generating byte code, and therefore be considered safe.

We need to develop some terms to make these distinctions better. I am fairly certain I have not been as clear as I should have been

patrickbkr · 2024-09-19T09:22:17Z

When I wrote RakuAST I was referring to the major rewrite of Rakudo that is currently underway. It is not complete yet because Rakudo.E does not yet pass all tests. RakuAST creates an AST of the program. The legacy version of Rakudo (there must be a better way of naming these two things) does not produce an AST.

Agreed.

Raku is a language, but it is distinguished from the compiler. We now have Rakudo.D and Rakudo.E (which is what I was referring to as RakuAST), which are compilers.

Agreed.

Declarator Docs (Decks) are formally a part of RakuDoc - though perhaps they should not be

Agreed.

The safety issue comes from the execution of byte code. Raku requires byte code to be executed during compilation.

Agreed.

RakuDoc is specified as part of Raku, and the intention was to make documentation as closely related to coding as possible. The intent was IMHO an effort to bring into Raku some of the concepts of literate programming. But this intent also brings with it the ability to write malicious code that can enter the documentation. I do not think this was appreciated when RakuDoc (aka POD6) was designed. (@thoughtstream forgive me if I am wrong about this)

Agreed.

@lizmat has now changed the situation with the implementation of RakuAST. It is now possible to generate the AST of a Raku program without the generation of byte code. This means that the RakuDoc components of a Raku program (and a RakuDoc source is a Raku program) can be extracted and manipulated by a renderer without ever generating byte code, and therefore be considered safe.

(Quick shout out to jnthn, nine, ab5tract and probably others that all have poured in work on RakuAST.)

This is sadly not true. Even in the new RakuAST compiler bytecode generation and execution can in part already happen during the parse / AST generation phase. The classic example being BEGIN blocks which are compiled and executed by the compiler as soon as the parser sees the end of the block. At that point in time the parser hasn't even looked at the text of the input file following that BEGIN block yet. This might be a little oversimplified, but in principle right.

@lizmat Agreed?

lizmat · 2024-09-19T09:29:36Z

Agreed.

niner · 2024-09-19T14:50:45Z

This is sadly not true. Even in the new RakuAST compiler bytecode generation and execution can in part already happen during the parse / AST generation phase. The classic example being BEGIN blocks which are compiled and executed by the compiler as soon as the parser sees the end of the block.

They can, but they don't have to and they do much less often than with the old frontend. Reason is that RakuAST includes infrastructure for interpreting ASTs directly. We use this to avoid the costly bytecode generation+load for trivial expressions. The most notable exception here is role bodies because they generate a lexical context. But maybe we can find an alternative to that.

finanalyst · 2024-09-20T15:08:23Z

[update after @niner's comment below]
I was wondering about this and wrote a small program with a BEGIN and rakudoc block to see how getting an AST from it would differ from using EVAL on it.
Unfortunately, it hits an error that I'm not sure how to deal with.
Test program contents (program in file begin_ast.raku)

use experimental :rakuast;

my $prog = q:to/PROG/;
    BEGIN { say 'in BEGIN phase: here lie dragons' }
    =begin rakudoc
    In a Rakudoc block, dragons lie here sleeping
    =end rakudoc
    PROG

say 'Before Eval';
use MONKEY-SEE-NO-EVAL;
EVAL $prog;
no MONKEY-SEE-NO-EVAL;

say 'Before AST evaluation';
say $prog.AST.rakudoc;
say 'ending test';

Output in terminal:

$ raku tmp/begin_ast.raku 
Before Eval
in BEGIN phase: here lie dragons
Before AST evaluation
===SORRY!===
Unknown compilation input 'qast'
$ raku -v
Welcome to Rakudo™ v2024.08-59-gb6fa27a22.
Implementing the Raku® Programming Language v6.d.
Built on MoarVM version 2024.08-6-gac82e446f.

commenting out the BEGIN expression (which defeats the purpose, but shows the expected behaviour from say'ing the AST) yields the following:

$ raku tmp/begin_ast.raku 
Before Eval
Before AST evaluation
(RakuAST::Doc::Block.new(
  type       => "rakudoc",
  paragraphs => (
    "In a Rakudoc block, dragons lie here sleeping\n",
  )
))
ending test

Update

Here is the output as suggested below by @niner:

$ RAKUDO_RAKUAST=1 raku tmp/begin_ast.raku 
Before Eval
in BEGIN phase: here lie dragons
Before AST evaluation
in BEGIN phase: here lie dragons
(RakuAST::Doc::Block.new(
  type       => "rakudoc",
  paragraphs => (
    "In a Rakudoc block, dragons lie here sleeping\n",
  )
))
ending test
$ raku -v
Welcome to Rakudo™ v2024.08-66-gc3fbe0c3c.
Implementing the Raku® Programming Language v6.d.
Built on MoarVM version 2024.08-10-gb7750ec26.

Commentary: This means that at present BEGIN expressions do make things unsafe even with the RakuAST compiler and front-end.

lizmat · 2024-09-20T22:15:07Z

@finanalyst There's currently a bug in the handling of BEGIN phasers that only occurs when creating an AST with .AST.

finanalyst · 2024-09-21T09:38:36Z

@lizmat That's a relief. I thought I got the code wrong.

@patrickbkr When the RakuAST parser/compiler bug is fixed, and assuming that the say inside the BEGIN is not executed, would this mean the safety issue is fixed in RakuAST when processing documentation?

niner · 2024-09-21T10:08:19Z

It's not really a bug. It's just that the RakuAST frontend does not support BEGIN time compilation with the old frontend. If you run your example with RAKUDO_RAKUAST=1 you've got a decent chance of it doing what you intend.

finanalyst · 2024-09-21T19:51:30Z

@niner You were right - I've amended my comment above.

I hadn't quite understood the difference between

- including `use experimental :rakuast` in a program abc.raku and then running `raku abc.raku`
- not including `use experimental :rakuast` in abc.raku and then running `RAKUDO_RAKUAST=1 raku abc.raku`

lizmat · 2024-09-22T11:47:37Z

What if #| and #= were not comments at all?
What if they were actually an optional component (a "Deck"?)
of various declarative constructs?

Then we could add them to the syntax for those constructs
in the restricted locations that @lizmat is hoping for.

The trick would be to redefine the rule for comments so that the '#' introducer token
becomes '#' <!before '|' | '='>, which would mean that leading and trailing decks
are no longer skipped as being whitespace, and can only appear in the places
where the Raku grammar explicitly allows them.
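As a rough illustration of that lookahead (a sketch only: `plain-comment` and `is-plain-comment` are made-up names, and this is not Rakudo's actual comment rule), a standalone token shows that `#|` and `#=` would stop matching as ordinary comments:

```raku
# Hypothetical stand-in for the comment rule; not Rakudo's real grammar.
# '#' followed by anything except '|' or '=' starts an ordinary comment.
my token plain-comment { '#' <!before '|' | '='> \N* }

# True only when the whole line is an ordinary comment.
sub is-plain-comment(Str $line --> Bool) {
    so $line ~~ / ^ \h* <plain-comment> $ /
}

for '# just a note', '#| leading deck', '#= trailing deck' -> $line {
    say $line, ' => ', is-plain-comment($line);
}
# # just a note => True
# #| leading deck => False
# #= trailing deck => False
```

With a rule like this, the two deck forms are no longer swallowed as whitespace, so they would be a parse error anywhere the grammar does not explicitly ask for them.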

I like this idea a lot!

This will, however, be potentially non-trivial to implement. But it is still better than the current situation.

And it will probably break quite a few spectests that depend on the "decks are whitespace" semantics.

Should I put this in a pull request and get a vote on that?

lizmat · 2024-09-22T17:16:37Z

Any way to have it both ways via some switch?

@tbrowder You mean "as whitespace" vs "only at specific locations"?

patrickbkr · 2024-09-22T19:09:08Z

@patrickbkr When the RakuAST parser/compiler bug is fixed, and assuming that the say inside the BEGIN is not executed, would this mean the safety issue is fixed in RakuAST when processing documentation?

Just to clarify: Raku runs BEGIN blocks during the parse deliberately, and this is necessary because code in BEGIN is allowed to modify the parser state. A good example is the OO::Monitors module, which adds a new keyword `monitor`. The parser needs to run that module's code to recognize the `monitor` keyword. If it didn't, the parser could not successfully parse any code using the `monitor` keyword.
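A minimal sketch of that ordering, assuming the file is run directly with `raku` (the variable names are illustrative only). The BEGIN statement appears second in the file but prints first, because it is executed while the file is still being parsed:

```raku
# Ordinary run-time statement: executes only after the whole file compiled.
say 'run time: first ordinary statement';

# BEGIN is executed as soon as the parser has read it, before any
# run-time code, which is why this line is printed first.
BEGIN say 'compile time: BEGIN executed during the parse';

# A BEGIN value is computed once, at compile time, and then reused.
my $compiled-at = BEGIN { now };
my $started-at  = now;
say 'BEGIN ran before run time: ', $compiled-at <= $started-at;
```

Replace the harmless `say` with arbitrary code and the safety concern is the same: obtaining the parse of an untrusted file already means running its BEGIN blocks.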

tbrowder · 2024-09-22T19:28:26Z

Any way to have it both ways via some switch?

@tbrowder You mean "as whitespace" vs "only at specific locations"?

I think I misunderstood. So the result would be the decl blocks would stay as the user defined them?

finanalyst · 2024-09-22T21:13:13Z

@patrickbkr
A good example is the OO::Monitors module, which is adding a new keyword monitor

Suppose it is possible to prevent the creation of bytecode, and the renderer is given an AST with only rakudoc blocks; then the generation of output - including HTML - depends only on the trusted code that changes blocks into output. This trusted code is not affected by the source program.

Does this have a safety concern?

patrickbkr · 2024-09-23T05:15:07Z

Consider:

```
use OO::Monitors;

#|Doc
my $var;

monitor M { }

#|Doc2
my $var2;
```

Without running the code in OO::Monitors, the parser can't understand the `monitor M { }` line. The parse will fail and it's not possible to produce the AST. You can't skip running any code during the parse as running that code is often a necessity for the parser to be able to continue parsing. So in the above code the parser - working top to bottom - doesn't even reach the `#|Doc2` comment on line 8 as it already fails on line 6.


thoughtstream · 2024-09-25T02:11:08Z

I’ve been thinking about how we can make the proposed non-whitespace #| and #=
more syntactically “regular”...and hopefully much easier to implement as well.

In looking at the current specification, it seems obvious that the
main problem is with the trailing #= documentor. Because, even
if we restrict where it can appear, people are still going to want to
place that construct in syntactically inconsistent places in the grammar:

    class Base {            #= Example 1
        ...
    }

    class Der               #= Example 2
    is Base
    {
        method action (     #= Example 3
            $argie,         #= Example 4
            $bargie         #= Example 5
        )
        {...}
    }


    my Int $answer = 42;    #= Example 6

Conceptually, what developers want is to be able to document a construct “on the left”.

But, syntactically, that sometimes means “immediately after the declarand itself” (as in examples 2 and 5),
while at other times it means “before the following component” (as in example 4),
or “inside the following component” (as in examples 1 and 3),
or even “after the entire statement” (as in example 6).

If we want to support all those locations with the new non-whitespace #= documentor,
that is going to significantly complicate the entire Raku grammar, and especially
the AST construction process.

In contrast, it would be relatively easy to handle the leading #| documentor.
Specifically, we could define that a #| is only permitted immediately before
a keyword_opt / type_opt / declarand_req sequence:

    #| Example 7
    class Base {
        ...
    }

    #| Example 8
    class Der is Base
    {
        #| Example 9
        method action (

            #| Example 10
            $argie,

            #| Example 11
            $bargie
        )
        {...}
    }


    #| Example 12
    my Int $answer = 42;

That seems to already allow virtually all of the current (sane) uses of #|.

So, in order to make #= equally implementable from a syntactic point of view,
and more predictable and teachable for end-users, perhaps we could specify that
a #= can only appear immediately after the same keyword_opt / type_opt / declarand_req sequence.

Which would give us:

    class Base          #= Example 13
    {
        ...
    }

    class Der           #= Example 14
    is Base
    {
        method action   #= Example 15
        (
            $argie      #= Example 16
            ,
            $bargie     #= Example 17
        )
        {...}
    }


    my Int $answer      #= Example 18
        = 42;

That’s not ideal perhaps (example 16 is particularly unappealing, and examples 14 and 18
aren’t ideal either) but it would be a great deal more achievable and predictable.
And those suboptimal cases could be made a little less awkward assuming we also
provide a bracketed #= form:

    class Der  #=[ Example 19 ]  is Base
    {
        method action (
            $argie      #=< Example 20 >,
            $bargie     #=< Example 21 >
        )
        {...}
    }


    my Int $answer  #=「Example 22」  = 42;

Of course, we probably also need to support multiple consecutive documentors in those locations:

    #| This is the base class
    #| for everything in the system

    class Base   #= I<It really needs
                 #=   a much better name
                 #=   of course>
    {...}
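To make the placement rule concrete, here is a toy grammar. It is a sketch under heavy assumptions: it is not Rakudo's real grammar, the rule names (`DeckSketch`, `sequence`, and so on) are invented for illustration, and it only understands one declaration with its decks. It accepts any number of `#|` lines immediately before, and `#=` lines immediately after, a keyword_opt / type_opt / declarand_req sequence:

```raku
# Toy grammar: NOT Rakudo's real grammar, only the proposed placement rule.
grammar DeckSketch {
    token TOP       { <leading>* <sequence> <trailing>* }
    # A leading deck: the marker, then text up to the end of the line.
    token leading   { '#|' \h* $<text>=[\N*] \n \h* }
    # keyword_opt / type_opt / declarand_req. Declared as a regex so the
    # optional parts can be given back if no declarand follows them.
    regex sequence  { [ <keyword> \h+ ]? [ <type> \h+ ]? <declarand> }
    token keyword   { class | method | sub | my | our | has }
    token type      { <[A..Z]> \w* }
    token declarand { <[$@%&]>? \w+ }
    # A trailing deck: the marker, then text up to the end of the line.
    token trailing  { \h* '#=' \h* $<text>=[\N*] \n? \h* }
}

my $source = q:to/END/;
    #| This is the base class
    #| for everything in the system
    class Base   #= It really needs
                 #=   a much better name
    END

my $match = DeckSketch.parse($source)
    or die 'a deck is in a position this sketch does not allow';

say 'declarand: ', ~$match<sequence><declarand>;
say 'leading:   ', $match<leading>.map({ ~.<text> }).join(' ');
say 'trailing:  ', $match<trailing>.map({ ~.<text> }).join(' ');
```

Because the decks are named parts of the rule rather than whitespace, attaching them to the declarand is a matter of reading the captures, which is the simplification the proposal is after.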

The only other element we’ve discussed that this approach doesn’t handle
is the ability to document non-declarative components, such as the keys of a hash.

Personally, I think we should defer adding that until we see how the proposed changes
to documentors for declarands shake out, but if we did want to tackle it now,
I’d suggest that we simply specify that a documentor that doesn’t appear immediately
before/after a declarand can only appear immediately before/after a compile-time literal value,
to which it is then attached. For example:

    my %preconfig =
        #| Max size of entry
        size => 42,

        #| Apply limiting
        limit => True,

        #| Randomizes lookup
        rand => True,

        #| Et cetera
        etc => 'et cetera';

    my %postconfig =
        'size'  #=[ Max size of entry ]  =>  42,
        'limit' #=[ Apply limiting    ]  =>  True,
        'rand'  #=[ Randomizes lookup ]  =>  True,
        'etc'   #=[ Et cetera         ]  =>  'et cetera';

Note that in the case of a trailing documentor for hash keys, the key has to be
explicitly quoted, because it’s now syntactically separated from the subsequent
=> and hence no longer autoquoted.

patrickbkr · 2024-09-25T07:33:17Z

Specifically, we could define that a #| is only permitted immediately before a keyword_opt / type_opt / declarand_req sequence
...
So, in order to make #= equally implementable from a syntactic point of view, and more predictable and teachable for end-users, perhaps we could specify that a #= can only appear immediately after the same keyword_opt / type_opt / declarand_req sequence.

I do like the idea to limit the positions the Decks can appear in.

But I think we are stretching the #| / #= syntax too far. They look like comments, but actually share little behavior given they can appear only in very specific places, always in direct relation to a bordering element.

Comparing this latest proposal to the earlier idea to utilize a docs trait, the trailing places they can be put in are (almost?) identical. So why not just go with the trait approach then?
I don't think we have leading traits yet. Would that be doable? If yes, we could do:

docs 'Base class for magicians' 
class Magician {
    has Int $.level;
    has Str @.spells;
}

docs 'Fight mechanics ' ~
     'Magicians only, no mortals'
sub duel (
    Magician $a  docs 'The first magician in the duel',
    Magician $b  docs 'The second magician in the duel',
) {
    ...
}

my $mage  docs 'A magician of level 2 or above';

I don't like the

docs 'Fight mechanics ' ~
     'Magicians only, no mortals'

bit. But maybe we can come up with a nicer approach to have multiline strings there.
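For comparison, here is roughly the same documentation written with the declarator blocks that exist today, and read back through `.WHY`. This is a sketch of current behaviour as I understand it, not of the proposed `docs` trait:

```raku
#| Base class for magicians
class Magician {
    has Int $.level;
    has Str @.spells;
}

#| Fight mechanics
sub duel (Magician $a, Magician $b) {
}
#= Magicians only, no mortals

# .WHY returns the declarator documentation attached to the declarand.
say Magician.WHY;          # leading doc of the class
say &duel.WHY.leading;     # text from the #| block
say &duel.WHY.trailing;    # text from the #= block
```

Any replacement syntax would presumably need to keep feeding the same `.WHY` data, whatever the surface form ends up being.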

jubilatious1 · 2024-09-25T08:27:39Z

Docks?

Disagree with using/implementing this terminology. Sounds too much like Docs.

patrickbkr · 2024-09-26T10:18:53Z

Docks?

Disagree with using/implementing this terminology. Sounds too much like Docs.

Just a typo on my side. Corrected.

jubilatious1 · 2024-10-03T16:25:41Z

softmoth/raku-Pod-To-Markdown#18 (comment)

bbkr · 2024-10-08T10:45:32Z

I like and frequently use declarator blocks in my everyday code. They are a great way of communicating with developers. Compared to regular POD:

=begin pod           <- line noise

=head3 foo           <- easy to miss when renaming method, requires informal headX standard to be used across company

Blabla               <- actual, useful content

=end pod             <- line noise

method foo { ... }

BTW: I never liked POD, either in Perl or in Raku. When I first encountered the Rust approach with

    // - Regular comment
    /// - Generate library docs for the following item.
    //! - Generate library docs for the enclosing item.

I was amazed how easy and consistent documentation can become across an entire ecosystem. Here is an example of my module using this concept. No line noise, no special documentation syntax clutter, just pure information.

So for me class/method declarator blocks are the way to go and POD can be removed entirely from core and/or moved to some external library.

tbrowder · 2024-10-08T15:49:35Z

I appreciate your point of view, but I love Raku pod (Rakudoc) and find it easy to use (as opposed to Perl pod).

I speak as a regular Perl user since 1993 and Raku user since 2015. (Note RakuAST is making Rakudoc even better.)

And we have tools to easily convert Rakudoc to other file forms like Markdown, HTML, and PDF (and other types I don't currently use). Conversely, we also have tools to convert Markdown to Rakudoc, and there are many other non-Raku tools to convert Markdown to some other document form.

Finally, Rakudoc has a much richer, and extensible, syntax which enables almost unlimited enhancement and variety of output PDF and html products.

doomvox · 2024-11-15T22:18:27Z

I'm late to the party but I wanted to say that in general I like
things like declarator docs quite a bit (unlike lizmat and
thoughtstream, I'm definitely in favor of embedded documentation).
But it does seem that this is an area where an extreme commitment to
backwards compatibility probably isn't necessary: changing some corner
cases to make the parsing problem saner is probably fine.

Myself, I actually haven't really used the declarator doc features
very much, but I think that's largely because they don't really seem
like a big part of Raku's programming culture.

In comparison, in the emacs lisp world, a typical function definition
might look like:

(defun ourproject-do-something (argument)
  "Do something useful with ARGUMENT for ourproject."
  (message "doing stuff with: %s" argument))

The Raku equivalent would be something like:

#| Do something useful with 'argument' for ourproject
sub do-something ($argument) {
  say "doing stuff with $argument";
}

The elisp docstrings, while technically optional, are required by the
programming culture. Emacs has a "help" system that displays these
docstrings (similar to the case of the Comma IDE that finanalyst was
talking about). An elisp code example wouldn't be complete without
the docstring, though in the case of Raku they're uncommon.
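A help lookup of that kind can already be approximated in Raku through `.WHY`. The sketch below assumes the routine carries a leading `#|` block; `describe` is a made-up helper name, loosely modelled on what an Emacs-style help command would show:

```raku
#| Do something useful with 'argument' for ourproject
sub do-something ($argument) {
    say "doing stuff with $argument";
}

# Hypothetical helper: pair a routine's name with its leading doc text.
# It assumes the routine is documented; no fallback is attempted here.
sub describe (&routine --> Str) {
    "{&routine.name}: {&routine.WHY.leading}"
}

say describe(&do-something);
```

So the plumbing for an elisp-like culture of docstrings is there; what differs is mostly convention and tooling.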

It's probably significant that in the case of elisp, the syntax makes
it seem like the docstring is part of the routine, where the raku
version makes it seem like it's external to it: I think something like
the "docs" feature thoughtstream proposed seems very interesting, it
would make the docstrings seem like they're internal to the routines,
and might actually encourage their use.

But that said, while the "docs" syntax might be a good "in addition
to" my own feeling is it wouldn't work so well as an "instead of".
For good or for ill, we should probably stick with the magic comments
in some form.

lizmat added the language Changes to the Raku Programming Language label Sep 13, 2024

lizmat mentioned this issue Oct 20, 2024

Several POD issues? rakudo/rakudo#5240

Closed
